Open lofcz opened 2 weeks ago
We don't ship ML models by design
What's the deal with running LLMs locally via Chromium?
Personally, I dislike having random binaries "executed", but besides that I see nothing inherently wrong with it. They do, however, get pruned as part of our preprocessing, which strips binaries from the sources (see https://github.com/ungoogled-software/ungoogled-chromium/blob/master/docs/design.md#source-file-processors), and I don't see why we should change that.
Web developers will heavily use JS APIs for local model inference once they become generally available. Excluding this capability is similar to removing any other JS functionality from the browser, forcing us to either use standard builds (which I prefer to avoid) or maintain a fork with the feature restored. If the project is not fully committed to removing ML inference, I would urge reconsideration: model binaries are generally harmless, even when not in the safetensors format (tflite in this case). According to the format specification, it should not be possible for these models to phone home.
is similar to removing any other JS functionality from the browser
That's just false.
it should not be possible for these models to phone home.
And the code that is used to "run" those models? If not now, maybe eventually?
Would you care to elaborate on why you think rendering certain JS functions unusable (preventing web developers from using the browser to develop the web) is "just false"?
The code for inference is standard C++, which we could patch as necessary without compromising core functionality. I haven't read through all of it, just pieces from https://github.com/chromium/chromium/blob/2261cbe79fb40545cbeba8617c277685960ceb44/components/translate/core/language_detection/language_detection_util.cc and https://chromium.googlesource.com/chromium/src/+/refs/heads/main/third_party/blink/renderer/modules/ai/on_device_translation/. However, the point is that any potential "phone home" functionality would be implemented in C++, and we already have instruments to deal with that.
@lofcz Well, let’s expand on it if you so insist!
Web developers will heavily use JS APIs for local model inference once they become generally available.
Do you have any references backing this claim?
Excluding this capability is similar to removing any other JS functionality from the browser
Is this capability similar to, say, setInterval()? I highly doubt that.
Web developers will heavily use […] […] forcing us to […]
Please do not generalise and do not speak for other people. It doesn't make for a good argument. Speak for yourself at best.
use standard builds (which I prefer to avoid)
This is not a problem of ungoogled-chromium.
maintain a fork with the feature restored
This is not a problem of ungoogled-chromium. ungoogled-chromium is freely available under the BSD License, with portions from the Bromite project that are BSD-licensed for use in ungoogled-chromium specifically.
ungoogled-chromium does have a defined set of goals and as such does not intend to solve everyone's problems.
it should not be possible for these models to phone home
Phoning home is only part of the problem. The other part is allowing an unknown binary into the build process. You may be aware of the recent XZ Utils incident. Unfortunately, we do not have enough manpower to inspect every binary shipped with Chromium’s huge codebase with each version bump.
[…] which we could patch as necessary […] and we already have instruments to deal with that.
Do you have a working and tested patch to neutralise the possible threat, for which you would be willing to submit a PR?
Would you care to elaborate
I do hope I have now.
@PF4Public references you asked for:
https://chromestatus.com/feature/5193953788559360 https://chromestatus.com/feature/5172811302961152
Browsers and operating systems are increasingly expected to gain access to a language model.
Browsers are increasingly offering language translation to their users. Such translation capabilities can also be useful to web developers. This is especially the case when browser's built-in translation abilities cannot help, such as: ...
Feel free to check the number of blog posts already preparing web developers for the adoption of local JS inference, including Chrome's official one.
As for the doubts about the importance of these APIs: the number of pages using APIs currently unsupported by ungoogled-chromium will only rise over time. The feature is already important and will only be used more frequently. The company I work for currently pays a lot of money for SaaS services that do work which clients using modern Chromium will be able to do for free, and our competitors are doing the same. Once a free, locally run alternative is widely available, it will be a strong incentive for many to switch.
OS/Platform
Windows
Installed
https://ungoogled-software.github.io/ungoogled-chromium-binaries/
Version
130.0.6723.69
Have you tested that this is not an upstream issue or an issue with your configuration?
--user-data-dir command line argument and it could not be reproduced there
Description
A language detection API recently introduced in JS complains about a missing model. The same code works in stock Chrome of the same version.
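For readers unfamiliar with the API, here is a minimal sketch of how a page might guard against the missing-model case described above. This is not the reporter's reproduction code. The `LanguageDetector` shape (`availability()`, `create()`, `detect()`) is an assumption modelled on the interface Chrome documented later; the experimental surface in the version reported here may have looked different. The names `DetectorFactoryLike`, `DetectorLike`, `Outcome` and `detectLanguageSafely` are hypothetical helpers introduced only for illustration.

```typescript
// Assumed shapes, modelled on Chrome's later documented LanguageDetector
// interface. They are illustrative and not taken from this issue.
interface DetectionResult {
  detectedLanguage: string;
  confidence: number;
}

interface DetectorLike {
  detect(text: string): Promise<DetectionResult[]>;
}

interface DetectorFactoryLike {
  availability(): Promise<string>;
  create(): Promise<DetectorLike>;
}

type Outcome =
  | { ok: true; language: string; confidence: number }
  | { ok: false; reason: string };

// Hypothetical helper: returns a result object instead of throwing, so the
// caller can fall back when the API or its model is missing.
async function detectLanguageSafely(
  text: string,
  scope: { LanguageDetector?: DetectorFactoryLike },
): Promise<Outcome> {
  const factory = scope.LanguageDetector;
  if (!factory) {
    // The browser does not expose the API at all.
    return { ok: false, reason: "api-missing" };
  }
  try {
    const status = await factory.availability();
    if (status === "unavailable") {
      // The API exists, but no model can be used on this build.
      return { ok: false, reason: "model-unavailable" };
    }
    const detector = await factory.create();
    const results = await detector.detect(text);
    const best = results[0]; // assumed to be ordered by confidence
    if (!best) {
      return { ok: false, reason: "no-result" };
    }
    return { ok: true, language: best.detectedLanguage, confidence: best.confidence };
  } catch (err) {
    // A missing model may also surface as a thrown error.
    return { ok: false, reason: err instanceof Error ? err.message : String(err) };
  }
}
```

In a page, the second argument would be the global object. Taking it as a parameter keeps the sketch testable outside a browser and makes the fallback path explicit for builds that strip the model.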
How to Reproduce?
Actual behaviour
An error is thrown.
Expected behaviour
Language detection works.
Relevant log output
No response
Additional context
No response