continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0
18.79k stars 1.59k forks

Why does this extension need full blown Chromium.app? #2166

Open martincerven opened 2 months ago

martincerven commented 2 months ago

Before submitting your bug report

Relevant environment info

- OS:
- Continue:
- IDE:
- Model:
- config.json:

Description

There is a Chromium.app in ~/.continue/.utils/.chromium-browser-snapshots/chromium/, installed without any user consent at all.

To reproduce

No response

Log output

No response

martincerven commented 2 months ago

@Patrick-Erichsen indexing? Can you provide more info? It seems that Chromium was downloaded with a mere extension update... really?

Patrick-Erichsen commented 2 months ago

Hey @martincerven , appreciate the feedback. This is for the documentation service. We just added a note here about why this is needed: https://github.com/continuedev/continue/blob/dev/docs/docs/features/talk-to-your-docs.md#how-it-works

Docs crawling happens entirely on a user's local machine, so to handle sites that rely on JavaScript we decided to pull down Chromium on install. Without this, the majority of docs sites can't be crawled.

Our aim with this is to be more privacy preserving by allowing users to perform indexing locally rather than through our own servers, but curious to know if this is still behavior you'd prefer to disable.

otopetrik commented 1 month ago

This is terrifying.

An extension should never just silently download and execute some binary files from the internet.

And definitely not without getting permission from the user first. That is very sneaky behavior, and it raises the question of whether the code does anything else unexpected or unwanted.

This is for the documentation service. We just added a note here about why this is needed: https://github.com/continuedev/continue/blob/dev/docs/docs/features/talk-to-your-docs.md#how-it-works

As of now, the documentation page still does not mention the Chromium download.

There is no information about the origin of the chromium binary (who built it?).

On a NixOS machine with working "chromium" (and "chrome") accessible in PATH, the extension (JetBrains variant) silently downloaded chromium from somewhere, executed it, and it failed with:

Error: Failed to launch the browser process!
/home/<username>/.continue/.utils/.chromium-browser-snapshots/chromium/linux-1350578/chrome-linux/chrome: error while loading shared libraries: libglib-2.0.so.0: cannot open shared object file: No such file or directory

From the sources it looks like it uses binaries built by Google, and it looks like the download at least uses "https" (no idea if there is any verification of signatures or at least checksums).
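For illustration, checksum verification of a downloaded archive could look roughly like the sketch below. It is not taken from the Continue source: `sha256Hex` and `verifyChecksum` are hypothetical helper names, and where the expected digest would come from (a value pinned in the repository, a signed manifest) is left open.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex-encoded SHA-256 digest of the downloaded bytes.
function sha256Hex(data: Buffer | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical helper: true only if the digest matches the pinned value.
// Case and surrounding whitespace in the expected value are ignored.
function verifyChecksum(data: Buffer | string, expectedHex: string): boolean {
  return sha256Hex(data) === expectedHex.trim().toLowerCase();
}

// Usage: refuse to unpack or execute anything whose digest does not match.
const archive = Buffer.from("abc"); // stand-in for the downloaded bytes
const pinned = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
if (!verifyChecksum(archive, pinned)) {
  throw new Error("Checksum mismatch: refusing to use downloaded binary");
}
```

A checksum only proves the file matches whatever digest was pinned; it says nothing about who built the binary, which is the separate question raised above.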

Given the sneaky nature of the silent installation, it would make sense to question/verify whether the installed extension is actually a clean build of the source from GitHub (without any malicious changes). Does it download a clean or a backdoored Chromium binary?

(It looks like the contents of continue-binary file in the installed JetBrains extension matches github code at least at configuring PCR_CONFIG - it configures only downloadPath (no hosts set), and import_puppeteer_chromium_resolver/require_lib13 falls back to https://storage.googleapis.com. Of course that is not a guarantee that there are not any malicious changes in the code further down.)

As there is a funded company behind this plugin (and not just a pseudonymous developer as was the case in xz utils), it is likely not developed as a backdoor distribution mechanism, but the "silently download binary from internet and execute it" behavior looks terrifyingly close to one.

Docs crawling happens entirely on a user's local machine, so to handle sites that rely on JavaScript we decided to pull down Chromium on install. Without this, the majority of docs sites can't be crawled.

It is possible that some sites cannot be crawled without a Chromium browser. It is impossible that the majority of sites cannot be crawled without the extension downloading a Chromium browser.

Need a Chrome or Chromium browser? Fine. If it is possible to use a normal installation of a browser, just check whether one is installed, and if not, ask the user to install it. If a specific version of Chromium is really required, then download it only after the user has added something like "allowChromiumDownload": true to the config file. If the line is not there, it might be a good idea to explain what is going on and present the user with the URL of the required Chromium binary. Allow them to download it manually and save it in a specific directory as a fallback. That might also be useful for indexing internal documentation in an air-gapped network.
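The flow proposed above can be sketched as follows. This is only an illustration under stated assumptions: `allowChromiumDownload` is the key suggested in this comment, not a confirmed Continue setting, and `DocsConfig`, `findOnPath`, `decideBrowser`, and the download URL argument are hypothetical.

```typescript
// Hypothetical config shape; only "allowChromiumDownload" comes from the comment above.
interface DocsConfig {
  allowChromiumDownload?: boolean;
}

type BrowserDecision =
  | { kind: "use-existing"; path: string }
  | { kind: "download" }
  | { kind: "ask-user"; message: string };

// Look for a browser binary on PATH. `exists` is injected so the logic can be
// exercised without touching the real file system.
function findOnPath(
  names: string[],
  pathEnv: string,
  exists: (candidate: string) => boolean,
  delimiter: string = ":",
): string | undefined {
  for (const dir of pathEnv.split(delimiter).filter((d) => d.length > 0)) {
    for (const name of names) {
      const candidate = `${dir}/${name}`;
      if (exists(candidate)) return candidate;
    }
  }
  return undefined;
}

// Prefer an installed browser, download only with explicit consent,
// otherwise explain the situation and point at the binary's URL.
function decideBrowser(
  config: DocsConfig,
  installed: string | undefined,
  downloadUrl: string,
): BrowserDecision {
  if (installed !== undefined) return { kind: "use-existing", path: installed };
  if (config.allowChromiumDownload === true) return { kind: "download" };
  return {
    kind: "ask-user",
    message:
      `No Chrome/Chromium found. Set "allowChromiumDownload": true, ` +
      `or download it manually from ${downloadUrl}`,
  };
}
```

Returning a decision object instead of acting immediately keeps the consent check in one place, so the code that actually downloads or launches anything only runs after the user-visible branch has been chosen.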

Our aim with this is to be more privacy preserving by allowing users to perform indexing locally rather than through our own servers, but curious to know if this is still behavior you'd prefer to disable.

Using a local Chrome/Chromium could be a reasonable idea (e.g. it can index internal documentation sites, etc.), assuming it does not use the user's actual Chromium profile, Chromium sandboxing is enabled, and the browser is kept updated.

martincerven commented 1 month ago

Yeah, it's very similar to xz and also CrowdStrike, where they pushed an update to prod and it crashed millions of Windows machines.

Here it was also just an update. It contrasts sharply with, for example, llama.cpp, where they prefer to reimplement functionality so as not to depend even on other FOSS libraries.

So for me questions are:

For me, the point of using an open source extension is that anything can be checked by the community. Sneakily downloading some random binary from god knows where runs directly in opposition to this.

@Patrick-Erichsen can you comment on these points?

Right now it seems that, instead of Chromium.app, the extension could just as well download Malware.app, or anything really, without any user consent. That is a very dangerous precedent, even more so for a free and open source VS Code extension.

Huge commented 1 month ago

Oh, thank you @martincerven for bringing that up! It's also very concerning for disk-space-savvy individuals: 541M is accounted for by /home/huge/.continue/.utils/.chromium-browser-snapshots, which would be about 5% of my workspace backup.

@martincerven : could you please tidy up the OP a bit? Maybe adding what commit or which version was the last safe one. Edit: This went in most likely with this, which happened 2 weeks ago. I'll try to look further to check whether the extension version 8.5 is clean of this...

Edit: removing it from the CLI did not break the basic functionality for me, so I'd advise savvy users to do that for now.

KMouratidis commented 1 month ago

Skipping the paranoia (which everyone should have), it would be nice if users had the option of managing the Chromium installation themselves and simply adding a config entry with the path to it. This would also let users update (or pin?) their Chromium binaries, and possibly use a custom-compiled Chromium (or ungoogled-chromium?).
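A user-managed path could be validated along these lines. This is a minimal sketch, assuming a hypothetical `chromiumExecutablePath` config key; Continue's actual config schema may differ, and `userManagedBrowser` is an illustrative name.

```typescript
import * as fs from "node:fs";

// Hypothetical config key; Continue's actual config schema may differ.
interface BrowserConfig {
  chromiumExecutablePath?: string;
}

// Return the user-managed browser path if it is set, points at a regular file,
// and is executable; otherwise return undefined so the caller can fall back.
function userManagedBrowser(config: BrowserConfig): string | undefined {
  const candidate = config.chromiumExecutablePath;
  if (candidate === undefined || candidate.length === 0) return undefined;
  try {
    if (!fs.statSync(candidate).isFile()) return undefined; // reject directories
    fs.accessSync(candidate, fs.constants.X_OK); // throws if not executable
    return candidate;
  } catch {
    return undefined; // missing path or insufficient permissions
  }
}
```

Because the extension only reads a path, updating, pinning, or swapping in a custom build stays entirely in the user's hands.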

eirnym commented 1 month ago

@Patrick-Erichsen this is not an acceptable implementation. User privacy and choice in open source products are not an option or a feature. They are the basics.

I'd consider this feature only if all of the following points are implemented:

  1. This is an explicit opt-in feature.
  2. Only the user is responsible for downloading and installing an engine of some kind.
  3. Only the user is responsible for the URLs accessed by the tool.
  4. Consider an option for non-JS documentation fetching, so that no browser is used.
  5. The user is given a choice of which browser to use. There are plenty of them.
  6. Please also remember Firefox-only users. Firefox is fully capable of downloading the required data.
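
Point 4 (fetching docs without a browser) could, for static pages, be as simple as the sketch below. It is a deliberately crude, regex-based illustration with a hypothetical function name; a real crawler would use a proper HTML parser, and pages that render their content with JavaScript would still come back empty.

```typescript
// Crude text extraction for static pages. No browser or JavaScript execution
// is involved; this only strips markup from HTML that was already fetched.
function staticHtmlToText(html: string): string {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ") // drop script/style bodies
    .replace(/<[^>]+>/g, " ") // drop remaining tags
    .replace(/&nbsp;/g, " ")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&") // decode &amp; last to avoid double-decoding
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
}

// Usage (Node 18+ ships a global fetch):
//   const text = staticHtmlToText(await (await fetch(url)).text());
```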
Huge commented 1 month ago

Small guidance on avoiding the bloating util for now: download continue-linux-arm64-0.9.197.vsix or continue-linux-arm64-0.8.46.vsix from the GitHub release page and install it manually.

Props to @sestinj for at least clearly advertising, in the v8.47 release notes, that the headless browser is used.

av commented 1 month ago

To everyone arguing for explicit opt-in: this is the same level and type of dependency as everything in Continue's package.json. I doubt you really mean that all of the dependencies have to be opt-in.

The security/privacy concerns are puzzling too, as the installation above happens in an extension that was already allowed to do everything it needs on the user's machine, so any malicious intent would already have had a chance to be executed.

That said, it's a completely reasonable ask to allow configuring the type of crawling that is performed (plain/rich), to try reusing already installed browser(s), and to optimise downloads to use lighter Chromium versions when the download is necessary, or to use VS Code's Web Views. I'm sure the maintainers will get there once this feature sees enough use. It's not completely reasonable, however, to see such an acute backlash, as all of the concerns (third-party code execution, disk usage bloat) are pretty much a given when installing this or any other kind of extension for VS Code.

eirnym commented 1 month ago

@av some dependencies can be opt-in when an external pre-installed application is used.

Some dependencies like Chromium are OK if you want to do something fast or the only browser you know is Chromium-based. Also, using an existing browser instead of one pulled in as a dependency provides some important cookies and gives the user more control.

Also, a preinstalled browser is usually managed in companies, which would require far more settings than the author envisioned for this project and more hassle for a user to set them all.

animaldomestico commented 1 month ago

Also, they should take care of executing the browser inside a sandbox environment and make sure it is updated to the most stable version. There are many exploits out there in the wild.

I'm not a hacker, guys (I'm just a peaceful animal), but as I'm using Ubuntu, I was a little bit concerned about some things:

If you want to run developer builds of Chromium/Chrome on Ubuntu 23.10+ (or possibly other Linux distros in the future), you'll need to either globally or selectively disable an Ubuntu security feature.

But if you do this, they say:

For a while, user namespaces have been available to unprivileged (e.g. non-root) users on most Linux distros, but they exposed a lot of extra kernel attack surface.

One explanation found here:

In a report from Google, 44% of the exploits they saw required unprivileged user namespaces as part of their exploit chain.

I prefer not to turn off an Ubuntu security feature, so I won't use this for now. Forgive me if I said anything wrong, I just tried to help!

sestinj commented 1 month ago

Thanks to everyone who shared their feedback in this thread. We heard you loud and clear and have taken steps to address this both immediately and in the future.

As a principle, we will not dynamically download executables without user visibility. PR #2192 makes the change so that we fall in line with this principle for Chromium (it is entirely opt-in):

These updates are now available in VS Code pre-release v0.9.207 and will be released later today in a JetBrains EAP. As soon as these pre-releases have undergone the same initial testing we do each time, they will become main releases.

There were also a few points in this thread worth addressing:

Hopefully it is understood by now that Continue takes great effort to secure your code, to the point of operating as a local-first application. In considering the trade-offs between hosting our own web crawling servers, to which the extension would have to send requests, vs. following the local-first pattern, we took this lens, but more than anything we value feedback. So again, thanks all for being swift to call us out, and thanks @Patrick-Erichsen for being just as swift in taking the necessary action.

I'll hold off on closing the issue for a minute so as not to be discouraging of further discussion!

eirnym commented 1 month ago

@sestinj thank you for stepping in and answering our questions. I am still concerned about mandatory settings and add-ons, which a company manages for all browsers.

Additionally, the downloaded browser has no user-managed settings (including cookies) and no plugins, such as those that block ads and/or improve privacy in some way. I don't like the idea of being tracked via an application.

Another way around the issue would be to provide a separate downloader program that downloads the necessary raw data, together with the output format the application uses. The latter would make it possible to create alternative downloader applications, if any of us were willing to take that on.

martincerven commented 1 month ago

Thanks @sestinj and @Patrick-Erichsen for quick action, I was being downvoted to hell for bringing this up, but I felt it was a security issue, although I couldn't put my finger on exactly what irked me.

Now I know there are a few security points, some independent of Continue:

Lastly,

Hopefully it is understood by now that Continue takes great effort to secure your code, to the point of operating as a local-first application. In considering the trade-offs between hosting our own web crawling servers, to which the extension would have to send requests, vs. following the local-first pattern, we took this lens, but more than anything we value feedback.

I'm very happy you took the local-first approach, even as we voice our concerns here. I honestly can't see into how this crawling works, but one last imaginary scenario:

Anyway, thanks @sestinj for addressing this issue, it will ultimately make your product better and more secure.

Martin

itpofy2024o commented 1 month ago

See https://discussions.apple.com/thread/8582300?sortBy=rank to remove the notification, run rm -rf .continue, stop using the Continue extension, and report this app until they actually improve.

sestinj commented 1 month ago

Appreciate the further thoughts here! We've thought about this pretty deeply, trying to take into account all of the feedback received and where we want to go with the product. Without committing to a particular direction, we are tentatively looking into building out an indexing server.

Though things are much better with the headless browser being entirely opt-in, I still wanted to give an update so you know we haven't simply forgotten about this : )

I will make sure to update here as soon as we have more info!

remixer-dec commented 3 weeks ago

Using Electron was not enough; now every extension of every Electron app will install its own Chromium! Now I have 3 additional Chromiums on my system, thanks!

image

remixer-dec commented 3 weeks ago

  • Why can't we use the Chromium that is already installed for Google Chrome or otherwise? Puppeteer, the package used to control the headless browser, requires a specific chromium_revision for each version of the library, so we can't easily allow users to manage the download/installation, or use existing installations

I am pretty sure that is not true, or at least it was the other way a few months ago when I worked with it.

screenshot

You just need to set PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true to install it without Chromium, and then you can run any Chromium binary you have access to. Some features may be less compatible across versions, but the core functionality remains the same.
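As a rough sketch of that setup: once Puppeteer is installed without its bundled browser, the launch options only need to point at an existing binary. `BROWSER_PATH`, `MinimalLaunchOptions`, and `launchOptionsFromEnv` are hypothetical names chosen for this example; `executablePath` and `headless` are the Puppeteer launch options being illustrated.

```typescript
// The subset of Puppeteer launch options used in this sketch.
interface MinimalLaunchOptions {
  executablePath: string;
  headless: boolean;
}

// BROWSER_PATH is a hypothetical variable name, not something Puppeteer reads itself.
function launchOptionsFromEnv(
  env: Record<string, string | undefined>,
): MinimalLaunchOptions {
  const executablePath = env["BROWSER_PATH"];
  if (!executablePath) {
    throw new Error("Set BROWSER_PATH to an installed Chrome/Chromium binary");
  }
  return { executablePath, headless: true };
}

// With Puppeteer installed, these options would be passed straight to launch():
//   import puppeteer from "puppeteer";
//   const browser = await puppeteer.launch(launchOptionsFromEnv(process.env));
```

Whether a given Puppeteer version works with an arbitrary Chromium revision is exactly the compatibility question debated in this thread, so the sketch makes no promise beyond wiring up the path.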

Patrick-Erichsen commented 2 weeks ago

@remixer-dec thanks for sharing that screenshot. I believe we gave that a try and ran into issues with Puppeteer complaining about an incompatible Chromium revision. We plan to try it out again when we circle back to some work we have planned around the docs service in the near future 👍