continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0

can not be used in local area network #471

Closed star7810 closed 3 months ago

star7810 commented 1 year ago

Describe the bug
In an environment where the local area network cannot connect to the internet, even if the local Code Llama service is configured in config.py, Continue still cannot be used normally. Are there any features that must access the internet, and is it possible to run the entire system fully offline?

Environment

sestinj commented 1 year ago

Yes, this should be possible. I've worked with two other people who were able to accomplish this by downloading the server and running it manually. You can do this with pip install continuedev && python -m continuedev. Then in the VS Code settings, search for "continue" and check the box for "Manually Running Server".

Once the Continue Python server is running, things should work. Tomorrow I will do a nicer write-up of this in our documentation.

star7810 commented 1 year ago

@sestinj In fact, I have already downloaded the run.exe file within the LAN and started it manually. The server_version.txt file has also been written with the corresponding version, 0.0.383. GGML is configured as follows:

models=Models(
    default=GGML(
        max_context_length=2048,
        server_url="http://10.97.40.91:12345"
    )
)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "websockets\legacy\server.py", line 240, in handler
  File "pylsp\python_lsp.py", line 135, in pylsp_ws
  File "websockets\legacy\protocol.py", line 497, in __aiter__
  File "websockets\legacy\protocol.py", line 568, in recv
  File "websockets\legacy\protocol.py", line 944, in ensure_open
websockets.exceptions.ConnectionClosedError: sent 1011 (unexpected error) keepalive ping timeout; no close frame received

[2023-09-13 19:07:03,852] [DEBUG] Received GUI message {"messageType":"main_input","data":{"input":"/edit write python helloworld program"}}

Error importing tiktoken HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ProxyError('Cannot connect to proxy.', ConnectionResetError(10054, 'The remote host forcibly closed an existing connection', None, 10054, None)))
(the "Error importing tiktoken ..." line above repeats many times in the log)
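The repeated "Error importing tiktoken" lines come from tiktoken trying to download the cl100k_base encoding from openaipublic.blob.core.windows.net, which can never succeed offline. As an aside (not something this thread confirms, just how tiktoken's cache works in recent versions), the download can be avoided by pre-seeding a local cache from a machine that does have internet access and pointing TIKTOKEN_CACHE_DIR at it; all paths below are placeholders:

import hashlib
import os
import shutil

# Hypothetical offline workaround for the tiktoken errors above: tiktoken caches
# downloaded encodings under TIKTOKEN_CACHE_DIR, keyed by the SHA-1 of the source
# URL, so copying a pre-downloaded cl100k_base.tiktoken into that cache avoids the
# network call.
blob_url = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
cache_dir = "/opt/tiktoken_cache"
cache_key = hashlib.sha1(blob_url.encode()).hexdigest()

os.makedirs(cache_dir, exist_ok=True)
shutil.copy("cl100k_base.tiktoken", os.path.join(cache_dir, cache_key))
os.environ["TIKTOKEN_CACHE_DIR"] = cache_dir  # must be set before tiktoken loads the encoding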

kohlivarun5 commented 1 year ago

> Question 2: After manually running run.exe, the "continue server starting" step took a long time, although the Continue backend service log printed normally. But the Code Llama service did not receive any request; it seems to be blocked at some internet request step.

I had a similar issue and had to disable telemetry in config.py

sestinj commented 1 year ago

@star7810 TL;DR: kohlivarun5 is right. Once the server is up and telemetry is disabled, things should work. I'm making a couple of changes on my end and will have a full write-up to share soon.

  1. At the risk of asking about something you've already done: is the checkbox selected as in this screenshot? If yes and it's still trying to redownload, I have some serious sanity checking to do. I've tried on my computer with this box checked, even with server_version.txt and the binary removed, and it does not attempt to kill/redownload the server.
[Screenshot: VS Code settings, 2023-09-13 9:33 AM]
  2. Running the binary on Windows can sometimes be slow to start, since it has to unpack the contents of a zipped directory. Using the PyPI package (python -m continuedev) would be faster to start up after the initial pip install, and might be a more natural way to start/stop the server.

  3. This test has been done on air-gapped computers, but it takes a few adjustments, as kohlivarun5 mentions above (set allow_anonymous_telemetry=False in config.py). I'm working on a full write-up of how to do this and will share it soon.

  4. We already have proxy support if you use the OpenAI class, and I've just added the same support for GGML; the new version is on its way out the door. It will work by setting GGML(..., proxy="<MY_PROXY_URL>"). A rough config sketch combining points 3 and 4 follows this list.
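Putting points 3 and 4 together, an offline-oriented config.py might look roughly like the sketch below. The GGML import path matches the example later in this thread; the ContinueConfig/Models import paths, the server URL, and the proxy URL are placeholders to adapt to your install and version.

# Sketch only: import paths for ContinueConfig and Models are assumed and
# may differ between continuedev versions; adjust to match your install.
from continuedev.src.continuedev.core.config import ContinueConfig
from continuedev.src.continuedev.core.models import Models
from continuedev.src.continuedev.libs.llm.ggml import GGML

config = ContinueConfig(
    allow_anonymous_telemetry=False,  # avoid outbound telemetry calls when air-gapped
    models=Models(
        default=GGML(
            max_context_length=2048,
            server_url="http://10.97.40.91:12345",  # your local model server
            proxy="<MY_PROXY_URL>",  # optional: only if traffic must go through a proxy
        )
    ),
)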

sestinj commented 1 year ago

Here is some new documentation describing the steps you should take to run Continue without internet: https://continue.dev/docs/walkthroughs/running-continue-without-internet

star7810 commented 1 year ago

@sestinj Great, requests can now be sent to the Code Llama backend service, but there is a problem with the returned message. Has anyone else encountered a similar problem?

star7810 commented 1 year ago

@sestinj It seems that the example code has been deleted? https://github.com/continuedev/ggml-server-example

sestinj commented 1 year ago

That's odd. I can still see the repo with example code.

There's a small chance that this is just the output of the LLM. Or has this exact same response come back more than once? If so, it could be that the [INST] tags aren't allowed for some reason, but that would be new behavior for llama-cpp-python.
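For reference, the [INST] tags are the Llama-2 / CodeLlama instruct prompt markers; the string sent to llama-cpp-python typically looks something like this:

# Llama-2 / CodeLlama instruct-style prompt: the user message is wrapped
# in [INST] ... [/INST] markers before being sent to the model.
prompt = "[INST] write python helloworld program [/INST]"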

sestinj commented 1 year ago

Ah I see what you mean about the example code. There was never any python file there, because it just runs a pip package

star7810 commented 1 year ago

> Ah I see what you mean about the example code. There was never any python file there, because it just runs a pip package

Haha, we'll try again...

Here's another suggestion: currently, when the model service returns a non-200 response, it's hard to tell from the interface what caused the error. It would be helpful to add frontend notifications for responses with non-200 status codes.
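For illustration, a minimal sketch of the kind of check this suggestion implies (a hypothetical requests-based helper, not Continue's actual code):

import requests

def query_model_server(url, payload):
    # Hypothetical helper: call the local model server and surface non-200
    # responses as a readable error instead of failing silently.
    resp = requests.post(url, json=payload, timeout=60)
    if resp.status_code != 200:
        # This is the kind of message that could be shown as a frontend notification.
        raise RuntimeError(f"Model server returned {resp.status_code}: {resp.text[:200]}")
    return resp.json()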

sestinj commented 1 year ago

This is a good point, something I can do for sure.

I went through the README and set this up again; it seems it should still work, except that llama-cpp-python really doesn't seem to like concurrent requests. My config file looked like this, and everything was working smoothly:

from continuedev.src.continuedev.libs.llm.ggml import GGML
from continuedev.src.continuedev.libs.llm.queued import QueuedLLM

...
config = ContinueConfig(
    ...
    models=Models(
        default=QueuedLLM(
            llm=GGML(
                context_length=2048, server_url="http://localhost:8000"
            )
        ),
    ),
    disable_summaries=True,
)

The QueuedLLM wrapper makes sure that only one request happens at a time, which unfortunately seems to be necessary when working with llama-cpp-python. disable_summaries is optional, but if you're only going to allow one request at a time, it doesn't make sense to force yourself to wait for the summary to be generated.

star7810 commented 12 months ago

@sestinj We have attempted this several times, following the instructions at https://github.com/continuedev/ggml-server-example:

> See our 5 minute quickstart to run any model locally with ggml. While these models don't yet perform as well, they are free, entirely private, and run offline.

However, we encountered an exception during startup. Do you have any ideas on this? The sha256 hash of the model file is correct. The error message is as follows:

(ggml) [lzb@VKF-NLP-GPU-01 ggml-server-example]$ python3 -m llama_cpp.server --model models/wizardLM-7B.ggmlv3.q4_0.bin
gguf_init_from_file: invalid magic number 67676a74
error loading model: llama_model_loader: failed to load model from models/wizardLM-7B.ggmlv3.q4_0.bin

llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "/home/lzb/.conda/envs/ggml/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/lzb/.conda/envs/ggml/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/lzb/.conda/envs/ggml/lib/python3.8/site-packages/llama_cpp/server/__main__.py", line 96, in <module>
    app = create_app(settings=settings)
  File "/home/lzb/.conda/envs/ggml/lib/python3.8/site-packages/llama_cpp/server/app.py", line 337, in create_app
    llama = llama_cpp.Llama(
  File "/home/lzb/.conda/envs/ggml/lib/python3.8/site-packages/llama_cpp/llama.py", line 340, in __init__
    assert self.model is not None
AssertionError

sestinj commented 12 months ago

I believe .ggml files have now been deprecated in favor of .gguf. https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/discussions/14#64e5bc015af55fb4d1f9b61d

I searched around for a .gguf for WizardLM and for some reason it doesn't seem to exist.

But there is a gguf for CodeLlama-instruct, which also works quite well: https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/tree/main
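As a quick sanity check that a .gguf file loads before pointing the server at it, a minimal snippet like this can help; the filename below is just an example of the quantizations hosted in that repo and may differ from the file you actually download:

from llama_cpp import Llama

# Sanity check: load the downloaded .gguf directly with llama-cpp-python.
# model_path is a placeholder; point it at whichever quantization you downloaded.
llm = Llama(model_path="models/codellama-7b-instruct.Q4_K_M.gguf", n_ctx=2048)
print(llm("[INST] Write a Python hello world program. [/INST]", max_tokens=64))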

sestinj commented 11 months ago

@star7810 Did you have any luck here, or is there something else I can do to get you un-stuck? If this particular path of setting up an open-source model doesn't work I can share a number of other options

chewbh commented 11 months ago

> Here is some new documentation describing the steps you should take to run Continue without internet: https://continue.dev/docs/walkthroughs/running-continue-without-internet

@sestinj

Is Meilisearch essential for Continue to function? I noticed that in an air-gapped environment the Continue server tends to suffer from a long startup time, because it attempts to download Meilisearch, is blocked, and waits for the network connection to time out. It also doesn't help that the Continue server appears to delete an existing Meilisearch install before redownloading it.

Related to this, is the Continue server designed to serve multiple developers? If so, the setup experience would be less painful. In an air-gapped environment, we could then consider containerizing the entire Continue server setup and using it to serve multiple developers.

sestinj commented 11 months ago

Hi @chewbh, the server is in fact meant for multiple developers. There are a few examples of people already doing this, and our team has personally been hosting a shared server on GCP Cloud Run. If this is something you're interested in doing let me know and I can share some resources on how to do it

This Meilisearch problem is still something I'd like to get to the bottom of, though. Could you share what version of Continue you are using (specifically of the server, if you are running it manually)? Just yesterday I made an update that downloads Meilisearch in parallel and fails gracefully, so it won't block the server from starting; there's a chance that simply upgrading would solve your problem.

And to answer your first question, no. Meilisearch is what allows for the dropdown menu where you can search through files in your workspace (by typing '@ file'), but all other functionality can work without it

chewbh commented 11 months ago

Hi @sestinj Thanks for the prompt response. Yes, it would be very helpful if you could share some resources on hosting a shared server.

For Meilisearch, we are using a server built from continuedev source dated mid-August. We will try the current build and see if the issue is still a concern.

sestinj commented 11 months ago

I'm realizing that most of what I'm going to share with you was available in the link above, but here is the best way of going about running the server.

There are a few command-line options that you can use (so you would run with python3 -m continuedev --port 1234, for example).

sestinj commented 10 months ago

@chewbh @star7810 Wanted to check in on this issue. Since the last comment we've made updates to the server protocol that should help with connection reliability and I've also tested and had success myself over LAN.

Let me know if you're still struggling to get Continue set up, and I'm happy to help! I'll try to give this issue about another week before I close it if I don't hear back.

chewbh commented 10 months ago

@sestinj Thanks for checking in! I was able to get the base functionality working in an air-gapped environment. I have yet to try the built-in context providers, but I am interested in using the codesearch and file tree providers in our environment as well. Are there any caveats I need to be aware of?

sestinj commented 10 months ago

@chewbh the @search and @filetree context providers for the moment depend on having the Continue server on the same machine as the code you are editing, but shouldn't be limited by the offline scenario

sestinj commented 9 months ago

@kohlivarun5 @chewbh @star7810 checking in on this issue just one more time because we've made some pretty significant and relevant changes. As of the newest VS Code pre-release, Continue no longer requires the separate Python server at all: it works as just the extension. This means that whatever connection problems were going on are pretty much not possible anymore.

Of course this doesn't mean that bugs in general aren't possible, but I think it will be a better experience. If any of you get the chance to try it, let me know if you run into any problems. Otherwise I'll keep this issue open for another few days and then close it out, preferring new issues for new problems.

chewbh commented 9 months ago

@sestinj I am trying out the new pre-release VS Code extension (0.7.54) but ran into new connection issues. I am running it in the following environment:

  1. Web IDE (I tried both Codespaces and coder/code-server)
  2. Proxy Server URL set to a valid URL that resolves to the Express server forked by the VS Code extension (e.g. https://cuddly-rotary-phone-xxqr547547f679-65433.app.github.dev/, which points to https://localhost:65433), so that it works around the mixed https/http content issue.

When I run any query, I still hit a CORS issue with an HTTP 401 error on the preflight request.

sestinj commented 8 months ago

Working on a change that will make the proxy irrelevant and thus fix this, I'll ping you once it's ready

sestinj commented 8 months ago

@chewbh Ready now in version 0.7.55; I set up a GitPod workspace myself to make sure it works. There's no need to set the proxy URL, because we are instead making requests through VS Code's built-in message passing.

The one possible scenario I haven't yet verified is if you're trying to run a local model that is on your laptop rather than inside of the GitPod workspace. But OpenAI, other APIs, anything not localhost definitely seemed to work

chewbh commented 8 months ago

@sestinj Thanks for the great work! It now works in my environment.

Separately, for supporting groups or multiple users, is there a plan to look at having a shared or default config.json and config.ts? We previously used a shared Continue server, and it was great to abstract away the configuration for hooking up our internal LLM setup.

Also, with the new architecture and switch to TypeScript, is it still possible to add our own custom context providers? We have a knowledge base in Confluence and an API for querying dev docs. I am interested in looking at whether we could build context providers internally to enable RAG over them in Continue.

sestinj commented 8 months ago

@chewbh Yes, we 100% will add this, but it will be missing temporarily until we finalize what a team server might look like. One idea that may or may not fit your requirements, to avoid copying config files around for now: in config.ts, you could make an HTTP request to retrieve the configuration from a very simple server of your own. This way you could update the config remotely for everyone.

Custom context providers are still possible, and probably easier now. Here is the updated documentation.

sestinj commented 8 months ago

If you'd like any help transferring config over to config.ts, let me know! And I'd be curious to hear more about this RAG context provider if you look deeper into it: one thing we hope to revamp soon is the interaction pattern around context providers to make them more flexible.