continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0

[Extension Host] FetchError2: request to http://127.0.0.1:8080/completion failed #868

Open prathameshza opened 7 months ago

prathameshza commented 7 months ago

Before submitting your bug report

Relevant environment info

- OS: Linux Mint 21.3 "Virginia" Cinnamon Edition 64-bit
- Continue: v0.8.12
- IDE: VS Code 1.86.1
- Model: WizardCoder-7b (llama.cpp)

Description

Getting FetchError2 in VS Code when trying to run WizardCoder-7b (llama.cpp). Below is the screenshot:

[Screenshot: continue_error]

To reproduce

Add the model to config.json:

    {
      "title": "WizardCoder-7b",
      "model": "wizardcoder-7b",

      "requestOptions": {
        "caBundlePath": "/media/prath/Main Disk/Programming/FYP/MultiPDFchatMistral-7B/cert.pem"
      },

      "completionOptions": {},
      "provider": "llama.cpp"
    }

Note: I have read the troubleshooting page and added the certificate

I have generated the certificates like this:

sudo apt-get install openssl

openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 3650

openssl rsa -in key.pem -out nopassword.key

cat nopassword.key > server.pem
cat cert.pem >> server.pem
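
(As a quick sanity check, the command below should print the subject and validity dates of the generated certificate; it only assumes the cert.pem produced by the commands above.)

    # Inspect the generated certificate
    openssl x509 -in cert.pem -noout -subject -dates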

My folder directory looks like this:

[Screenshot: folder_dir]

Log output

[Extension Host] FetchError2: request to http://127.0.0.1:8080/completion failed, reason: connect ECONNREFUSED 127.0.0.1:8080
    at ClientRequest.<anonymous> (/home/prath/.vscode/extensions/continue.continue-0.8.12-linux-x64/out/extension.js:160783:14)
    at ClientRequest.emit (node:events:526:35)
    at Socket.socketErrorListener (node:_http_client:501:9)
    at Socket.emit (node:events:514:28)
    at emitErrorNT (node:internal/streams/destroy:151:8)
    at emitErrorCloseNT (node:internal/streams/destroy:116:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
notificationsAlerts.ts:42 Continue error: request to http://127.0.0.1:8080/completion failed, reason: connect ECONNREFUSED 127.0.0.1:8080
sestinj commented 7 months ago

@prathameshza is there a curl request to this same URL that you can get to succeed? And you are running llama.cpp's non-OpenAI ./server example, right?
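
(For reference, a minimal check of that endpoint might look like the sketch below; it assumes the stock llama.cpp ./server example listening on 127.0.0.1:8080 over plain HTTP, and the prompt and n_predict values are arbitrary.)

    # POST a small completion request to the llama.cpp server
    curl http://127.0.0.1:8080/completion \
      -H "Content-Type: application/json" \
      -d '{"prompt": "def fibonacci(n):", "n_predict": 32}'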

prathameshza commented 7 months ago

> @prathameshza is there a curl request to this same URL that you can get to succeed? And you are running llama.cpp's non-OpenAI ./server example, right?

I tried to ping the address, which gives an error. Also, I had missed configuring the model, so I followed these steps: config

But I am getting this error:

llm_load_print_meta: EOS token        = 2 '</s>'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: LF token         = 13 '<0x0A>'
llm_load_tensors: ggml ctx size =    0.00 MiB
llama_model_load: error loading model: create_tensor: tensor 'token_embd.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'models/ggml-vocab-llama.gguf'
{"timestamp":1708505822,"level":"ERROR","function":"load_model","line":380,"message":"unable to load model","model":"models/ggml-vocab-llama.gguf"}
terminate called without an active exception
Aborted (core dumped)

After this command:

./server -c 4096 --host 0.0.0.0 -t 16 --mlock -m models/ggml-vocab-llama.gguf
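
(Note: models/ggml-vocab-llama.gguf is a vocabulary-only test file that ships with llama.cpp, which is why the token_embd.weight tensor cannot be found. A sketch of the same command pointed at a full model file would look like the following; the GGUF file name here is hypothetical.)

    # Same flags as above, but -m points at an actual model GGUF (file name is hypothetical)
    ./server -c 4096 --host 0.0.0.0 -t 16 --mlock -m models/wizardcoder-7b.Q4_K_M.gguf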