huggingface / llm-vscode

LLM powered development for VSCode
Apache License 2.0
1.23k stars 133 forks

windows-specific path problems? #92

Closed ArneRonneburg closed 9 months ago

ArneRonneburg commented 1 year ago

When I try to use the extension with a custom code-completion server, I get the following error: `Die Syntax für den Dateinamen, Verzeichnisnamen oder die Datenträgerbezeichnung ist falsch. (os error 123)` ("The filename, directory name, or volume label syntax is incorrect."). I am using Windows; might the problem be related to that? How can I solve it?

McPatate commented 1 year ago

That's odd, I'm pretty sure I used platform agnostic path operators to generate paths in the extension.

ArneRonneburg commented 1 year ago

I verified that it's a Windows issue: the same settings work fine in an Ubuntu VM.

McPatate commented 1 year ago

Thanks for checking.

If someone wants to take a look, please do! I'll eventually come back to this when I have the time.

Nicricflo commented 1 year ago

I am getting the same error message (but in English). I am specifically trying to load the LLM from a custom endpoint.

McPatate commented 1 year ago

Are you on Windows @Nicricflo?

Nicricflo commented 1 year ago

@McPatate yes I am

McPatate commented 1 year ago

Ah, well for now consider it broken on Windows; I'll have to investigate what's going on.

jklj077 commented 1 year ago

I got the same error on Windows. I think it is related to this line: when a custom endpoint is used and the tokenizer config is set to download from a Hugging Face repository, the code uses the model value (which starts with http://... or https://...) instead of the repository name to create the directory for the tokenizer file, so the directory name contains characters such as `:` that are illegal in Windows paths.
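To illustrate the diagnosis above, here is a minimal sketch (not the extension's actual code) of how a raw endpoint URL used as a directory name produces a name Windows rejects, plus one possible sanitization; the `model` value and the `sanitize` helper are hypothetical:

```python
# Sketch of the suspected bug: a custom endpoint URL used verbatim as a
# cache-directory name. ':' and '/' are not allowed in Windows path
# components, which is what triggers "os error 123".
model = "https://host:8080/generate"  # hypothetical custom endpoint

def sanitize(name: str) -> str:
    """Replace characters that are invalid in Windows filenames."""
    invalid = '<>:"/\\|?*'
    return "".join("_" if c in invalid else c for c in name)

print(sanitize(model))  # a name that is safe as a directory component
```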

After setting the tokenizer config correctly following https://github.com/huggingface/llm-vscode#tokenizer (not the third option, of course), this issue is now gone for me. However, it would probably be better to show a hint in case of misconfiguration.
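For reference, working tokenizer configurations from the README's tokenizer section look roughly like this in settings.json (the repository name and path are only examples):

```json
// load the tokenizer from a Hugging Face repository
"llm.tokenizer": {
    "repository": "bigcode/starcoder"
}

// or point at a local tokenizer.json file
"llm.tokenizer": {
    "path": "/path/to/tokenizer.json"
}
```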

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 30 days with no activity.

VfBfoerst commented 11 months ago

> I got the same error on Windows. I think it is related to this line. If using a custom endpoint and the tokenizer config is set to download from a Hugging Face repository, the code uses model (starting with http://... or https://...) instead of the repository to create a directory to store the tokenizer file, which will contain illegal characters (:) for Windows paths.
>
> After setting the tokenizer config correctly following https://github.com/huggingface/llm-vscode#tokenizer (not the third one, ofc), this issue is now gone for me.

I can confirm that. After setting the tokenizer to `null`, the error is gone and the extension sends requests to my API (a text-generation-inference container). It doesn't return the expected completions yet, but I can see autocompleted code in my workspace.

VfBfoerst commented 11 months ago

Got it to work after manually downloading tokenizer.json from the Hugging Face Hub (in my case, StarCoder). After that, I set the tokenizer according to the documentation:

    "llm.tokenizer": {
        "path": "C:\\Path\\to\\my\\downloaded\\tokenizer.json"
    },

The doubled backslashes (`\\`) seem to be important.
Now completion works as expected!
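The doubled backslashes are a JSON requirement rather than an extension quirk: `\` starts an escape sequence in JSON strings, so each Windows path separator must be written as `\\`. A quick check in Python:

```python
import json

# In JSON, backslash starts an escape sequence, so a Windows path must
# double every backslash; the decoded value has single separators again.
cfg = json.loads(r'{"path": "C:\\Path\\to\\tokenizer.json"}')
print(cfg["path"])  # C:\Path\to\tokenizer.json
```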

RachelShalom commented 11 months ago

I have the same issue on Windows using a StarCoder custom endpoint, so I downloaded the tokenizer and added this to the settings:

    "llm.tokenizer": {
        "path": "C:\\Users\\user_name\\Downloads\\tokenizer.json"
    },

and now I get the following error:
`error sending request for url (https://host:port/generate): error trying to connect: received corrupt message of type InvalidContentType`

any idea on how to solve this?

McPatate commented 11 months ago

@RachelShalom this seems unrelated, please open a new issue.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 30 days with no activity.

darolt commented 10 months ago

I am having the same problem here: it works perfectly on Linux (Ubuntu) but not on Windows, where I get the same error message. I'm on v0.1.6.

darolt commented 10 months ago

I added the tokenizer as a local file as @RachelShalom did. That seems to have avoided this particular bug, but it uncovered another one that only happens on Windows: `error sending request for url (my endpoint url here): error trying to connect: invalid peer certificate: UnknownIssuer`.

Again, the same token and config works on Ubuntu.

r5r3 commented 10 months ago

Same problem here with a custom endpoint and extension version v0.1.6. On Linux and macOS it works perfectly; on Windows I get (os error 123).

github-actions[bot] commented 9 months ago

This issue is stale because it has been open for 30 days with no activity.

McPatate commented 9 months ago

The issue should be fixed in 0.5.x, can someone confirm it's working now?

r5r3 commented 9 months ago

For me, this error is solved. Thanks a lot!