I am trying the llm-vscode extension with llm-ls against a locally hosted endpoint (running a custom fine-tuned model), but the extension still warns that I might get rate limited by HuggingFace. Since inference doesn't run on a HuggingFace server, this warning is unnecessary.
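For context, the setup looks roughly like this in VS Code's settings.json — a minimal sketch; the exact setting key and local URL are assumptions and may differ across llm-vscode versions:

```json
{
  // Hypothetical local llm-ls endpoint; the setting name and URL
  // below are assumptions, not the exact values from my config.
  "llm.modelIdOrEndpoint": "http://localhost:8080/generate"
}
```

Even with the endpoint pointed at localhost like this, the HuggingFace rate-limit warning still appears.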