IonCaza opened this issue 5 months ago
From the short discussion in Discord it looks like this was solved by changing
`"provider": "llama.cpp"`
to
`"provider": "openai"`
I assume llama.cpp has become more OpenAI-compatible over time.
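For anyone landing here, a minimal sketch of what that change looks like in `config.json`, assuming a local llama.cpp server exposing its OpenAI-compatible API at `http://localhost:8080/v1` (the port, title, and model name here are placeholders, not values from this thread):

```json
{
  "tabAutocompleteModel": {
    "title": "local-llama.cpp",
    "provider": "openai",
    "model": "deepseek-coder",
    "apiBase": "http://localhost:8080/v1"
  }
}
```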
My problem is that the left-side chat works fine, but in the editor I get a 404. `"tabAutocompleteModel": { "title": "deepseek-1b", "provider": "openai", "model": "deepseek-coder", "apiBase": "https://api.deepseek.com/v1", "apiKey": "sk-xxxxx", "template": "deepseek" }`
@IonCaza fry69's response is correct. The "llama.cpp" provider was intended for a time when their built-in server wasn't OpenAI compatible. From the response you're getting there it seems that they are now fully OpenAI-compatible, but looking at their docs that isn't yet clear: https://github.com/ggerganov/llama.cpp/tree/master/examples/server#api-endpoints
If this becomes official we'll definitely just remove the llama.cpp provider, or make it a subclass of "openai"
@fykyx521 DeepSeek's API does not support autocomplete, even though the model does. I know this is surprising and I'm trying to get in touch with them to add support. Until then, you can either run deepseek coder locally with Ollama, or you can use Codestral, which is currently the best autocomplete model available, and is free until August
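For reference, a rough sketch of the local Ollama route, assuming you have pulled a DeepSeek Coder base model (the exact tag, e.g. `deepseek-coder:6.7b-base`, depends on what you pulled and is only an assumption here):

```json
{
  "tabAutocompleteModel": {
    "title": "DeepSeek Coder (local)",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b-base"
  }
}
```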
They have added a configuration example that works. https://github.com/deepseek-ai/awesome-deepseek-integration/tree/main/docs/continue
{
  "tabAutocompleteModel": {
    "title": "DeepSeek-V2",
    "model": "deepseek-coder",
    "apiKey": "sk-xxx",
    "contextLength": 8192,
    "apiBase": "https://api.deepseek.com",
    "completionOptions": {
      "maxTokens": 4096,
      "temperature": 0,
      "topP": 1,
      "presencePenalty": 0,
      "frequencyPenalty": 0
    },
    "provider": "openai",
    "useLegacyCompletionsEndpoint": false
  },
  "tabAutocompleteOptions": {
    "useCache": true,
    "maxPromptTokens": 2048,
    "template": "Please teach me what I should write in the `hole` tag, but without any further explanation and code backticks, i.e., as if you are directly outputting to a code editor. It can be codes or comments or strings. Don't provide existing & repetitive codes. If the provided prefix and suffix contain incomplete code and statement, your response should be able to be directly concatenated to the provided prefix and suffix. Also note that I may tell you what I'd like to write inside comments. \n{{{prefix}}}<hole></hole>{{{suffix}}}\n\nPlease be aware of the environment the hole is placed, e.g., inside strings or comments or code blocks, and please don't wrap your response in ```. You should always provide non-empty output.\n"
  }
}
I noticed that I had to restart VS Code for this to work. I think pretty much whenever I edit `tabAutocompleteModel` I have to restart VS Code.
Description
See the 404 above. The Continue VSCode extension, configured with the model config above, is hitting the wrong endpoint (`completion` instead of `completions`). The same request works fine from the Swagger docs, as you can see from the successful request above the 404.
To reproduce
1. Host deepseek-33b with the latest llama-cpp-python.
2. Configure per the steps above.
3. Try sending a request, and you'll receive a 404.
Log output
No response