sourcegraph / cody

Type less, code more: Cody is an AI code assistant that uses advanced search and codebase context to help you write and fix code.
https://cody.dev
Apache License 2.0

bug: Using Cody with vLLM Model Server #2956

Closed davidamacey closed 1 month ago

davidamacey commented 8 months ago

Version

1.2.1

Describe the bug

Using Cody with the user-selected 'unstable-openai' provider, I entered the URL and key for my local vLLM or local Ollama server running my model (each running in a Docker container). The Cody output is as follows:

█ getInlineCompletions:error: error parsing streaming CodeCompletionResponse: Error: {"error":"Sourcegraph Cody Gateway: unexpected status code 400: {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"prompt must end with an \\\"\\n\\nAssistant:\\\" turn\"}}"} Error: error parsing streaming CodeCompletionResponse: Error: {"error":"Sourcegraph Cody Gateway: unexpected status code 400: {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"prompt must end with an \\\"\\n\\nAssistant:\\\" turn\"}}"}
    at Object.complete (/home/user/.vscode-server/extensions/sourcegraph.cody-ai-1.2.1/dist/extension.node.js:174781:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async generatorWithTimeout (/home/user/.vscode-server/extensions/sourcegraph.cody-ai-1.2.1/dist/extension.node.js:165913:29)
    at async fetchAndProcessCompletions (/home/user/.vscode-server/extensions/sourcegraph.cody-ai-1.2.1/dist/extension.node.js:169835:45)
    at async Promise.all (index 0)
    at async zipGenerators (/home/user/.vscode-server/extensions/sourcegraph.cody-ai-1.2.1/dist/extension.node.js:165884:17)
    at async generateCompletions (/home/user/.vscode-server/extensions/sourcegraph.cody-ai-1.2.1/dist/extension.node.js:168930:26)
█ logEvent (telemetry disabled): CodyVSCodeExtension:completion:error VSCode {"properties":{"message":"error parsing streaming CodeCompletionResponse: Error: {\"error\":\"Sourcegraph Cody Gateway: unexpected status code 400: {\\\"type\\\":\\\"error\\\",\\\"error\\\":{\\\"type\\\":\\\"invalid_request_error\\\",\\\"message\\\":\\\"prompt must end with an \\\\\\\"\\\\n\\\\nAssistant:\\\\\\\" turn\\\"}}\"}","count":9},"opts":{"agent":true,"hasV2Event":true}}
█ telemetry-v2: recordEvent: cody.completion/error: {"parameters":{"version":0,"metadata":[{"key":"count","value":9}],"privateMetadata":{"message":"error parsing streaming CodeCompletionResponse: Error: {\"error\":\"Sourcegraph Cody Gateway: unexpected status code 400: {\\\"type\\\":\\\"error\\\",\\\"error\\\":{\\\"type\\\":\\\"invalid_request_error\\\",\\\"message\\\":\\\"prompt must end with an \\\\\\\"\\\\n\\\\nAssistant:\\\\\\\" turn\\\"}}\"}"}},"timestamp":"2024-01-30T22:34:12.957Z"}
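
As far as I can tell, the "prompt must end with an \n\nAssistant: turn" message is the Anthropic-style prompt check on the Sourcegraph Cody Gateway, which suggests the completion request is still being routed through the Gateway rather than to my configured endpoint. For comparison, a minimal sketch of the plain OpenAI-style request the local server accepts (the host, port, model name, and token are placeholders for my setup):

    // Sanity check: call the local vLLM server's OpenAI-compatible /v1/completions
    // endpoint directly, bypassing Cody. localhost:8000 and the model name are
    // placeholders; substitute whatever the container was launched with.
    async function main(): Promise<void> {
      const res = await fetch("http://localhost:8000/v1/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          // vLLM only enforces the token if the server was started with --api-key
          Authorization: "Bearer sk-local-placeholder",
        },
        body: JSON.stringify({
          model: "codellama/CodeLlama-7b-hf",
          prompt: "def fibonacci(n):",
          max_tokens: 64,
          temperature: 0.2,
        }),
      });
      if (!res.ok) {
        throw new Error(`HTTP ${res.status}: ${await res.text()}`);
      }
      console.log(JSON.stringify(await res.json(), null, 2));
    }

    main().catch(console.error);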

Expected behavior

vLLM's input should be handled the same as OpenAI's, since its API is compliant with the OpenAI standard.

I was expecting vLLM to be a drop-in replacement for the OpenAI ChatGPT API, as it has been for other applications I use.

Both vLLM and Ollama are running from the latest Docker images.

Additional context

vLLM is running in a Docker container on my local network.
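
A quick way to confirm the container's OpenAI-compatible API is reachable from the machine running VS Code is to list its models; a sketch, again with a placeholder host and port:

    // Reachability check for the vLLM container's OpenAI-compatible API.
    // http://localhost:8000 is a placeholder; use the host/port the container maps to.
    async function listModels(): Promise<void> {
      const res = await fetch("http://localhost:8000/v1/models");
      const body = await res.json();
      // vLLM reports the model(s) it was launched with under `data`
      console.log(body.data?.map((m: { id: string }) => m.id));
    }

    listModels().catch(console.error);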

olafgeibig commented 8 months ago

Same here, regardless of which OpenAI-compatible API endpoint I use: TogetherAI, Anyscale, LiteLLM, you name it. Even https://api.openai.com/v1 doesn't work.

    "cody.autocomplete.advanced.provider": "unstable-openai",
    "cody.autocomplete.experimental.ollamaOptions": {

        "url": "http://localhost:11434",
        "model": "codellama"
    },
    "cody.autocomplete.advanced.accessToken": "sk-xxx",
    "cody.autocomplete.advanced.serverEndpoint": "https://api.openai.com/v1"
}
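
For reference, all of those endpoints accept the same streaming chat-completions request, so in principle any of them should be usable as the serverEndpoint. A minimal sketch of that request (base URL, token, and model are placeholders; swap in TogetherAI, Anyscale, LiteLLM, or a local server):

    // Minimal streaming chat-completions call against any OpenAI-compatible endpoint.
    // baseUrl, apiKey, and model are placeholders for whichever provider is being tested.
    const baseUrl = "https://api.openai.com/v1";
    const apiKey = "sk-xxx";

    async function streamCompletion(): Promise<void> {
      const res = await fetch(`${baseUrl}/chat/completions`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({
          model: "gpt-3.5-turbo",
          messages: [{ role: "user", content: "Complete this function: def fibonacci(n):" }],
          stream: true,
        }),
      });
      // The response is a server-sent event stream of "data: {...}" lines
      const reader = res.body!.getReader();
      const decoder = new TextDecoder();
      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
        process.stdout.write(decoder.decode(value));
      }
    }

    streamCompletion().catch(console.error);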

Cody VS Code: v1.3.1706715785, macOS 13.6 (22G120)

github-actions[bot] commented 3 months ago

This issue is marked as stale because it has been open for 60 days with no activity. Remove the stale label or add a comment, or this will be closed automatically in 5 days.