LM Studio partial support

DiAifU commented 8 months ago

What happened?

Hi,

LM Studio, with a locally running model, partially works when I use the OpenAI provider with the Ollama or Llama.cpp preset and change the port to LM Studio's default (1234). Generation runs, but every request ends with the error "Unknown API response. Code: 200, Body:".

Could you please tell me how to fix this, or add support for the responses generated by LM Studio?

Thanks for the great plugin!

Nicolas

Relevant log output or stack trace

Unknown API response. Code: 200, Body:

Steps to reproduce

1. Run LM Studio with a model loaded locally
2. Start a server in LM Studio
3. Configure CodeGPT to target that server (OpenAI + Ollama or Llama.cpp preset + change the port to 1234)

CodeGPT version

2.4.0

Operating System

Windows

carlrobertoh commented 8 months ago

Thank you for reporting!

The error appears to be related to the OkHttp library and how it processes event streams. For some reason, LM Studio doesn't seem to append an empty line at the end of the final response, which causes OkHttp to fail. I am not yet sure what the fix is.
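To illustrate why the missing empty line matters: in a server-sent event stream, a blank line is what terminates each event, so a strict reader never sees the last event as complete if the stream ends right after its `data:` line. The Python sketch below is only an illustration of that framing rule, not OkHttp's actual implementation; `parse_sse` and its `flush_at_eof` flag are hypothetical names, and only `data:` fields are handled.

```python
def parse_sse(raw: str, flush_at_eof: bool = False) -> list[str]:
    """Return the data payload of each event found in an SSE stream.

    A blank line dispatches the pending event. With flush_at_eof=False
    (strict), an event that is not followed by a blank line is dropped.
    With flush_at_eof=True (lenient), it is dispatched at end of input.
    """
    events: list[str] = []
    pending: list[str] = []
    for line in raw.splitlines():
        if line == "":
            # Blank line: the pending event is complete.
            if pending:
                events.append("\n".join(pending))
                pending = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            pending.append(value)
        # Other fields (event:, id:, retry:, comments) are ignored in this sketch.
    if pending and flush_at_eof:
        events.append("\n".join(pending))
    return events


well_formed = "data: hello\n\ndata: [DONE]\n\n"
truncated = "data: hello\n\ndata: [DONE]"  # no blank line after the last event

print(parse_sse(well_formed))
print(parse_sse(truncated))
print(parse_sse(truncated, flush_at_eof=True))
```

With the strict setting, the truncated stream loses its final event, which is roughly the situation described above; the lenient setting shows one possible client-side workaround.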

raivisdejus commented 8 months ago

Maybe @lmstudio-ai can sort this out on their end...

carlrobertoh commented 8 months ago

Related issue: https://github.com/langchain4j/langchain4j/issues/670

The error `java.lang.IllegalArgumentException: byteCount < 0: -1` can be reproduced by removing the empty lines from the mocked response: LocalCallbackServer.java#L111

lucacri commented 6 months ago

Same problem here. Any suggestions for a temporary fix?

xardbaiz commented 3 days ago

Hello everyone, and especially @lucacri and @DiAifU. In case this is still relevant: because LM Studio exposes an OpenAI-like server API, CodeGPT partially supports it via the "Custom OpenAI" provider. I just checked it.

Settings

1. Start the LM Studio server and note the name of the loaded model

2. Select the `Custom OpenAI` provider in CodeGPT

3. Point your `Custom OpenAI` provider to localhost

Don't forget to enter the exact name of the model loaded in LM Studio.

Important! For code completions, your model must support the FIM (fill-in-the-middle) pattern.
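For anyone who wants to check the server outside the IDE first, here is a minimal Python sketch of the kind of request these settings produce. It assumes LM Studio's default port (1234, as mentioned earlier in this thread) and the `/v1/chat/completions` path; `build_chat_request` and `send` are hypothetical helper names, and the model name must be replaced with whatever is loaded in your LM Studio.

```python
import json
import urllib.request

# Assumption: LM Studio server on its default port with an OpenAI-style API.
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming chat completion request."""
    payload = {
        "model": model,  # must match the model name loaded in LM Studio
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the first choice's message text."""
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running LM Studio server):
#   print(send(build_chat_request("stable-code-instruct-3b", "Say hello")))
```

If this request fails with a model error, the model name in the settings is the first thing to double-check.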

Result demo

CodeGPT screen

LM Studio

LM Studio logs

```sl4j
2024-11-13 13:55:52 [INFO] Received POST request to /v1/chat/completions with body: {
  "stream": true,
  "model": "stable-code-instruct-3b",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI programming assistant.\nFollow the user's requirements carefully & to the letter.\nYour responses should be informative and logical.\nYou should always adhere to technical information.\nIf the user asks for code or technical questions, you must provide code suggestions and adhere to technical information.\nIf the question is related to a developer, you must respond with content related to a developer.\nFirst think step-by-step - describe your plan for what to build in pseudocode, written out in great detail.\nThen output the code in a single code block.\nMinimize any other prose.\nKeep your answers short and impersonal.\nUse Markdown formatting in your answers.\nMake sure to include the programming language name at the start of the Markdown code blocks.\nAvoid wrapping the whole response in triple backticks.\nThe user works in an IDE built by JetBrains which has a concept for editors with open files, integrated unit test support, and output pane that shows the output of running the code as well as an integrated terminal.\nYou can only give one reply for each conversation turn."
    },
    {
      "role": "user",
      "content": "Please write example java MapStruct mapper"
    }
  ],
  "temperature": 0.1,
  "max_tokens": 1024
}
2024-11-13 13:55:52 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-11-13 13:55:52 [INFO] [LM STUDIO SERVER] Streaming response...
2024-11-13 13:55:52 [INFO] [LM STUDIO SERVER] First token generated. Continuing to stream response..
2024-11-13 13:56:02 [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation... (if the model is busy processing the prompt, it will finish first))
2024-11-13 13:56:02 [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-11-13 13:56:02 [INFO] Finished streaming response
```
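Since the request above uses `"stream": true`, the client receives the answer as a series of chunks rather than one body. As a rough Python sketch of how such chunks are usually stitched together in the OpenAI-style streaming format: each `data:` payload is a JSON object with a `choices[].delta.content` fragment, and `[DONE]` marks the end. The sample payloads below are hand-written for illustration, not captured from LM Studio, and `collect_stream_text` is a hypothetical helper name.

```python
import json


def collect_stream_text(data_payloads: list[str]) -> str:
    """Join the content fragments of OpenAI-style streaming chunks."""
    parts: list[str] = []
    for payload in data_payloads:
        if payload.strip() == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # The first chunk often carries only the role, with no content.
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample payloads (assumed shape, not real LM Studio output).
sample = [
    '{"choices": [{"delta": {"role": "assistant"}}]}',
    '{"choices": [{"delta": {"content": "Hello"}}]}',
    '{"choices": [{"delta": {"content": ", world"}}]}',
    "[DONE]",
]
print(collect_stream_text(sample))  # prints "Hello, world"
```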

carlrobertoh commented 2 days ago

Awesome, thank you!

We could add a preset template for it, similar to how others are done: https://github.com/carlrobertoh/CodeGPT/blob/master/src/main/kotlin/ee/carlrobert/codegpt/settings/service/custom/template/CustomServiceTemplate.kt

xardbaiz commented 2 days ago

@carlrobertoh OK, let me try to find time for this. I'll add it; expect a PR from me :)
