TabbyML / tabby

Self-hosted AI coding assistant
https://tabby.tabbyml.com/

llama_cpp_server::supervisor: crates/llama-cpp-server/src/supervisor.rs:94: llama-server <chat> exited with status code 1 #2544

Closed danny-su closed 2 months ago

danny-su commented 3 months ago

Describe the bug
/opt/homebrew/bin/tabby serve --device metal --port 8088 --model TabbyML/CodeGemma-2B --chat-model Deepseek-V2-Lite-Chat --parallelism 1

(screenshot: Warp terminal output, 2024-06-28 18:24:33)

Information about your version
Output of tabby --version: tabby 0.13.0

Information about your GPU
Apple Silicon (Metal/MPS); nvidia-smi not applicable.


wsxiaoys commented 3 months ago

Seems the prompt template is broken for DeepSeek-V2-Lite-Chat - looking into it.

wsxiaoys commented 3 months ago

After investigation, I noted that the llama.cpp version we pinned doesn't support the DeepSeek-V2 style chat template. As a result, the model isn't usable with the latest Tabby distribution, so I've removed it from the registry.

As a workaround, please follow the discussion in https://github.com/TabbyML/tabby/issues/2451 to see how to connect Tabby to an external HTTP endpoint that supports DeepSeek-V2.
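For readers landing here later, a minimal sketch of what that workaround can look like in ~/.tabby/config.toml, assuming a Tabby build that supports HTTP model backends. The table and key names below are illustrative and vary across versions, so treat them as placeholders and check issue #2451 and the docs for your release:

```toml
# Hypothetical sketch: point Tabby's chat model at an external
# OpenAI-compatible server (e.g. a newer llama.cpp llama-server
# running DeepSeek-V2-Lite-Chat). Key names depend on Tabby version.
[model.chat.http]
kind = "openai/chat"
model_name = "deepseek-v2-lite-chat"   # name exposed by the external server
api_endpoint = "http://localhost:8080/v1"
api_key = ""                           # empty for an unauthenticated local server
```

With such a config, tabby serve would delegate chat requests to the external endpoint instead of spawning its own pinned llama-server.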

kibabyte commented 3 months ago

Any ETA on when this will be usable out of the box?

wsxiaoys commented 3 months ago

Tabby does bi-weekly patch releases; once the fix lands in upstream llama.cpp, we should be able to integrate it.

wsxiaoys commented 2 months ago

Please take a look at 0.13.1-rc.3, where DeepSeek-V2-Lite-Chat is now ready.

kibabyte commented 2 months ago

Thank you very much, will look into it!

kibabyte commented 2 months ago

I have been unable to find "0.13.1-rc.3" in this repo. Mind pointing me in the right direction please?

wsxiaoys commented 2 months ago

Release link: https://github.com/TabbyML/tabby/releases/tag/v0.13.1-rc.6
Docker image: https://hub.docker.com/layers/tabbyml/tabby/0.13.1-rc.6/images/sha256-5e2e5c524f00124e1f491390e43a2f455cd766868f0fd2be5dffeb214e773532?context=explore

kibabyte commented 2 months ago

What model_id would I need for the command on Windows? I'm currently trying to test but can't find the proper ID to put in.

wsxiaoys commented 2 months ago

Hi - it's not in the official registry, but you can create one yourself, or use my forked registry at https://github.com/wsxiaoys/registry-tabby (model id: wsxiaoys/Deepseek-V2-Lite-Chat).
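For context, a Tabby model id like wsxiaoys/Deepseek-V2-Lite-Chat resolves to an entry in a models.json file in the registry-tabby repo under that GitHub user. A hedged sketch of what such an entry looks like; the field values here are placeholders, not copied from the actual fork, so consult the linked repo for the authoritative entry:

```json
{
  "name": "Deepseek-V2-Lite-Chat",
  "urls": [
    "https://example.com/path/to/deepseek-v2-lite-chat.Q4_K_M.gguf"
  ],
  "sha256": "<checksum of the GGUF file>",
  "chat_template": "<DeepSeek-V2 chat template string>"
}
```

With the registry in place, the model can then be referenced on the command line, e.g. tabby serve --chat-model wsxiaoys/Deepseek-V2-Lite-Chat.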

moqimoqidea commented 2 months ago

@wsxiaoys

Hi, in my tests llama.cpp b3267 outputs nonsensical "GGGGG" content, similar to the behavior in llama.cpp issue 8254; it looks like the pinned llama.cpp needs to be upgraded again.

llama.cpp issue 8254: Bug: Failed to load quantizied DeepSeek-V2-Lite-Chat model

wsxiaoys commented 2 months ago

I also noticed the issue, and it really seems to be a problem with the system message. I've sent out a patch https://github.com/TabbyML/tabby/pull/2596 to fix it for 0.13.1 (it's removed in the main branch anyway). Will tag a new rc soon.

moqimoqidea commented 2 months ago

The patch you're referencing appears to target the chat feature. I tested it with llama.cpp b3267 and tabby server v0.13.1-rc6, and the templates in the chat feature seem to be sent as-is instead of the user's content. I also noticed that the file you're changing is openai_chat.rs, so I'm not sure it affects output when llama.cpp is the engine. I'm reporting this to share information and will look into the problem later.

Let me clarify one thing: in my previous test, I tested code completion using DeepSeek-Coder-V2-Lite-Base, which output the nonsensical "GGGGG" content. To share my progress: with the newer llama.cpp b3334, my case works fine. FYI.

wsxiaoys commented 2 months ago

The release https://github.com/TabbyML/tabby/releases/tag/v0.13.1-rc.8 is ready for testing with DeepSeek-V2-Lite-Chat. Please give it a try.

Hi @moqimoqidea - if you still encounter errors with other models, please file a new issue for tracking. Thank you!

wsxiaoys commented 2 months ago

Fixed in release https://github.com/TabbyML/tabby/releases/tag/v0.13.1