TabbyML / tabby

Self-hosted AI coding assistant
https://tabby.tabbyml.com/

Error in Tabby deployment - llama_cpp_bindings::llama: crates/llama-cpp-bindings/src/llama.rs #1666

Open · opened by mprudra 6 months ago

mprudra commented 6 months ago

Describe the bug
I'm noticing the error below in our Tabby deployment; it looks like a memory error. I don't have any additional logs, since we've modified logging to mask input/output information (this was needed for the production deployment). The process exit code was 1.

cmpl-dc7c656b-2a60-4276-8940-2a578d26e198: Generated 2 tokens in 56.007768ms at 35.709332319759646 tokens/s
cmpl-9c5e112f-5024-4d1b-a7b4-5a3f5dab21c2: Generated 2 tokens in 80.706173ms at 24.781251862853164 tokens/s
2024-03-11T23:00:58.450411Z ERROR llama_cpp_bindings::llama: crates/llama-cpp-bindings/src/llama.rs:78: Failed to step: _Map_base::at
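
For context: `_Map_base::at` is the message libstdc++ attaches to the `std::out_of_range` exception that `std::map::at` throws for a missing key, so the error above is most likely a failed map lookup inside llama.cpp surfacing through the Rust bindings. A minimal C++ sketch (illustrative only, not Tabby or llama.cpp code) that reproduces the exact message when built with libstdc++:

```cpp
#include <iostream>
#include <map>
#include <stdexcept>

int main() {
    // Hypothetical stand-in for an internal llama.cpp lookup table;
    // the real structure that fails is not identified in this thread.
    std::map<int, int> table;
    try {
        (void)table.at(42);  // key absent -> libstdc++ throws std::out_of_range
    } catch (const std::out_of_range& e) {
        // Prints: Failed to step: _Map_base::at
        std::cerr << "Failed to step: " << e.what() << '\n';
    }
}
```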

Information about your version
0.5.5

Information about your GPU

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05             Driver Version: 535.154.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
...
...
...
|   3  NVIDIA A100 80GB PCIe          On  | 00000000:E3:00.0 Off |                    0 |
| N/A   44C    P0              74W / 300W |  18141MiB / 81920MiB |      0%   E. Process |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
wsxiaoys commented 6 months ago

Hi, thanks for reporting the issue. Could you please upgrade to 0.9.0 to see if the problem still persists?

mprudra commented 6 months ago

That would require significant effort, so I'll keep it as a last resort. Do you have any idea what could be causing this error? Is it a known issue in some previous versions?

sergei-dyshel commented 5 months ago

Happens for me too, on 0.9.1, when running the delwiv/codefuse-deepseek-33B model. It doesn't happen with the TabbyML/DeepseekCoder-6.7B model.

wsxiaoys commented 5 months ago

> Happens for me too, on 0.9.1, when running the delwiv/codefuse-deepseek-33B model. It doesn't happen with the TabbyML/DeepseekCoder-6.7B model.

Could you also share the log output and your system info?

wsxiaoys commented 5 months ago

Seems related: https://github.com/ggerganov/llama.cpp/issues/3959 https://github.com/ggerganov/llama.cpp/issues/4206

@mprudra could you share the model you were using when encountering the issue?

mprudra commented 5 months ago

> > Happens for me too, on 0.9.1, when running the delwiv/codefuse-deepseek-33B model. It doesn't happen with the TabbyML/DeepseekCoder-6.7B model.
>
> ...
>
> Seems related: ggerganov/llama.cpp#3959 ggerganov/llama.cpp#4206
>
> @mprudra could you share the model you were using when encountering the issue?

I'm also using our fine-tuned version of DeepSeekCoder-33B. Correction: I had noticed it with the 6.7B model.

mprudra commented 5 months ago

Is it the case that Deepseek-Coder models aren't yet supported? (Deepseek coder merge: ggerganov/llama.cpp#5464)

gyxlucy commented 5 months ago

https://github.com/ggerganov/llama.cpp/issues/5981 is the latest issue opened to track DeepSeek support in llama.cpp.