TabbyML / tabby

Self-hosted AI coding assistant
https://tabbyml.com

Permission denied (os error 13) with linux standalone #3425

Closed Mte90 closed 18 hours ago

Mte90 commented 1 day ago

Describe the bug

⠙     0.081 s   Starting...The application panicked (crashed).
Message:  Failed to start llama-server <embedding> with command Command { std: "/media/mte90/Doh-cker/tabby_x86_64-manylinux2014-cuda122/llama-server" "-m" "/home/mte90/.tabby/models/TabbyML/Nomic-Embed-Text/ggml/model-00001-of-00001.gguf" "--cont-batching" "--port" "30888" "-np" "1" "--log-disable" "--ctx-size" "4096" "-ngl" "9999" "--embedding" "--ubatch-size" "4096", kill_on_drop: true }: Permission denied (os error 13)
Location: crates/llama-cpp-server/src/supervisor.rs:84

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
⠼    13.948 s   Starting...
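For context, "Permission denied (os error 13)" on spawn is the EACCES a process gets when the target file lacks the execute bit, even if it is readable. A hypothetical reproduction (not Tabby code, throwaway path):

```shell
# Sketch: a file that is readable but not executable fails to spawn
# with "Permission denied", matching os error 13 (EACCES).
printf '#!/bin/sh\necho ok\n' > /tmp/demo-server
chmod 644 /tmp/demo-server   # rw-r--r--: readable, no execute bit
/tmp/demo-server             # shell reports "Permission denied", exit 126
chmod +x /tmp/demo-server
/tmp/demo-server             # now prints: ok
```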

Information about your version

0.20.0

Information about your GPU

Thu Nov 14 12:25:09 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:01:00.0  On |                  N/A |
| 32%   40C    P5             17W /  170W |    1098MiB /  12288MiB |     22%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
[...]
+-----------------------------------------------------------------------------------------+

Additional context

So I just downloaded the latest version for CUDA and ran it with ./tabby serve --model StarCoder-1B --chat-model Qwen2-1.5B-Instruct --device cuda. I see that error, but it still proceeds.

tabby_x86_64-manylinux2014-cuda122  ls /home/mte90/.tabby/models/TabbyML/Nomic-Embed-Text/ggml/model-00001-of-00001.gguf
Permissions  Size User  Group Date Modified Name
.rw-rw-r--  139Mi mte90 mte90 14 nov 11:50   /home/mte90/.tabby/models/TabbyML/Nomic-Embed-Text/ggml/model-00001-of-00001.gguf

The model file has read permission and I executed tabby as a normal user (other models were also downloaded, but the error occurs only for this one).
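Note that read permission on the model is enough for loading it, but the llama-server binary that Tabby spawns needs the execute bit. A quick diagnostic sketch (binary path taken from the panic message above):

```shell
# Sketch: check the execute bit on the bundled llama-server
# (path copied from the panic message in this report).
BIN=/media/mte90/Doh-cker/tabby_x86_64-manylinux2014-cuda122/llama-server
test -x "$BIN" && echo "executable" || echo "missing execute bit"
```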

zwpaper commented 23 hours ago

Hi @Mte90, thank you for trying Tabby.

It seems that the llama-server lacks execution permission. Could you try the following and then restart tabby:

chmod +x llama-server

We have fixed this issue, but it hasn't been released yet. The fix will be included in the next release.
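If other bundled helper binaries in the extracted archive lost their execute bits too, a broader workaround sketch (directory name from the report above; using file(1) to pick out ELF binaries is an assumption, so model and text files are left alone):

```shell
# Sketch: mark every ELF binary in the extracted standalone
# directory executable, skipping non-binary files.
cd /media/mte90/Doh-cker/tabby_x86_64-manylinux2014-cuda122
for f in *; do
  if file -b "$f" | grep -q '^ELF'; then
    chmod +x "$f"
  fi
done
```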

Mte90 commented 18 hours ago

I confirm that this fixes it :-D