Open m0wer opened 1 month ago
Hi - can you share the output of nvidia-smi when starting tabby (or ollama with the model)?
Sure!
Inside the container, the process list is empty:
# docker exec -it tabby-tabby-1 nvidia-smi
Sat Sep 28 09:35:34 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3090 On | 00000000:07:00.0 Off | N/A |
| 56% 58C P2 129W / 280W | 14269MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
Outside the container, there are unrelated processes on the 2060 and the tabby processes on the 3090:
Sat Sep 28 11:36:14 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2060 On | 00000000:06:00.0 Off | N/A |
| 34% 46C P2 37W / 128W | 1819MiB / 6144MiB | 5% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce RTX 3090 On | 00000000:07:00.0 Off | N/A |
| 56% 58C P2 129W / 280W | 14556MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 144128 C frigate.detector.tensorrt 246MiB |
| 0 N/A N/A 144560 C ffmpeg 182MiB |
| 0 N/A N/A 1832724 C ffmpeg 182MiB |
| 0 N/A N/A 1900156 C ffmpeg 156MiB |
| 0 N/A N/A 1900946 C ffmpeg 156MiB |
| 0 N/A N/A 2774390 C ffmpeg 182MiB |
| 0 N/A N/A 2807270 C ffmpeg 182MiB |
| 0 N/A N/A 3208761 C ffmpeg 182MiB |
| 0 N/A N/A 3873962 C ffmpeg 172MiB |
| 0 N/A N/A 4073311 C ffmpeg 172MiB |
| 1 N/A N/A 154641 C /opt/tabby/bin/llama-server 284MiB |
| 1 N/A N/A 2544258 C ...unners/cuda_v12/ollama_llama_server 13442MiB |
| 1 N/A N/A 4082688 C /opt/tabby/bin/llama-server 818MiB |
+---------------------------------------------------------------------------------------+
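For anyone comparing against their own setup, here is a rough sketch of pulling the same per-GPU process list in machine-readable form instead of reading the table by eye. It assumes it runs on the host (inside the container the process list came back empty, as shown above) and that `nvidia-smi --query-compute-apps` is available with the `gpu_uuid`, `pid`, `process_name` and `used_memory` fields. The helper names `parse_compute_apps` and `total_mib` are made up for this example.

```python
import csv
import io
import subprocess
from collections import defaultdict

# CSV query for compute processes; "nounits" drops the " MiB" suffix.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=gpu_uuid,pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Group `gpu_uuid, pid, process_name, used_memory` rows by GPU UUID."""
    by_gpu = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        gpu_uuid, pid, name, used = (field.strip() for field in row)
        # Memory can be reported as a non-numeric placeholder; count it as 0.
        used_mib = int(used) if used.isdigit() else 0
        by_gpu[gpu_uuid].append((int(pid), name, used_mib))
    return dict(by_gpu)


def total_mib(apps, needle):
    """Sum the memory of processes whose name contains `needle`."""
    return sum(mib for _, name, mib in apps if needle in name)


if __name__ == "__main__":
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    for gpu, apps in parse_compute_apps(out).items():
        print(gpu, "llama-server MiB:", total_mib(apps, "llama-server"))
```

Matching on "llama-server" (with a hyphen) keeps the ollama runner, whose name ends in ollama_llama_server, out of the tabby total.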
In the container logs with debugging enabled:
2024-09-28T09:37:05.162980Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.162988Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163051Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163058Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163106Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163111Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163164Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163170Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163229Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163237Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163303Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163311Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163370Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163375Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163427Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163433Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163489Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163495Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163550Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163557Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163617Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163624Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163690Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163698Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163760Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163767Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163831Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163840Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163897Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163902Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163957Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:634: connected to 127.0.0.1:30889
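Since the debug output is mostly the same two messages repeated, a small script can condense it into counts per endpoint. This is only a sketch: it assumes the three message shapes visible in the excerpt above ("starting new connection", "connecting to", "connected to"), and the names `PATTERNS` and `summarize` are made up for this example.

```python
import re
import sys
from collections import Counter

# One pattern per message shape seen in the excerpt; each captures host:port.
PATTERNS = {
    "starting": re.compile(r"starting new connection: http://(\S+?)/?$"),
    "connecting": re.compile(r"connecting to (\S+)$"),
    "connected": re.compile(r"connected to (\S+)$"),
}


def summarize(lines):
    """Count connection events per (endpoint, kind) pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), kind)] += 1
                break
    return counts


if __name__ == "__main__":
    # Usage: pipe the container logs into this script on stdin.
    for (endpoint, kind), n in sorted(summarize(sys.stdin).items()):
        print(f"{endpoint} {kind}: {n}")
```

Many "starting"/"connecting" events against very few "connected" ones for the same endpoint would be consistent with the client retrying while the backend is not yet accepting requests.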
Describe the bug
DeepSeek-Coder-V2-Lite is stuck at starting.
The web server does not even start after 10,000 s of "Starting". Other models work fine. If I kill the llama-server process inside the container and start it manually without --disable-log, it works: the web server starts and provides completions.
If I then kill the manually started llama-server process, the default one spawns and is loaded into VRAM, but it does not reply to requests.
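To reproduce the manual restart described above, one way to recover the exact arguments of the running llama-server is to read them from /proc and drop --disable-log. This is a sketch under assumptions: it is Linux only, it treats --disable-log as a standalone flag that takes no value, and the helper names `read_cmdline` and `without_flag` are made up for this example.

```python
import shlex
import sys
from pathlib import Path


def read_cmdline(pid):
    """Return the argv of a running process from /proc (Linux only)."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    # Arguments are NUL-separated, with a trailing NUL after the last one.
    return [arg.decode(errors="replace") for arg in raw.rstrip(b"\0").split(b"\0")]


def without_flag(argv, flag="--disable-log"):
    """Return argv with every occurrence of a value-less flag removed."""
    return [arg for arg in argv if arg != flag]


if __name__ == "__main__":
    # Usage: python this_script.py <pid of llama-server inside the container>
    argv = without_flag(read_cmdline(int(sys.argv[1])))
    print(shlex.join(argv))  # shell-quoted command line to run by hand
```

The printed command can then be run by hand after killing the original process, so that the only difference from the default launch is the missing --disable-log.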
Information about your version
Happens with 0.16.1, 0.17.0 and 0.18.0-rc4.
Information about your GPU
NVIDIA GeForce RTX 3090
Additional context
Running a fresh instance with docker compose:
There's nothing relevant in the logs: only request logs, no errors.