TabbyML / tabby

Self-hosted AI coding assistant
https://tabbyml.com

DeepSeek-Coder-V2-Lite stuck #3212

Open m0wer opened 1 month ago

m0wer commented 1 month ago

Describe the bug

DeepSeek-Coder-V2-Lite is stuck at starting.

The web server does not even start after 10,000 s of "Starting". Other models work fine. If I kill the llama-server process inside the container and start it manually without --disable-log, it works: the web server starts and provides completions.

If I then kill the manually started llama-server process, the default one spawns and loads into VRAM, but it does not reply to requests.

Information about your version

Happens with 0.16.1, 0.17.0 and 0.18.0-rc4.

Information about your GPU

NVIDIA GeForce RTX 3090

Additional context

Running a fresh instance with docker compose:

services:
  tabby:
    image: tabbyml/tabby:latest
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: '5'
          memory: 30G
        reservations:
          devices:
            - driver: nvidia
              #count: all
              device_ids: ['1']
              capabilities:
                - gpu
    labels:
      traefik.enable: true
      traefik.http.routers.tabby.rule: Host(`domain`)
      traefik.http.routers.tabby.entrypoints: websecure
      traefik.http.services.tabby.loadbalancer.server.port: 8080
    volumes:
      - /data/tabby/data/:/data
    command: ["serve", "--model", "DeepSeek-Coder-V2-Lite", "--device", "cuda"]
    environment:
      RUST_LOG: debug

There's nothing relevant in the logs: only request logs, no errors.

wsxiaoys commented 1 month ago

Hi - can you share the output of nvidia-smi when starting tabby? (or ollama with the model)

m0wer commented 1 month ago

Sure!

Inside the container, the process list looks empty:

# docker exec -it tabby-tabby-1 nvidia-smi
Sat Sep 28 09:35:34 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        On  | 00000000:07:00.0 Off |                  N/A |
| 56%   58C    P2             129W / 280W |  14269MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Outside the container, there is unrelated stuff running on the 2060, and the tabby processes are on the 3090:

Sat Sep 28 11:36:14 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2060        On  | 00000000:06:00.0 Off |                  N/A |
| 34%   46C    P2              37W / 128W |   1819MiB /  6144MiB |      5%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        On  | 00000000:07:00.0 Off |                  N/A |
| 56%   58C    P2             129W / 280W |  14556MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    144128      C   frigate.detector.tensorrt                   246MiB |
|    0   N/A  N/A    144560      C   ffmpeg                                      182MiB |
|    0   N/A  N/A   1832724      C   ffmpeg                                      182MiB |
|    0   N/A  N/A   1900156      C   ffmpeg                                      156MiB |
|    0   N/A  N/A   1900946      C   ffmpeg                                      156MiB |
|    0   N/A  N/A   2774390      C   ffmpeg                                      182MiB |
|    0   N/A  N/A   2807270      C   ffmpeg                                      182MiB |
|    0   N/A  N/A   3208761      C   ffmpeg                                      182MiB |
|    0   N/A  N/A   3873962      C   ffmpeg                                      172MiB |
|    0   N/A  N/A   4073311      C   ffmpeg                                      172MiB |
|    1   N/A  N/A    154641      C   /opt/tabby/bin/llama-server                 284MiB |
|    1   N/A  N/A   2544258      C   ...unners/cuda_v12/ollama_llama_server    13442MiB |
|    1   N/A  N/A   4082688      C   /opt/tabby/bin/llama-server                 818MiB |
+---------------------------------------------------------------------------------------+

In the container logs with debugging enabled:

2024-09-28T09:37:05.162980Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.162988Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163051Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163058Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163106Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163111Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163164Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163170Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163229Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163237Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163303Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163311Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163370Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163375Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163427Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163433Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163489Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163495Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163550Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163557Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163617Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163624Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163690Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163698Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163760Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163767Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163831Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163840Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163897Z DEBUG reqwest::connect: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/reqwest-0.12.4/src/connect.rs:497: starting new connection: http://127.0.0.1:30889/
2024-09-28T09:37:05.163902Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:631: connecting to 127.0.0.1:30889
2024-09-28T09:37:05.163957Z DEBUG hyper_util::client::legacy::connect::http: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-util-0.1.5/src/client/legacy/connect/http.rs:634: connected to 127.0.0.1:30889
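
For anyone trying to reproduce this: the log above only shows tabby opening connections to 127.0.0.1:30889, not whether anything answers. Below is a rough sketch of a probe that could be run inside the container to tell "nothing listening" apart from "listening but never replying". It assumes that port 30889 is the spawned llama-server and that it exposes llama.cpp's `/health` endpoint (which, as far as I know, returns 503 while the model is loading); neither is confirmed by the logs, and the `probe` helper is just an illustrative name.

```python
import socket
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 2.0) -> str:
    """Classify an HTTP endpoint as 'ok', 'http <code>', 'timeout' or 'unreachable'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "ok" if resp.status == 200 else f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503 while loading).
        return f"http {exc.code}"
    except urllib.error.URLError as exc:
        # Connection-level failure; a timeout during connect is wrapped here.
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            return "timeout"
        return "unreachable"
    except (TimeoutError, socket.timeout):
        # Connected, but no response arrived within the timeout.
        return "timeout"
    except OSError:
        return "unreachable"


# Assumed usage, with the port taken from the debug log above:
#   print(probe("http://127.0.0.1:30889/health", timeout=5.0))
```

A result of "timeout" would match the symptom described (process loaded in VRAM but not replying), while "unreachable" would suggest the process never bound the port.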