Open QIN2DIM opened 9 months ago
Describe the bug
Running Tabby under Docker Compose fails with:
CUDA error 711 at /root/workspace/crates/llama-cpp-bindings/llama.cpp/ggml-cuda.cu:6826: peer mapping resources exhausted
Information about your version
tabbyml/tabby:latest 23bdb48b7956 2 weeks ago
Information about your GPU
Additional context
version: '3.5'
services:
  tabby:
    restart: always
    image: tabbyml/tabby
    command: serve --model TabbyML/DeepseekCoder-6.7B --device cuda
    volumes:
      - "$HOME/.tabby:/data"
    environment:
      NVIDIA_VISIBLE_DEVICES: all
      HTTPS_PROXY: http://127.0.0.1:2081
      HTTP_PROXY: http://127.0.0.1:2081
    ports:
      - 9999:8080
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 9  # Setting it to 'all' or 10 will cause an error, otherwise everything will be normal.
              capabilities: [gpu]
Since Tabby only uses a single GPU, could you try passing a single GPU device (e.g. device 0) to the Docker container and see if that works?
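For reference, a minimal sketch of what that could look like. It assumes your Docker Compose version supports the `device_ids` field under `deploy.resources.reservations.devices`, and the ID '0' is only an example; pick whichever GPU you want Tabby to use. Changing `NVIDIA_VISIBLE_DEVICES` to match is an assumption on my side and may not be needed in your setup.

```yaml
services:
  tabby:
    image: tabbyml/tabby
    command: serve --model TabbyML/DeepseekCoder-6.7B --device cuda
    environment:
      # Assumption: restrict the visible devices to the same GPU as below.
      NVIDIA_VISIBLE_DEVICES: "0"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              # Expose only GPU 0 instead of reserving by count.
              # `count` and `device_ids` are mutually exclusive, so `count` is dropped here.
              device_ids: ['0']
              capabilities: [gpu]
```

The rest of the service definition (volumes, ports, proxy settings) can stay as in the original file. If the error goes away with a single device, that would point to the peer-access setup across all 10 GPUs as the trigger.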