huggingface / text-embeddings-inference

A blazing fast inference solution for text embeddings models
https://huggingface.co/docs/text-embeddings-inference/quick_tour
Apache License 2.0

DriverError(CUDA_ERROR_NO_DEVICE, "no CUDA-capable device is detected") #208

Open momomobinx opened 8 months ago

momomobinx commented 8 months ago

System Info

GPU: RTX 3090

Reproduction

1. Start the container:

docker run \
    -d \
    --name embedding \
    --gpus '"device=2"' \
    --env CUDA_VISIBLE_DEVICES=2 \
    -p 7862:80 \
    -v $(pwd):/data \
    --entrypoint "sleep infinity" \
    ghcr.io/huggingface/text-embeddings-inference:86-1.1 \
    --model-id "/data/bge-large-zh" 
  1. docker logs -f embedding
    2024-03-20T10:07:36.394366Z  INFO text_embeddings_router: router/src/main.rs:122: Args { model_id: "/dat*/***-****e-zh", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, hf_api_token: None, hostname: "ecb590a6a50b", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), json_output: false, otlp_endpoint: None, cors_allow_origin: None }
    2024-03-20T10:07:36.402419Z  INFO text_embeddings_router: router/src/lib.rs:166: Maximum number of tokens per request: 512
    2024-03-20T10:07:36.404038Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:23: Starting 32 tokenization workers
    2024-03-20T10:07:36.470736Z  INFO text_embeddings_router: router/src/lib.rs:191: Starting model backend
    thread 'main' panicked at /root/.cargo/git/checkouts/cudarc-2602ad613d9c0487/c388e72/src/driver/safe/core.rs:54:24:
    called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_NO_DEVICE, "no CUDA-capable device is detected")
  2. docker run -it --rm --gpus '"device=2"' --env CUDA_VISIBLE_DEVICES=2 --entrypoint "sh" ghcr.io/huggingface/text-embeddings-inference:86-1.1 -c "nvidia-smi"
  3. (screenshot attached in the original issue; not preserved)

Expected behavior

The container starts and runs normally.

OlivierDehaene commented 8 months ago

Remove --env CUDA_VISIBLE_DEVICES=2
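
For context on why this helps: with `--gpus '"device=2"'`, only that one host GPU is exposed to the container, and CUDA inside the container enumerates it as device 0. Setting `CUDA_VISIBLE_DEVICES=2` on top of that asks for a third device that does not exist there, which matches the `CUDA_ERROR_NO_DEVICE` panic. A minimal sketch of the command with the variable removed follows. The names `GPU_INDEX`, `IMAGE`, and `CMD` are hypothetical; the image tag, port, and paths are copied from the report. The `--entrypoint` override from the report is also left out so that the image's default entrypoint starts the router, which is an assumption about the intent.

```shell
# Hypothetical variable names; values are copied from the report above.
GPU_INDEX=2   # host GPU index to expose to the container
IMAGE=ghcr.io/huggingface/text-embeddings-inference:86-1.1

# --gpus selects the host GPU. Inside the container that GPU is device 0,
# so CUDA_VISIBLE_DEVICES is intentionally not passed.
CMD="docker run -d --name embedding --gpus '\"device=${GPU_INDEX}\"' -p 7862:80 -v \"$(pwd):/data\" ${IMAGE} --model-id /data/bge-large-zh"

# Print the command for review instead of running it.
printf '%s\n' "$CMD"
```

Once the printed command looks right, it can be pasted into a terminal or run with `eval "$CMD"`. This only addresses the device-index mismatch; it makes no claim about other driver or image compatibility problems.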

momomobinx commented 8 months ago

3. --env CUDA_VISIBLE_DEVICES=2

thread 'main' panicked at /root/.cargo/git/checkouts/cudarc-2602ad613d9c0487/c388e72/src/driver/safe/core.rs:54:24:
called `Result::unwrap()` on an `Err` value: DriverError(CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE, "forward compatibility was attempted on non supported HW")