michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP.
https://michaelfeil.github.io/infinity/
MIT License

Device = None #358

Closed by wolfassi123 2 weeks ago

wolfassi123 commented 2 weeks ago

System Info

Using the following docker image michaelf34/infinity:0.0.55

Reproduction

I used the following command to start my container, as described in the GitHub repo:

port=7997
model1=mixedbread-ai/mxbai-rerank-large-v1
volume=$PWD/data

docker run --name infinity_mixed -it --gpus all \
 -v $volume:/app/.cache \
 -p $port:$port \
 -e CUDA_VISIBLE_DEVICES=0 \
 michaelf34/infinity:latest \
 v2 \
 --model-id $model1 \
 --port $port

When I run the container, I get the following log line:

infinity_emb INFO: model="mixedbread-ai/mxbai-rerank-large-v1" selected, using engine="torch"and device="None"

When I check nvidia-smi, I can see that the container is actually attached to the GPU, yet the log states that no device is attached.

Expected behavior

Ideally, the log should report the device that I am actually connected to (in my case, device:0).

Is this a bug, or am I doing something wrong?

michaelfeil commented 2 weeks ago

Device: None is printed by the sentence-transformers package (https://github.com/UKPLab/sentence-transformers). It translates to auto, which uses cuda:0 by default.

You did nothing wrong. If you want the device to be explicit, you can enforce it with --device cuda.
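
To illustrate the fallback described above, here is a minimal sketch of how an unset device could be resolved. The helper resolve_device and its cuda_available parameter are hypothetical names used only for illustration. This is not the actual code of sentence-transformers or infinity, and it assumes the only fallback when no GPU is present is the CPU.

```python
from typing import Optional


def resolve_device(requested: Optional[str], cuda_available: bool) -> str:
    """Hypothetical helper: map the requested device to the one actually used.

    None and "auto" are treated as "pick for me"; any other value is
    passed through unchanged.
    """
    if requested is None or requested == "auto":
        # Assumption: prefer the first GPU when CUDA is available, else CPU.
        return "cuda:0" if cuda_available else "cpu"
    return requested


# No device requested, GPU visible: the first GPU is chosen.
print(resolve_device(None, cuda_available=True))
# No device requested, no GPU visible: falls back to the CPU.
print(resolve_device(None, cuda_available=False))
# Explicit request (as with --device cuda) is used as given.
print(resolve_device("cuda", cuda_available=True))
```

In a real environment, the cuda_available value would typically come from torch.cuda.is_available(). Under this reading, a log showing device="None" only means that no device was requested explicitly, not that no GPU is in use.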