huggingface / text-embeddings-inference

A blazing fast inference solution for text embeddings models
https://huggingface.co/docs/text-embeddings-inference/quick_tour

Model is downloaded each time I run the container #314

Open · djanito opened this issue 3 months ago

djanito commented 3 months ago

System Info

text-embeddings-inference:1.3.0

Reproduction

  1. Use the Docker script provided (a sketch follows this list)
  2. Run the container so it downloads the model
  3. Re-run the container and observe the model being downloaded again
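
For reference, a minimal sketch of that kind of invocation, following the quick-tour pattern; the model id is illustrative, and the `-v` mount is what should let a second run reuse the already downloaded weights:

```shell
# Sketch only: the model id is illustrative; the image tag matches the version
# reported in this issue. Mounting a host path (or named volume) at /data is
# what allows subsequent runs to reuse the downloaded weights.
model=BAAI/bge-reranker-base
volume=$PWD/data

docker run --gpus all -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-embeddings-inference:86-1.3.0 \
    --model-id $model
```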

Expected behavior

I want the model to be downloaded once and then reused on subsequent runs. However, the model appears to be downloaded each time I run the container (see the attached screenshot of the logs).

Here is my docker-compose configuration:

```yaml
reranker:
  image: ghcr.io/huggingface/text-embeddings-inference:86-1.3.0
  restart: always
  ports:
    # ...

volumes:
  model_cache_huggingface:
```
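
Since the port mappings and the service-level volume mount are missing from the snippet above, one thing worth checking is whether the named volume actually ends up mounted at /data inside the running container, which is the cache path the image expects per the quick tour. A sketch, assuming the compose service is named `reranker` and is currently running:

```shell
# Assumes the compose project is up and the service is named "reranker".
# Prints the container's mounts so you can confirm that model_cache_huggingface
# is really mounted at /data, where the server looks for cached weights.
docker compose ps -q reranker | xargs docker inspect --format '{{json .Mounts}}'
```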

Is this normal? Even when using MODEL_ID=/data/${RERANKER_MODEL:-}, I get an error downloading the model.
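
One possible explanation, offered only as an assumption: when the container itself fills the cache, /data follows the Hugging Face hub cache layout (models--<org>--<name>/snapshots/<revision>), so /data/<org>/<name> is typically not a plain directory of model files, and pointing MODEL_ID at it would fail. A quick way to see what the volume actually contains (the volume name may be prefixed with the compose project name; check `docker volume ls`):

```shell
# List the contents of the named volume with a throwaway container.
# Adjust the volume name if compose prefixed it with the project name.
docker run --rm -v model_cache_huggingface:/data alpine ls -la /data
```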

OlivierDehaene commented 3 months ago

The log message is a bit misleading here: since it "downloaded" the model in less than a millisecond, we can safely assume that the model was not in fact downloaded again but that the cache was re-used. I will change the log message.
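
If you want to double-check that the cache really is being reused, one rough way (a sketch, reusing the named volume from the compose file above) is to compare the modification times of the cached weight files before and after restarting the service; they should not change if nothing was re-downloaded:

```shell
# The weight files' modification times should stay the same across restarts
# if the cache is reused. Adjust the volume name and the file pattern
# (*.safetensors vs *.bin) to match your setup.
docker run --rm -v model_cache_huggingface:/data alpine \
    find /data -name '*.safetensors' -exec stat -c '%y %n' {} \;
```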