michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models and clip
https://michaelfeil.github.io/infinity/
MIT License

docker compose with folder with models #215

Closed by shuther 4 months ago

shuther commented 5 months ago

Feature request

  1. Allow the models to be picked up from/downloaded into a folder.
  2. Propose a maintained docker compose with this implementation (this folder should be mapped to the host)
  3. Provide some curl examples for calling the API. I am confused by HF_HOME and --model-name-or-path, but it seems this already exists.
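To make the third point concrete, here is a minimal Python sketch of the kind of request a curl example would send. It only builds the request and does not contact a server. The base URL `http://localhost:7997`, the `/embeddings` path, the OpenAI-style `model`/`input` payload, and the model name are all assumptions for illustration; none of them is confirmed in this thread.

```python
import json
import urllib.request

# Assumption: the server listens on localhost:7997 and exposes an
# OpenAI-style /embeddings route. Adjust both to your deployment.
BASE_URL = "http://localhost:7997"


def build_embedding_request(texts, model, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for text embeddings."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # The model name is only an example; use whichever model the server loaded.
    req = build_embedding_request(["hello world"], "BAAI/bge-small-en-v1.5")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

With a running server, `urllib.request.urlopen(req)` would send the request; the same URL, header, and JSON body can be passed to curl with `-X POST`, `-H`, and `-d`.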

Motivation

Maintaining the model library would be easier if the models were not part of the Docker image (this would save space and restart time).

Your contribution

difficult for now

michaelfeil commented 4 months ago

Thanks for the feedback. I'll consider a docker run command that maps the huggingface cache and adjacent files. The download should, however, be managed by infinity.
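As a rough illustration of what such a command could look like, the Python sketch below assembles a `docker run` argument list that mounts a host folder as the Hugging Face cache, so infinity downloads into it on first start and reuses it on restart. The image name `michaelf34/infinity:latest`, the container path `/app/.cache`, the port `7997`, and the model name are assumptions, not something confirmed in this thread; `HF_HOME` and `--model-name-or-path` are the names mentioned in the request above.

```python
import shlex
from pathlib import Path

# Assumptions for illustration only; check the project README for real values.
IMAGE = "michaelf34/infinity:latest"   # assumed image name
CONTAINER_CACHE = "/app/.cache"        # assumed cache path inside the container


def build_docker_run(host_cache: Path, model: str, port: int = 7997) -> list[str]:
    """Return a docker run argv that maps a host folder to the HF cache."""
    return [
        "docker", "run", "--rm",
        # Publish the (assumed) API port on the host.
        "-p", f"{port}:{port}",
        # Map the host folder into the container so models persist on the host.
        "-v", f"{host_cache}:{CONTAINER_CACHE}",
        # Point the Hugging Face libraries at the mounted folder.
        "-e", f"HF_HOME={CONTAINER_CACHE}",
        IMAGE,
        # Flag name taken from the issue text above.
        "--model-name-or-path", model,
    ]


if __name__ == "__main__":
    cmd = build_docker_run(Path("/srv/models"), "BAAI/bge-small-en-v1.5")
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch side-effect free; passing the list to `subprocess.run` would start the container.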

I don't think docker-compose is a good contemporary approach, and I am declining to maintain a docker-compose file for now, until I see very good arguments. I'd rather have a k8s template.