michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text-embedding, reranking, and CLIP models.
https://michaelfeil.github.io/infinity/
MIT License

Load local model #204

Closed · jmoney closed this issue 5 months ago

jmoney commented 5 months ago

Is there a way to override where to download / look for local models? I see the argument --model-name-or-path, but when I specify the string /ai-model-cache it complains that it's not a valid model name. I'm just trying to point it at a directory of pre-downloaded models and then use --served-model-name to actually pick one.

If this is not a supported feature then please let me know.

michaelfeil commented 5 months ago

@jmoney Pass the full path to the model directory itself, e.g. --model-name-or-path /home/michael/hf_cache/bge-small-15-custom

Please verify that you indeed downloaded a model at that location, e.g. in bash

$ [ -f "/home/michael/hf_cache/bge-small-15-custom/model.safetensors" ] && echo "File exists." || echo "File does not exist."

--served-model-name is just the nickname the model is served under. It does not change loading behavior.
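Putting the two points above together, a minimal pre-flight sketch: confirm the weights file exists before pointing infinity at the directory. The path and model name are illustrative (a temp directory stands in for a real download), and the entry-point name in the final comment is an assumption — check your installed version's CLI help.

```shell
# Illustrative only: create a stand-in for a locally downloaded model.
MODEL_DIR="$(mktemp -d)/bge-small-15-custom"
mkdir -p "$MODEL_DIR"
touch "$MODEL_DIR/model.safetensors"   # placeholder for real weights

# The same existence check as above, against the directory you intend to serve.
[ -f "$MODEL_DIR/model.safetensors" ] && echo "File exists." || echo "File does not exist."

# If the check passes, serve it with the flags discussed in this thread
# (entry-point name may differ across versions; verify against your install):
#   infinity_emb --model-name-or-path "$MODEL_DIR" --served-model-name my-embedder
```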