michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.eu/infinity/
MIT License

Loading models from local path #217

Closed vladimirmujagic closed 1 month ago

vladimirmujagic commented 1 month ago

Feature request

Is it really not possible to load models from a local path instead of a Hugging Face repo?

Motivation

This would allow the server to run in environments where downloading from Hugging Face is not possible.

Your contribution

I can help if needed, yes.

michaelfeil commented 1 month ago

@vladimirmujagic Thanks for opening this issue. Loading from a local path is possible; however, a model consists of many files besides model.safetensors. The only thing I can recommend is to prime your cache, e.g. by patching in your model. Do #204 or #205 help?
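
Since a model directory holds more than the weights file, it may help to check a local folder before pointing the server at it. The following is only a minimal sketch using the Python standard library. The function name `missing_model_files` and the list of file names are hypothetical, chosen for illustration; the files a given model actually needs depend on the model and are not specified in this thread.

```python
from pathlib import Path

# Hypothetical list of companion files, for illustration only.
# The real set depends on the model and is an assumption here.
EXPECTED_FILES = (
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "model.safetensors",
)


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir.

    The result keeps the order of `expected`, so the output is stable.
    """
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_model_files("/path/to/model")` returns an empty list when every listed file is present, and otherwise names the files still to be copied into the folder.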

michaelfeil commented 1 month ago

Closing as a duplicate of #205. Please comment there.