michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.eu/infinity/
MIT License
975 stars, 72 forks

Safetensors or to be sure not to load pickled weights #164

Closed: wllhf closed this issue 3 months ago

wllhf commented 3 months ago

I'm really grateful for Infinity! Thanks a lot! We are thinking about using it in a project with high security demands. Is it somehow possible to load models only via safetensors (or other safe formats) and to exclude loading pickled weights? Is there a way to know which format was used for loading when starting up a model via Infinity?

michaelfeil commented 3 months ago

@wllhf Thanks for the positive feedback.

Infinity inherits the behaviour of SentenceTransformers: it automatically filters certain files and, AFAIK, does not download the JAX (Flax) or Rust weights etc. Also, the default behaviour in the infinity CLI is --trust_remote_code=True, which already allows code shipped with a model repository to run. I think that if you were to restrict loading to safetensors only, you would shrink your attack surface against various other attacks, such as man-in-the-middle attacks, only slightly. It's only useful for slightly increased security when you're testing out 10s/100s of models.
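For the "which format is on disk" part of the question, one option is to audit a locally downloaded model directory before pointing Infinity at it. The following is a minimal sketch, not part of Infinity or SentenceTransformers: the function names `audit_model_dir` and `assert_safetensors_only` are hypothetical, and the suffix list is an assumption about which extensions usually indicate pickle-based weights (a `.bin` file is not necessarily a pickle). It only inspects file names; it does not report which file the loader actually picks.

```python
from pathlib import Path
from typing import Dict, List

# Assumption: suffixes that commonly indicate pickle-based weight files.
PICKLE_SUFFIXES = {".bin", ".pt", ".pth", ".pkl", ".pickle", ".ckpt"}
SAFE_SUFFIXES = {".safetensors"}


def audit_model_dir(model_dir: str) -> Dict[str, List[str]]:
    """Group weight files under model_dir by format, using file suffixes only."""
    report: Dict[str, List[str]] = {"safe": [], "pickle": []}
    root = Path(model_dir)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        rel = path.relative_to(root).as_posix()
        if suffix in SAFE_SUFFIXES:
            report["safe"].append(rel)
        elif suffix in PICKLE_SUFFIXES:
            report["pickle"].append(rel)
    return report


def assert_safetensors_only(model_dir: str) -> None:
    """Raise if the directory holds pickle-style weights or no safetensors file."""
    report = audit_model_dir(model_dir)
    if report["pickle"]:
        raise RuntimeError(
            "pickle-style weight files found: " + ", ".join(report["pickle"])
        )
    if not report["safe"]:
        raise RuntimeError("no .safetensors weight file found")
```

If the check passes, the directory contains no files with the listed pickle-style suffixes, so a loader reading from that directory has only the safetensors weights available. It says nothing about custom code in the repository, which is a separate concern from the weight format.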

Here are 2 security suggestions from an MLE, if the project is a high-security one:

michaelfeil commented 3 months ago

@wllhf Does this solve the question for you?

wllhf commented 3 months ago

Yes. Thanks!