michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.eu/infinity/
MIT License
959 stars 71 forks source link

add embedding quantization interface #247

Closed michaelfeil closed 3 weeks ago