michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models and clip
https://michaelfeil.github.io/infinity/
MIT License
1.32k stars 97 forks source link

Refactor batching to os.fork / multiprocessing #57

Open michaelfeil opened 9 months ago

michaelfeil commented 9 months ago

This is a draft PR - unlikley to get merged. The performance overhead for inter-processes communication is too high.