michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.github.io/infinity/
MIT License
1.06k stars 75 forks

Adding torch.compile + fp16 + bettertransformer as CLI arguments #122

Closed: michaelfeil closed this issue 4 months ago

michaelfeil commented 4 months ago

Proposal:

Add torch.compile (bool), dtype (Enum), and bettertransformer (bool) to EngineArgs, so that each can be set from the CLI.
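To make the proposal concrete, here is a minimal sketch of how the three options could be carried on an arguments object and exposed as CLI flags. It is an illustration under assumptions, not the library's actual code: `EngineArgsSketch`, `Dtype`, `parse_cli`, and the flag names are all hypothetical, the field is called `compile` because `torch.compile` is not a valid Python identifier, and the two enum members are placeholders (only fp16 is mentioned in the issue title).

```python
import argparse
from dataclasses import dataclass
from enum import Enum


class Dtype(Enum):
    # Placeholder members for illustration; not the issue's actual list.
    FLOAT32 = "float32"
    FLOAT16 = "float16"


@dataclass(frozen=True)
class EngineArgsSketch:
    # Hypothetical stand-in for EngineArgs, showing only the proposed fields.
    compile: bool = False            # would enable torch.compile on the model
    dtype: Dtype = Dtype.FLOAT32     # would select the inference precision
    bettertransformer: bool = False  # would enable the BetterTransformer path


def parse_cli(argv):
    """Parse the three proposed options from a list of CLI tokens."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--compile", action="store_true")
    parser.add_argument(
        "--dtype",
        choices=[d.value for d in Dtype],
        default=Dtype.FLOAT32.value,
    )
    parser.add_argument("--bettertransformer", action="store_true")
    ns = parser.parse_args(argv)
    return EngineArgsSketch(
        compile=ns.compile,
        dtype=Dtype(ns.dtype),
        bettertransformer=ns.bettertransformer,
    )
```

For example, `parse_cli(["--dtype", "float16", "--compile"])` yields an object with `compile` set, `dtype` equal to `Dtype.FLOAT16`, and `bettertransformer` left at its default. Keeping the boolean flags off by default means existing invocations would behave as before.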

Enum, dtype: