michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.github.io/infinity/
MIT License
1.06k stars 75 forks source link

Return actual token count on forward pass #92

Closed michaelfeil closed 22 hours ago

michaelfeil commented 5 months ago

Returning the actual token count that are used after truncating.

michaelfeil commented 5 months ago

89

michaelfeil commented 22 hours ago

completed --lengths-via-tokenize