michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP.
https://michaelfeil.github.io/infinity/
MIT License
1.31k stars 96 forks

List should have at most 2048 items after validation: Context Length Error #338

Closed: TimilsinaBimal closed 1 month ago

TimilsinaBimal commented 1 month ago

System Info

0.0.53, Ubuntu 20.04

Reproduction

  1. Pass any text that exceeds the hard limit of 2048 words.

Expected behavior

My model, Alibaba-NLP/gte-multilingual-base, has a context length of 8192, and I know my text can sometimes exceed the 2048 limit. Why am I restricted to 2048 when the model can accept more than that? Shouldn't this be configurable? It would be better if we could pass the model's context length.

michaelfeil commented 1 month ago

https://github.com/michaelfeil/infinity/blob/2271735e56fa98fb4fe774c1e2b4bd98471a4815/libs/infinity_emb/infinity_emb/fastapi_schemas/pymodels.py#L32

You can send around 15*8192 characters (measured in ASCII / UTF-8 characters, not tokens).

2048 is the maximum number of documents per request, not a token limit, and the HTTP protocol has some limits if you go much higher.
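For anyone hitting this validation error because the input list has more than 2048 entries, one workaround is to split the list client-side before sending. Below is a minimal sketch, not code from the Infinity repository. The helper names `batch_documents` and `build_payloads` are hypothetical, the `{"model": ..., "input": [...]}` payload shape is an assumption based on the OpenAI-style embeddings format, and the character limit is assumed to apply to each document individually.

```python
from typing import Dict, Iterator, List

# Limits taken from the comment above; treat both as approximate.
MAX_ITEMS_PER_REQUEST = 2048   # documents per request
APPROX_MAX_CHARS = 15 * 8192   # about 122,880 characters, assumed per document


def batch_documents(
    docs: List[str], max_items: int = MAX_ITEMS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of docs, each with at most max_items entries."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    for start in range(0, len(docs), max_items):
        yield docs[start:start + max_items]


def build_payloads(docs: List[str], model: str) -> List[Dict[str, object]]:
    """Build one request body per batch (hypothetical helper).

    The payload shape is an assumption; adjust it to the endpoint you call.
    """
    too_long = [i for i, doc in enumerate(docs) if len(doc) > APPROX_MAX_CHARS]
    if too_long:
        raise ValueError(f"documents at indices {too_long} exceed the character limit")
    return [{"model": model, "input": batch} for batch in batch_documents(docs)]


if __name__ == "__main__":
    documents = ["some long text"] * 5000
    payloads = build_payloads(documents, "Alibaba-NLP/gte-multilingual-base")
    # Each payload can then be sent as a separate POST request.
    print([len(p["input"]) for p in payloads])
```

Each resulting payload stays within the per-request document count, so a single long text of up to roughly 15*8192 characters can still be sent as one list item.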