michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP
https://michaelfeil.github.io/infinity/
MIT License

422 error if /embeddings input is a string #98

Closed: OlegIvaniv closed this issue 7 months ago

OlegIvaniv commented 7 months ago

Hello,

There seems to be a discrepancy between Infinity's /embeddings API endpoint and OpenAI's. OpenAI supports both string and string[] as input, while Infinity returns 422 Unprocessable Entity for a simple string input. This prevents us from using the Infinity endpoint with LangChain's OpenAI embeddings, because embedQuery sends a string input.

Any chance this could be updated to support both, in line with the OpenAI spec?

Appreciate your work!

Oleg
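For readers hitting the same 422, the two input shapes described above can be sketched as a client-side workaround: wrap a bare string in a list before sending, so the server only ever receives the list form. This is a minimal sketch, not code from Infinity or LangChain; `build_embeddings_payload` and the model name `example-model` are hypothetical, and only the `input` and `model` field names follow the OpenAI-style request body.

```python
import json
from typing import List, Union


def build_embeddings_payload(text: Union[str, List[str]], model: str) -> str:
    """Return a JSON request body whose `input` is always a list of strings.

    Hypothetical helper for illustration; not part of Infinity or LangChain.
    """
    # Wrap a bare string so the server only ever sees the list form.
    inputs = [text] if isinstance(text, str) else list(text)
    return json.dumps({"model": model, "input": inputs})


# "example-model" is a placeholder, not a real model name.
single = build_embeddings_payload("hello world", model="example-model")
batch = build_embeddings_payload(["hello", "world"], model="example-model")
```

Both calls produce a body with a list-valued `input`, which is the shape the endpoint accepted before the fix.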

michaelfeil commented 7 months ago

Hey Oleg,

You're right, OpenAI accepts several input formats: batches (well done), single sentences (debatable), and tokens (pre-tokenized and sent as IDs; imho a design error, since they will never be able to update their tokenizer).

I wrote an Infinity integration for Python LangChain: https://github.com/langchain-ai/langchain/tree/master/libs/community/langchain_community/embeddings

Looking at your syntax, are you using JS/TS?

michaelfeil commented 7 months ago

@OlegIvaniv This is now fixed on the main branch with the latest pydantic upgrades. #99
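The general idea of accepting both shapes on the server side can be sketched in plain Python. This is an illustration only: the actual change is in #99 and relies on pydantic, and nothing below is taken from it. The function name `normalize_embedding_input` is hypothetical.

```python
from typing import List, Union


def normalize_embedding_input(value: Union[str, List[str]]) -> List[str]:
    """Coerce an `input` field to a list of strings, rejecting anything else.

    Hypothetical sketch; Infinity's real validation is done with pydantic.
    """
    if isinstance(value, str):
        # A single sentence becomes a batch of one.
        return [value]
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        # Already a batch of strings; return a copy.
        return list(value)
    # Anything else is the case a server would answer with a 422.
    raise ValueError("input must be a string or a list of strings")
```

With this kind of normalization, downstream code can treat every request as a batch regardless of which form the client sent.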

OlegIvaniv commented 7 months ago

> @OlegIvaniv This is now fixed on the main branch with the latest pydantic upgrades. #99

Works like a charm. Thank you for such a quick response!