michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models and clip
https://michaelfeil.github.io/infinity/
MIT License
1.31k stars 96 forks source link

Make infinity rerank more compatible with jina #383

Closed shizidushu closed 6 days ago

shizidushu commented 1 week ago

As described in https://jina.ai/reranker/, jinja rerank

As current some project (e.g., dify) does not support infinity, let infinity has the same default behavior with jinja is helpful.

API's to consider:

michaelfeil commented 1 week ago

@shizidushu Do you mean jina?

The bot's issues above are all three correct points. I think a change does not make sense here, but thank you for your contribution!

https://docs.cohere.com/reference/rerank and huggingface TEI set default=False for return documents. https://huggingface.github.io/text-embeddings-inference. The reason for it, is to save bandwidth by default.

codecov-commenter commented 1 week ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 77.79%. Comparing base (6c1ad68) to head (cd90643).

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #383 +/- ## ========================================== + Coverage 77.78% 77.79% +0.01% ========================================== Files 37 37 Lines 2804 2806 +2 ========================================== + Hits 2181 2183 +2 Misses 623 623 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

shizidushu commented 1 week ago

@shizidushu Do you mean jina?

The bot's issues above are all three correct points. I think a change does not make sense here, but thank you for your contribution!

https://docs.cohere.com/reference/rerank and huggingface TEI set default=False for return documents. https://huggingface.github.io/text-embeddings-inference. The reason for it, is to save bandwidth by default.

@michaelfeil

Yes. It is Jina. Sorry for the typo.

Thanks for your explanation. Yes, it is ok to set default=False for return documents.

But I still suggest to follow the convention of cohere and jina for the data structure of document:

图片

图片

michaelfeil commented 1 week ago

Fair point - I interpreted the API from cohere incorrectly. Jina released their rerank api later, afaik.

Potentially, it would make more sense to align the API spec with Text-embedding-inference. https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/rerank

In this case making an alias for return_text and https://docs.pydantic.dev/latest/concepts/alias/#aliaspath-and-aliaschoices

return_text: str = Field(validation_alias=AliasChoices('return_text', 'return_documents'))

Would that work for your use-case?

shizidushu commented 6 days ago

@michaelfeil No. In my case, I have to change both default of return documents and use 'text' property, i.e., align the behaviour with jinja. (Otherwise Dify will report errors). Because I need to pretend it is a jina rank api

图片

michaelfeil commented 6 days ago

Dify has also an integration for text-embeddings inference, right? Lets align on TEI's integation. Thanks.

To be more clear: A PR that breaks API comparability is not accepted. Sorry.

shizidushu commented 6 days ago

@michaelfeil Thanks. I got it. I just tried dify's tei integration. Infinity log reports: "INFO: 172.17.0.1:49130 - "GET /v1/info HTTP/1.1" 404 Not Found"