Closed shizidushu closed 6 days ago
@shizidushu Do you mean Jina?
All three issues the bot raised above are valid points. I don't think a change makes sense here, but thank you for your contribution!
Both Cohere (https://docs.cohere.com/reference/rerank) and Hugging Face TEI (https://huggingface.github.io/text-embeddings-inference) set default=False for returning documents. The reason is to save bandwidth by default.
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 77.79%. Comparing base (6c1ad68) to head (cd90643).
@michaelfeil
Yes, it is Jina. Sorry for the typo.
Thanks for your explanation. Yes, it is OK to set default=False for return documents.
But I still suggest following the Cohere and Jina convention for the data structure of the document.
Fair point. I interpreted the Cohere API incorrectly. Jina released their rerank API later, afaik.
Potentially, it would make more sense to align the API spec with text-embeddings-inference: https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/rerank
In this case, we could make return_documents an alias for return_text, see https://docs.pydantic.dev/latest/concepts/alias/#aliaspath-and-aliaschoices:
return_text: bool = Field(default=False, validation_alias=AliasChoices('return_text', 'return_documents'))
Would that work for your use-case?
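For illustration, here is a minimal sketch of how such an alias might behave, assuming Pydantic v2. The model name RerankRequest and the query and documents fields are hypothetical and not taken from Infinity's actual code; only return_text and the AliasChoices call come from the suggestion above.

```python
from pydantic import AliasChoices, BaseModel, Field


class RerankRequest(BaseModel):
    """Hypothetical request model; only `return_text` comes from the thread."""

    query: str
    documents: list[str]
    # Accept either spelling on input; the attribute is always `return_text`.
    return_text: bool = Field(
        default=False,
        validation_alias=AliasChoices("return_text", "return_documents"),
    )


# A TEI-style payload and a Cohere-style payload populate the same field.
tei_style = RerankRequest.model_validate(
    {"query": "q", "documents": ["a", "b"], "return_text": True}
)
cohere_style = RerankRequest.model_validate(
    {"query": "q", "documents": ["a", "b"], "return_documents": True}
)
# Omitting both keys falls back to the bandwidth-saving default.
default_style = RerankRequest.model_validate({"query": "q", "documents": ["a"]})
```

Note that this only covers the request flag. It does not change the shape of the response, which is the second half of the concern raised in this thread.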
@michaelfeil No. In my case, I have to change both the default of return_documents and use the 'text' property, i.e., align the behaviour with Jina (otherwise Dify will report errors), because I need to pretend it is a Jina rerank API.
Dify also has an integration for text-embeddings-inference, right? Let's align on TEI's integration. Thanks.
To be clearer: a PR that breaks API compatibility will not be accepted. Sorry.
@michaelfeil Thanks, I got it. I just tried Dify's TEI integration. The Infinity log reports: "INFO: 172.17.0.1:49130 - "GET /v1/info HTTP/1.1" 404 Not Found"
As described in https://jina.ai/reranker/, the Jina rerank API returns each document with a text property. As some projects (e.g., Dify) currently do not support Infinity, letting Infinity have the same default behavior as Jina would be helpful.
APIs to consider:
{return_documents: bool = False} -> {.., documents: str}
{return_documents: bool = False} -> {.., documents: str}
{no_kwarg_for_documents} -> {document: {text: str}}
{return_text: bool = False} -> {.., text: str}
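To make the difference between these shapes concrete, here is a small sketch that builds one result entry per convention. The style names (documents_flag, nested_document, text_flag) and the index and score fields are hypothetical stand-ins for the elided ".." fields; this is not code from Infinity or from any of the APIs discussed.

```python
from typing import Any


def shape_result(index: int, score: float, text: str, style: str,
                 include_text: bool = False) -> dict[str, Any]:
    """Build one rerank result entry in one of the three listed shapes."""
    # `index` and `score` stand in for the elided ".." fields (assumed names).
    result: dict[str, Any] = {"index": index, "score": score}
    if style == "nested_document":
        # No request flag: the text is always nested under `document`.
        result["document"] = {"text": text}
    elif style == "documents_flag":
        # Opt-in via return_documents; the text sits under `documents`.
        if include_text:
            result["documents"] = text
    elif style == "text_flag":
        # Opt-in via return_text; the text sits under `text`.
        if include_text:
            result["text"] = text
    else:
        raise ValueError(f"unknown style: {style}")
    return result


nested = shape_result(0, 0.9, "hello", "nested_document")
flat_default = shape_result(0, 0.9, "hello", "text_flag")
flat_opt_in = shape_result(0, 0.9, "hello", "text_flag", include_text=True)
```

A client written against the nested shape reads result["document"]["text"], so it would fail on either of the flag-based shapes, which matches the Dify errors described earlier in the thread.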