jina-ai / serve

☁️ Build multimodal AI applications with cloud-native stack
https://jina.ai/serve
Apache License 2.0

Enable post-retrieval scoring of document-match pairs #891

Closed maximilianwerk closed 4 years ago

maximilianwerk commented 4 years ago

Describe the feature

Currently, jina only supports ranking matches by the scores that the retrieval step provides. We need a way to add further query <> match metrics in order to fine-tune the ranking. Possible applications range from a simple edit distance to complex deep-learning scoring techniques such as BERT.
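To make the edit-distance case concrete, here is a minimal sketch of scoring query <> match pairs after retrieval. It is plain Python and not jina's API: the `Match` class and the `rescore_by_edit_distance` function are hypothetical names used only for illustration, and using the negative distance as the score is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    """Hypothetical stand-in for a retrieved match; not a jina class."""
    text: str
    score: float  # score assigned by the retrieval step


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def rescore_by_edit_distance(query: str, matches: List[Match]) -> List[Match]:
    """Return new matches scored by negative edit distance, best first."""
    rescored = [Match(m.text, -float(levenshtein(query, m.text))) for m in matches]
    return sorted(rescored, key=lambda m: m.score, reverse=True)
```

A smaller distance means a closer match, so the distance is negated to keep the "higher is better" convention of the retrieval scores.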

Furthermore, it might be necessary to add a WeightedRanker, which takes the multiple scores encoded in the score.operands field and combines them into one top-level score per match.

Proposal

Adding a WeightedRanker as a subsequent step might be necessary. This could be a simple linear combination of the existing scores or, in the long run, something like LambdaMART. In any case, I would rather handle this as a follow-up task so as not to overload this issue.
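As a rough illustration of the linear-combination variant, the sketch below collapses several named scores per match into one top-level score. It assumes the operand scores can be represented as a plain dict from score name to value, which is a simplification of the score.operands field and not its actual layout. The names `combine_scores` and `weighted_rank` are hypothetical.

```python
from typing import Dict, List, Tuple


def combine_scores(operands: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of named scores; scores without a weight contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in operands.items())


def weighted_rank(
    matches: List[Tuple[str, Dict[str, float]]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Collapse each match's operand scores into one score, best first."""
    scored = [(match_id, combine_scores(ops, weights)) for match_id, ops in matches]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, with equal weights of 0.5 on a retrieval score and an edit-distance score, a match with a weaker retrieval score but a much better edit-distance score can overtake the original top hit.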


JoanFM commented 4 years ago

It is important to keep in mind that Rankers should be chainable, to allow for different phases of ranking.
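One possible reading of "chainable" is that every ranking phase has the same input and output shape, so the phases can be composed. The sketch below shows that idea in plain Python. The `Ranker` signature and the `chain`, `top_k` and `token_overlap` helpers are all hypothetical and not part of jina.

```python
from typing import Callable, List, Tuple

ScoredMatch = Tuple[str, float]  # (match text, score); hypothetical shape
Ranker = Callable[[str, List[ScoredMatch]], List[ScoredMatch]]


def chain(*rankers: Ranker) -> Ranker:
    """Build one ranker that feeds each phase's output into the next."""
    def run(query: str, matches: List[ScoredMatch]) -> List[ScoredMatch]:
        for ranker in rankers:
            matches = ranker(query, matches)
        return matches
    return run


def top_k(k: int) -> Ranker:
    """Phase that keeps the k highest-scoring matches."""
    def rank(query: str, matches: List[ScoredMatch]) -> List[ScoredMatch]:
        return sorted(matches, key=lambda m: m[1], reverse=True)[:k]
    return rank


def token_overlap(query: str, matches: List[ScoredMatch]) -> List[ScoredMatch]:
    """Phase that rescores by the number of whitespace tokens shared with the query."""
    q = set(query.lower().split())
    rescored = [(text, float(len(q & set(text.lower().split())))) for text, _ in matches]
    return sorted(rescored, key=lambda m: m[1], reverse=True)


# A cheap phase prunes the candidates, then a second phase re-ranks the survivors.
pipeline = chain(top_k(2), token_overlap)
```

Because each phase only sees a list of scored matches, an expensive scorer can be placed late in the chain, where it runs on fewer candidates.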

sync-by-unito[bot] commented 4 years ago

➤ Nan Wang commented:

BertQA is an extractive QA model that extracts the answer from the context text. This is not exactly what we want.