maximilianwerk closed this issue 4 years ago
Important to keep in mind: Rankers should be chainable to allow different phases of ranking.
➤ Nan Wang commented:
BertQA is an extractive QA model, which extracts the answer from the context text. This is not exactly what we want.
Describe the feature
Currently, jina only supports ranking matches based on the scores that the retrieval step provides. To fine-tune the ranking, we need the ability to add more query <> match metrics. Possible applications range from a simple edit distance to complex deep learning scoring techniques such as BERT.
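To make the "simple edit distance" case concrete, here is a minimal sketch in plain Python. It does not use any jina API; the function names `levenshtein` and `rerank_by_edit_distance` are hypothetical and only illustrate how a query <> match metric could re-order matches.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def rerank_by_edit_distance(query: str, matches: List[str]) -> List[Tuple[str, int]]:
    """Return (match, distance) pairs, closest match first.

    Hypothetical helper: a real matcher would work on match documents,
    not on bare strings.
    """
    scored = [(m, levenshtein(query, m)) for m in matches]
    return sorted(scored, key=lambda pair: pair[1])
```

Note that a distance is "lower is better", whereas retrieval scores are often "higher is better", so any combination of the two would need an agreed direction or a normalization step.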
Furthermore, it might be necessary to add a `WeightedRanker`, which takes multiple scores encoded in the `score.operands` field and combines them into one top-level score per match.
Proposal
- `ContentMatchDriver`.
- `ContentMatcher`. This should be configurable to either overwrite the match score or add a score in the `score.operands` field.
- A `ContentMatcher` in the form of a simple Levenshtein distance in the hub (named `LevenshteinMatcher`).
- A `ContentMatcher` in the form of a BERT scoring in the hub (named `BertMatcher`).

Adding a `WeightedRanker` as a consecutive step might be necessary. This could be a simple linear combination of the existing scores or something like LambdaMART in the long run. Anyhow, I would rather add this as a follow-up task, so as not to overload this issue.
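The two configurable behaviours (overwrite the match score, or add a score as an operand) and the linear-combination idea can be sketched as follows. This is an illustration under assumptions, not jina's implementation: the `Score` dataclass, its `op_name` field, and the functions `apply_matcher_score` and `weighted_score` are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Score:
    """Hypothetical stand-in for a match score; not jina's message type."""
    value: float
    op_name: str = "retrieval"
    operands: List["Score"] = field(default_factory=list)


def apply_matcher_score(score: Score, new_value: float, name: str,
                        overwrite: bool) -> Score:
    """Either replace the top-level score or record the new one as an operand."""
    if overwrite:
        return Score(value=new_value, op_name=name)
    # Keep the top-level value and append the new score to the operands
    return Score(value=score.value, op_name=score.op_name,
                 operands=score.operands + [Score(new_value, name)])


def weighted_score(score: Score, weights: Dict[str, float]) -> Score:
    """Linear combination of the top-level score and its operands.

    Weights are looked up by op_name; scores without a weight contribute 0.
    """
    parts = [Score(score.value, score.op_name)] + score.operands
    total = sum(weights.get(p.op_name, 0.0) * p.value for p in parts)
    return Score(value=total, op_name="weighted", operands=parts)
```

For example, a retrieval score of 0.8 with an added "levenshtein" operand of 0.5 and equal weights of 0.5 yields a combined top-level score of 0.65, with the inputs kept as operands so that a later ranker in the chain can still inspect them.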