AnswerDotAI / rerankers

A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
Apache License 2.0

feat: add inputs template for T5 rankers #16

Closed: marcospiau closed this PR 3 months ago

marcospiau commented 3 months ago

Motivation

Hi, first of all, thanks for your work on this library.

I'm submitting this PR because I will soon publish new MonoT5 rerankers for the Portuguese language, and the ease of use of this library would be very beneficial for users. I trained my models using the translated template "Pergunta: {query} Documento: {text} Relevante:", which requires a small modification to the original code.

I tried to keep the modifications minimal, but let me know if this is out of scope for the library or too specific to my use case.

Changes

  1. Template Specification for T5 Rankers: Adds the ability to specify the input template used by T5 rankers. If no template is specified, it defaults to "Query: {query} Document: {text} Relevant:" (feature).
  2. Default Value for false_token: Sets a default value for false_token when the model is not listed in PREDICTION_TOKENS (fix).
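
To make the two changes above concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: the helper names `build_inputs` and `get_prediction_tokens`, the example model name, and the token strings are all assumptions made for illustration. Only the default template string and the `PREDICTION_TOKENS` name come from the description.

```python
# Default template quoted in the PR description.
DEFAULT_INPUTS_TEMPLATE = "Query: {query} Document: {text} Relevant:"

# Illustrative stand-in for the library's PREDICTION_TOKENS mapping.
# The model name and token strings below are assumptions, not real entries.
PREDICTION_TOKENS = {
    "example-org/monot5-base": ("▁false", "▁true"),
}

# Hypothetical fallback used when a model has no entry in the mapping.
FALLBACK_TOKENS = ("▁false", "▁true")


def build_inputs(query, docs, template=None):
    """Format one prompt per document, using the default template if none is given."""
    template = template or DEFAULT_INPUTS_TEMPLATE
    return [template.format(query=query, text=doc) for doc in docs]


def get_prediction_tokens(model_name):
    """Return (false_token, true_token), falling back to defaults for unknown models."""
    return PREDICTION_TOKENS.get(model_name, FALLBACK_TOKENS)


# Usage with the translated template from the Motivation section.
pt_template = "Pergunta: {query} Documento: {text} Relevante:"
prompts = build_inputs("capital do Brasil?", ["Brasília é a capital."], pt_template)
print(prompts[0])
# Pergunta: capital do Brasil? Documento: Brasília é a capital. Relevante:

false_token, true_token = get_prediction_tokens("some-org/unlisted-model")
```

Falling back through `dict.get` means a model that is missing from the mapping still gets a usable pair of tokens instead of raising a `KeyError`, which is the behavior item 2 describes.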

PS: this PR fixes #17.

marcospiau commented 3 months ago

Awesome @bclavie! Thanks!