UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Which approach to go for: cross-encoder vs. bi-encoder #1707

Open ud2195 opened 2 years ago

ud2195 commented 2 years ago

Hi, thank you for creating such an awesome library.

My question: I have a task where each data point consists of an abbreviation, a sense, and a text.

For example: AB, abortion, "The patient does have a known history of having had a missed AB."

These abbreviations can have multiple senses; for example, 'AB' can mean 'abortion' but can also mean 'blood group in the ABO system'.

I have many such abbreviations, each with multiple senses. In this scenario, if I have to predict which sense an abbreviation carries given the full text, what should I use?

If I use a cross-encoder, that means treating it as a sentence-pair classification task: passing each whole pair through BERT at once, multiple times, and comparing the full text against each and every sense of the abbreviation. For example:

For the abbreviation AB and the sentence "patient does have a known history of having had a missed AB":

sentence1                                                      sentence2                   label
patient does have a known history of having had a missed AB   abortion                    1
patient does have a known history of having had a missed AB   blood group in ABO system   0

I am not able to justify this approach at scale, since every candidate sense has to be passed through BERT together with the text at query time.
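To make the concern concrete, here is a minimal sketch of that cross-encoder setup (the base model name, labels, and hyperparameters are placeholder assumptions, not anything prescribed by the library):

```python
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Placeholder base model; num_labels=1 gives a single relevance score per pair.
model = CrossEncoder("bert-base-uncased", num_labels=1)

train_examples = [
    InputExample(
        texts=["patient does have a known history of having had a missed AB", "abortion"],
        label=1.0,
    ),
    InputExample(
        texts=["patient does have a known history of having had a missed AB", "blood group in ABO system"],
        label=0.0,
    ),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=100)

# Inference: every candidate sense must be paired with the text and re-scored
# through the full model, which is the part that is costly at scale.
scores = model.predict([
    ["patient does have a known history of having had a missed AB", "abortion"],
    ["patient does have a known history of having had a missed AB", "blood group in ABO system"],
])
```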

OR

Alternatively, I can go with the bi-encoder approach, where I train a SentenceTransformer model with https://www.sbert.net/docs/package_reference/losses.html#multiplenegativesrankingloss (since I only have positive pairs), following https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/nli/training_nli_v2.py.
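For reference, a minimal training sketch of that setup (the base model name and hyperparameters are placeholder assumptions):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Placeholder base model; any suitable pretrained checkpoint would do.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Positive (sense, text) pairs only; other pairs in the batch act as negatives.
train_examples = [
    InputExample(texts=["abortion", "patient does have a known history of having had a missed AB"]),
    # ... one InputExample per positive (sense, text) pair
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```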

Then, at inference time, given the sentence and the abbreviation, I can compute the similarity between the sentence and each sense of that abbreviation (storing the sense vectors beforehand) and return the sense with the highest cosine similarity.
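A sketch of that inference step (the model path and sense list are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("output/abbrev-bi-encoder")  # placeholder path

# Pre-compute and cache the sense embeddings once.
senses = ["abortion", "blood group in ABO system"]
sense_embeddings = model.encode(senses, convert_to_tensor=True)

# At query time, only the input text needs to be embedded.
text = "patient does have a known history of having had a missed AB"
text_embedding = model.encode(text, convert_to_tensor=True)

scores = util.cos_sim(text_embedding, sense_embeddings)  # shape: (1, num_senses)
predicted_sense = senses[scores.argmax().item()]
```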

Which approach would be better in the above scenario? Thank you in advance!

ud2195 commented 2 years ago

Also, if I go ahead with MultipleNegativesRankingLoss, is it right to input the data in this format:

sent1      sent2
abortion   patient does have a known history of having had a missed AB
abortion   patient just had an AB
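In code, those rows would become InputExample pairs like this (a sketch; note that both anchors are the same sense, which is what makes the quoted documentation below relevant):

```python
from sentence_transformers import InputExample

# sent1 = sense (anchor), sent2 = full text (positive)
train_examples = [
    InputExample(texts=["abortion", "patient does have a known history of having had a missed AB"]),
    InputExample(texts=["abortion", "patient just had an AB"]),
]
```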

As per the documentation: "This loss expects as input a batch consisting of sentence pairs (a_1, p_1), (a_2, p_2), …, (a_n, p_n), where we assume that (a_i, p_i) is a positive pair and (a_i, p_j) for i != j is a negative pair."

So with my data above, (a_1, p_2) would be treated as a negative pair even though 'abortion' is also the correct sense for p_2, which would be wrong. Does that mean I should only add one pair per sense, and NOT multiple pairs with the same sense?
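For what it's worth, the training_nli_v2.py example linked above appears to face the same situation (one anchor can have several positives) and uses a NoDuplicatesDataLoader, which only builds batches in which no text appears twice; a minimal sketch with placeholder model name and hyperparameters:

```python
from sentence_transformers import SentenceTransformer, InputExample, losses, datasets

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder

train_examples = [
    InputExample(texts=["abortion", "patient does have a known history of having had a missed AB"]),
    InputExample(texts=["abortion", "patient just had an AB"]),
    # ... more (sense, text) pairs; repeated senses are allowed
]

# NoDuplicatesDataLoader skips examples whose texts already occur in the
# current batch, so two pairs anchored on "abortion" never land in the same
# batch and cannot create a false in-batch negative.
train_dataloader = datasets.NoDuplicatesDataLoader(train_examples, batch_size=32)
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```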