UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Can this be called bi-encoder? Train two models, one for question, one for paragraph #1904

Open PhilipMay opened 1 year ago

PhilipMay commented 1 year ago

Hi,

you provide an example of how to train a bi-encoder:

https://github.com/UKPLab/sentence-transformers/blob/267c5f96bfc07eeba9aaf4394233200558c08f95/examples/unsupervised_learning/query_generation/2_programming_train_bi-encoder.py#L10

Can this be called a bi-encoder? I thought a bi-encoder is when you use two models: one for question encoding and one for paragraph encoding. Do you have a training example where you train two models like this?

HenryL27 commented 1 year ago

A bi-encoder is typically a single model that maps both queries and passages (or single sentences, or whatever) into a shared vector space, where "sentence similarity" can be expressed by a concrete distance measure such as cosine similarity or dot product. In theory you could train two models at once, but I don't really see the benefit: ultimately we just want a function that sends sentences to this vector space, and having two seems redundant. (Yes, they might be more specialized since they were trained specifically on questions or on paragraphs, but remember that questions and paragraphs come in many, many forms, so specialization may not be what you want.) Anyway, take all this with a massive grain of salt; I'm relatively new to the space.
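
For what it's worth, here is a minimal sketch of what training a single-model bi-encoder usually looks like with this library, using `MultipleNegativesRankingLoss` on (query, positive paragraph) pairs. The model name, data, and hyperparameters below are just placeholders, not a recommendation:

```python
# Minimal sketch: one SentenceTransformer encodes both queries and paragraphs
# into the same vector space. Model name, data, and hyperparameters are placeholders.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("distilbert-base-uncased")

# (query, positive paragraph) pairs; other in-batch paragraphs act as negatives.
train_examples = [
    InputExample(texts=["What is a bi-encoder?",
                        "A bi-encoder maps texts independently into one vector space."]),
    InputExample(texts=["How do I install the library?",
                        "Install it with pip install sentence-transformers."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```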

To train two models (one for questions, one for paragraphs), I think you'd need to write your own loss function or training loop; see the rough sketch below. Again, I'm not the library author so I can't guarantee that, but I don't see anything that does this out of the box.
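
To make that concrete, here is a rough, untested sketch of what a two-tower setup could look like with plain PyTorch and Hugging Face transformers (i.e. not using this library's `fit` API): two separate encoders, mean pooling, and in-batch negatives via cross-entropy over the similarity matrix. The model names, pooling choice, and toy data are all assumptions on my part, not anything sentence-transformers provides:

```python
# Hypothetical two-tower dual encoder: separate question and paragraph models,
# trained with in-batch negatives. Everything here is a placeholder sketch.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

q_tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
p_tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
q_encoder = AutoModel.from_pretrained("distilbert-base-uncased").to(device)
p_encoder = AutoModel.from_pretrained("distilbert-base-uncased").to(device)


def mean_pool(last_hidden_state, attention_mask):
    # Average token embeddings, ignoring padding tokens.
    mask = attention_mask.unsqueeze(-1).float()
    return (last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)


def embed(encoder, tokenizer, texts):
    batch = tokenizer(list(texts), padding=True, truncation=True,
                      return_tensors="pt").to(device)
    out = encoder(**batch)
    return mean_pool(out.last_hidden_state, batch["attention_mask"])


# Toy (question, positive paragraph) pairs; replace with your own data.
pairs = [
    ("What is a bi-encoder?", "A bi-encoder maps texts independently into one vector space."),
    ("How do I install the library?", "Install it with pip install sentence-transformers."),
]
loader = DataLoader(pairs, batch_size=2, shuffle=True)

optimizer = torch.optim.AdamW(
    list(q_encoder.parameters()) + list(p_encoder.parameters()), lr=2e-5)

for epoch in range(1):
    for questions, paragraphs in loader:
        q_emb = embed(q_encoder, q_tokenizer, questions)
        p_emb = embed(p_encoder, p_tokenizer, paragraphs)
        # In-batch negatives: each question should score highest with its own paragraph.
        scores = q_emb @ p_emb.T
        labels = torch.arange(scores.size(0), device=device)
        loss = F.cross_entropy(scores, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

At inference time you'd encode queries with the question tower and documents with the paragraph tower, then compare with dot product, same as with a single shared encoder.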