UKPLab / sentence-transformers

Multilingual Sentence & Image Embeddings with BERT
https://www.SBERT.net
Apache License 2.0

Annotation for Information Retrieval #1886

Open tin9580 opened 1 year ago

tin9580 commented 1 year ago

I am planning to use SBERT for asymmetric information retrieval on highly technical data. Is there a best practice for manually annotating the data for fine-tuning? By annotation I mean writing the query in each (query, document) pair used with Multiple Negatives Ranking Loss.

I guess the query should be as close to the document paragraph as possible but of course not a copy-paste of it.
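
To make the question concrete, here is a minimal sketch of what the annotated pairs and the loss look like. It is plain Python, not the library's implementation: the example pairs, the `token_overlap` helper and the `mnr_loss` function are hypothetical names for illustration, and the scale of 20 is assumed to match the library default. In sentence-transformers itself, each pair would typically become `InputExample(texts=[query, document])` and be trained with `losses.MultipleNegativesRankingLoss(model)`.

```python
import math

# Hypothetical hand-annotated (query, document) pairs; the texts are made up.
pairs = [
    ("What torque do the M8 flange bolts need?",
     "Flange bolts of size M8 must be tightened to 25 Nm with a calibrated wrench."),
    ("How often should the coolant filter be replaced?",
     "Replace the coolant filter after every 500 operating hours."),
]


def token_overlap(query, document):
    """Fraction of distinct query tokens that also occur in the document (0..1).

    A value close to 1 suggests the query is nearly a copy-paste of the paragraph.
    """
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0


def mnr_loss(scores, scale=20.0):
    """Multiple negatives ranking loss for a square similarity matrix.

    scores[i][j] is the similarity between query i and document j. The diagonal
    holds the annotated positives; every other document in the batch is treated
    as a negative for that query. scale=20.0 is an assumed default.
    """
    total = 0.0
    for i, row in enumerate(scores):
        logits = [scale * s for s in row]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[i]  # cross-entropy with target index i
    return total / len(scores)


# Flag queries that share most of their tokens with their own paragraph.
for query, document in pairs:
    print(round(token_overlap(query, document), 2), query)

# Well-separated batch: each query matches only its own document.
print(mnr_loss([[1.0, 0.0], [0.0, 1.0]]))

# Same document annotated twice in one batch: the "negative" scores as high
# as the positive, so the loss cannot drop below log(2).
print(mnr_loss([[1.0, 1.0], [1.0, 1.0]]))
```

One consequence for annotation: since the other documents in a batch serve as negatives, two queries written for the same paragraph act as false negatives for each other if they land in the same batch. The library's `NoDuplicatesDataLoader` is meant to avoid that case.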

molo6379 commented 5 months ago

@nreimers any update on this issue? I would also appreciate a good example of training with Asym.py, e.g. which loss and which evaluator to use.