UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Problem about domain transfer with Augmented SBERT #670

Open svjack opened 3 years ago

svjack commented 3 years ago

I reviewed the code in train_sts_qqp_crossdomain.py. The idea is to exploit the cross-encoder's higher performance. For unlabeled data, is it feasible to combine several sampling methods with the cross-encoder's predictions and then combine those with Snorkel (https://github.com/snorkel-team/snorkel)? Do you have any advice on designing labeling functions for this labeling problem?
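To make the question concrete, here is a rough sketch of what such labeling functions might look like. Everything in it is an assumption for illustration: `cross_encoder_score` is a hypothetical stand-in (plain token overlap) for a trained cross-encoder's prediction, the thresholds are arbitrary, and a simple majority vote is swapped in where Snorkel's label model would normally combine the votes. It uses neither sentence-transformers nor Snorkel, so it runs as-is.

```python
from typing import Callable, List, Tuple

# Label convention common in weak supervision: -1 means "abstain".
ABSTAIN, NOT_DUPLICATE, DUPLICATE = -1, 0, 1

Pair = Tuple[str, str]


def token_overlap(pair: Pair) -> float:
    """Jaccard overlap of lowercased whitespace tokens."""
    a, b = (set(s.lower().split()) for s in pair)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cross_encoder_score(pair: Pair) -> float:
    """HYPOTHETICAL stand-in for a cross-encoder score in [0, 1].

    Replace with real model predictions; token overlap is used here
    only so that the sketch runs without any model.
    """
    return token_overlap(pair)


def lf_cross_encoder(pair: Pair, low: float = 0.2, high: float = 0.6) -> int:
    """Vote only when the score is confident; abstain in between.

    The thresholds are arbitrary assumptions, not tuned values.
    """
    score = cross_encoder_score(pair)
    if score >= high:
        return DUPLICATE
    if score <= low:
        return NOT_DUPLICATE
    return ABSTAIN


def lf_exact_match(pair: Pair) -> int:
    """Vote DUPLICATE for case-insensitive exact matches, else abstain."""
    a, b = (s.strip().lower() for s in pair)
    return DUPLICATE if a == b else ABSTAIN


def lf_length_ratio(pair: Pair) -> int:
    """Vote NOT_DUPLICATE when one text is over 3x longer, else abstain."""
    la, lb = (len(s.split()) for s in pair)
    if min(la, lb) == 0:
        return ABSTAIN
    return NOT_DUPLICATE if max(la, lb) / min(la, lb) > 3 else ABSTAIN


def majority_vote(votes: List[int]) -> int:
    """Combine votes, ignoring abstentions; ties and all-abstain give ABSTAIN."""
    pos, neg = votes.count(DUPLICATE), votes.count(NOT_DUPLICATE)
    if pos == neg:
        return ABSTAIN
    return DUPLICATE if pos > neg else NOT_DUPLICATE


def weak_label(pairs: List[Pair], lfs: List[Callable[[Pair], int]]) -> List[int]:
    """Apply every labeling function to every pair and combine the votes."""
    return [majority_vote([lf(p) for lf in lfs]) for p in pairs]


LFS = [lf_cross_encoder, lf_exact_match, lf_length_ratio]

pairs = [
    ("How do I learn Python?", "how do i learn python?"),
    ("How do I learn Python?", "What is the capital of France?"),
    ("How do I learn Python?", "How do I cook rice?"),
]
labels = weak_label(pairs, LFS)
print(labels)
```

Pairs that end up as `ABSTAIN` could be dropped or sent back for further sampling. In an actual Snorkel setup the same functions would presumably be wrapped as Snorkel labeling functions, with its label model replacing `majority_vote`.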

nreimers commented 3 years ago

I have never used Snorkel, so I cannot help there.