huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0
2.22k stars 220 forks

Hard Negative Mining vs random sampling #349

Open vahuja4 opened 1 year ago

vahuja4 commented 1 year ago

Has anyone tried doing hard negative mining when generating the sentence pairs as opposed to random sampling? @tomaarsen - is random sampling the default?

tomaarsen commented 1 year ago

Random sampling for the negative pairs is the default, yes. My understanding is that this is a relatively hard-to-beat baseline. @danielkorat has done some research on different sampling approaches, and I believe he found that some of the seemingly clever sampling approaches were beaten by simple random sampling. However, I think he also found that there are some improvements to be made over purely random sampling.

I don't recall exactly if he tried finding hard negatives, but perhaps he can elaborate himself a bit, if he finds the time.
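For anyone who wants to experiment, here is a minimal sketch of what hard negative mining could look like on top of pre-computed embeddings from the un-tuned encoder. This is plain NumPy, not SetFit's actual sampling API; the function name and the `(anchor, negative, label)` tuple format are illustrative assumptions:

```python
import numpy as np

def mine_hard_negative_pairs(embeddings, labels):
    """For each example, pair it with the most similar example from a
    *different* class (a hard negative) instead of a random one.

    embeddings: (n, d) array from the un-tuned sentence encoder.
    labels: length-n sequence of class labels.
    Returns a list of (anchor_idx, negative_idx, 0.0) tuples, where
    0.0 marks a negative pair (illustrative convention, not SetFit's).
    """
    labels = np.asarray(labels)
    # Cosine similarity matrix over L2-normalized embeddings
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    pairs = []
    for i in range(len(labels)):
        # Mask out same-class examples (and self), then take the most
        # similar remaining example as the hard negative
        candidate_sims = np.where(labels != labels[i], sims[i], -np.inf)
        j = int(np.argmax(candidate_sims))
        pairs.append((i, j, 0.0))
    return pairs
```

Swapping this in for random negative sampling would be an experiment, not a guaranteed win, given the results Daniel saw.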

adfindlater commented 1 year ago

I was wondering something similar. I have an n-class case where some of the classes will likely already be well separated in the un-tuned embedding space. It would be nice to bias sampling towards the pairs where I know a priori there is likely to be confusion in the downstream classification task.
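One cheap way to get such a bias, as a sketch: score each class pair by the cosine similarity of its class centroids in the un-tuned space, then sample more negative pairs from the high-similarity (i.e. likely-confused) class pairs. Again plain NumPy and not SetFit's API; the function name is made up:

```python
import numpy as np

def confusability_weights(embeddings, labels):
    """Weight each class pair by the cosine similarity of its class
    centroids in the un-tuned embedding space, so negative-pair sampling
    can be biased toward the classes most likely to be confused.

    Returns a dict mapping (class_a, class_b) -> similarity weight.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # One centroid per class, L2-normalized for cosine similarity
    cents = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    cents /= np.linalg.norm(cents, axis=1, keepdims=True)
    sims = cents @ cents.T
    weights = {}
    for a in range(len(classes)):
        for b in range(a + 1, len(classes)):
            # Higher centroid similarity -> draw more negatives from (a, b)
            weights[(classes[a], classes[b])] = float(sims[a, b])
    return weights
```

The weights could then feed a weighted choice over class pairs when drawing negatives, instead of the uniform choice random sampling implies.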