huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0
2.13k stars 217 forks

Using Setfit for similarity classification #91

Open castafra opened 1 year ago

castafra commented 1 year ago

Hello, I would like to test this promising framework on a similarity classification task. Basically, I have a dataset with 3 columns: (sentence1, sentence2, label). From what I understand, it is currently only possible to train on a single-sentence classification problem. Is there a workaround to use SetFit for a sentence-pair classification problem? If not, would it be possible to add this feature in a future release?

Thank you in advance

lewtun commented 1 year ago

Hi @castafra thanks for your interest in our work! I know @orenpereg ran some experiments on 2-sentence tasks like GLUE, but it wasn't entirely clear how one should create the text/label triples one needs for the contrastive learning step.

Perhaps he can share some more details on what worked / didn't work :)

castafra commented 1 year ago

Hi @orenpereg, could you share any feedback on your experiments? Thanks

orenpereg commented 1 year ago

Hi @castafra, thanks for your question. Currently SetFit does not support two-sentence similarity classification tasks, and it's not trivial to apply SetFit to them: it would probably require changes within the Sentence Transformer itself. We do have plans to extend SetFit, and one of the options is to examine similarity classification tasks. Having said that, I did play with a workaround: I simply concatenated sentence1 and sentence2 and fed pairs of these concatenated sentences to the contrastive training process. The results weren't very good. SetFit was only on par with the baseline model (a 'standard' BERT cross-encoder) when using few-shot data (8, 16, 32 samples). The bottom line is that this workaround is not recommended.
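For anyone who wants to try the concatenation workaround anyway, it might look roughly like this. This is just a sketch: the `[SEP]` separator is an arbitrary choice (not an official SetFit convention), and the `datasets`/`SetFitTrainer` usage in the comments is how you would typically wire it up, not code from the SetFit docs.

```python
# Sketch of the concatenation workaround: merge each sentence pair into a
# single text so it fits SetFit's single-sentence pipeline.

SEP = " [SEP] "  # hypothetical separator; any distinctive marker would do


def concat_pairs(examples):
    """Turn batched (sentence1, sentence2, label) rows into (text, label) rows."""
    return {
        "text": [
            s1 + SEP + s2
            for s1, s2 in zip(examples["sentence1"], examples["sentence2"])
        ],
        "label": examples["label"],
    }


# With a Hugging Face `datasets.Dataset` this would be applied as:
#   train_ds = train_ds.map(
#       concat_pairs, batched=True, remove_columns=["sentence1", "sentence2"]
#   )
# and the result passed to SetFit as a normal single-text dataset.
```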

jpzhangvincent commented 1 year ago

I also work on the similarity classification task. I think it would be great to extend the framework for that task. It seems SetFit uses the underlying sentence_transformers library, right? FWIW, this example shows how to use sentence_transformers for similarity learning: https://www.sbert.net/examples/training/sts/README.html. I'm wondering whether we just need to implement a scikit-learn classifier that takes the two fine-tuned embeddings as inputs to predict the label.

Other references:

  1. https://keras.io/examples/vision/metric_learning_tf_similarity/
  2. https://keras.io/examples/vision/siamese_network/
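A minimal sketch of that idea: embed each sentence separately, combine the pair of embeddings into one feature vector, and fit a scikit-learn classifier on top. The `[u; v; |u−v|; u·v]` combination below is the one used in the original Sentence-BERT paper's classification objective; the encoder shown in the comment and the random stand-in embeddings are assumptions to keep the sketch self-contained — this is not an API SetFit provides today.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def pair_features(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Combine two sentence embeddings into one feature vector:
    [u; v; |u - v|; u * v], as in the Sentence-BERT classification head."""
    return np.concatenate([u, v, np.abs(u - v), u * v], axis=-1)


# In practice the embeddings would come from a (fine-tuned) encoder, e.g.:
#   from sentence_transformers import SentenceTransformer
#   encoder = SentenceTransformer("all-MiniLM-L6-v2")
#   emb1 = encoder.encode(sentences1)
#   emb2 = encoder.encode(sentences2)
# Random vectors stand in here so the sketch runs without model downloads.
rng = np.random.default_rng(0)
emb1 = rng.normal(size=(8, 16))   # 8 pairs of 16-dim embeddings
emb2 = rng.normal(size=(8, 16))
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])

X = pair_features(emb1, emb2)     # shape (8, 64)
clf = LogisticRegression().fit(X, labels)
preds = clf.predict(X)
```

Whether this beats the concatenation workaround above is an open question; it at least keeps the bi-encoder structure intact instead of forcing pairs through a single-text pipeline.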
rjurney commented 3 months ago

> Hi @castafra, Thanks for your question. Currently Setfit does not support two sentences similarity classification tasks and it's not trivial to apply SetFit for that. It will probably require changes within the Sentence Transformer itself. We do have plans to extend SetFit and one of the options is to examine similarity classification task. Having said that, i did play a bit with it, as a workaround, I simply concatenated sentence1 and sentence2 and input pairs of 2 concatenated sentences to the contrastive training process. The results weren't so good. SetFit was on-par with the baseline model (a 'standard' BERT cross encoder) when using few-shot data (8,16,32 samples). The bottom line is that this workaround is not recommended.

This is a bummer :( I was so excited that I could few-shot improve postal address comparisons... :D