kongds / Prompt-BERT

PromptBERT: Improving BERT Sentence Embeddings with Prompts

What if I augment the positive examples in PromptBERT? #27

Closed leoozy closed 1 year ago

leoozy commented 1 year ago

I tried to augment the positive examples using back-translation with a different prompt (the prompt used for unsupervised RoBERTa). Specifically, I generated two views of each positive example via back-translation and fed them into PromptBERT, but the average Spearman score for roberta-base is only 75 (vs. ~79.2 in your paper). I also tried using back-translation alone and got an average Spearman score of 77. I am confused about why the prompt does not help with augmented positive data. Do you have any ideas?
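For reference, this is roughly how I build the two back-translated views (a minimal sketch, assuming the public Helsinki-NLP MarianMT checkpoints; the model names and decoding settings are illustrative, not my exact script):

```python
# Back-translation sketch: English -> German -> English with MarianMT.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_de_tok, en_de = load("Helsinki-NLP/opus-mt-en-de")
de_en_tok, de_en = load("Helsinki-NLP/opus-mt-de-en")

def translate(sentences, tok, model):
    batch = tok(sentences, return_tensors="pt", padding=True, truncation=True)
    out = model.generate(**batch, num_beams=4, max_length=128)
    return tok.batch_decode(out, skip_special_tokens=True)

def back_translate(sentences):
    # Round trip: en -> de -> en.
    return translate(translate(sentences, en_de_tok, en_de), de_en_tok, de_en)

# Each sentence and its back-translation form the two positive views,
# which I then wrap in two different prompts for contrastive training.
src = ["A man is playing a guitar."]
pairs = list(zip(src, back_translate(src)))
```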

I also found that in the supervised setting, using different prompts hurts the performance. Does this mean that your method only works for positive pairs with the same length? Thank you!

kongds commented 1 year ago

Hello

  1. Maybe back-translation does not work well for sentence embeddings with contrastive learning. You could also try back-translation with SimCSE to see whether it works there. For example, ConSERT uses many data augmentation methods to produce positive examples, but it still underperforms SimCSE, although ConSERT does mention back-translation (a sketch of the shared contrastive objective follows after this list):

     (screenshot from the ConSERT paper)
  2. I don't think the reason is the different lengths. In the supervised setting, we can directly use different sentences as the positive pair, which works better than positive pairs built from different prompts.
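
To make point 1 concrete: whatever produces the positive view (back-translation, a second prompt, or SimCSE's dropout), it is plugged into the same in-batch contrastive objective. A minimal PyTorch sketch of that loss (assuming cosine similarity and SimCSE's 0.05 temperature):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.05):
    """In-batch contrastive (InfoNCE) loss between two views.
    z1, z2: (batch, dim) embeddings of the two views of each sentence;
    z2[i] is the positive for z1[i], all other rows serve as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature                      # (batch, batch) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(sim, labels)

# e.g. z1 = embeddings of the sentences under one prompt,
#      z2 = embeddings of the back-translated / dropout / second-prompt view
```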

leoozy commented 1 year ago

Thanks for your rapid reply. Have you tried using different prompts in the supervised setting? I can reproduce the result reported in your code (about 82.5), but if I use different prompts, the performance drops to about 79. I am really confused about it.

kongds commented 1 year ago

I have tried it with different prompts; the performance is 82.03. You can get this result by adding the following code to run.sh and running `bash run.sh sup-roberta-dp`:

"sup-roberta-dp")
    BC=(python -m torch.distributed.launch --nproc_per_node 4 train.py)
    TRAIN_FILE=data/nli_for_simcse.csv
    BATCH=128
    EPOCH=3
    LR=5e-5
    MODEL=roberta-base
    TEMPLATE="*cls*_This_sentence_:_'_*sent_0*_'_means*mask*.*sep+*"
    TEMPLATE2="*cls*_The_sentence_:_'_*sent_0*_'_means*mask*.*sep+*"
    args=(--mask_embedding_sentence\
          --mask_embedding_sentence_template $TEMPLATE\
          --mask_embedding_sentence_different_template $TEMPLATE2\
          --mask_embedding_sentence_delta)
    eargs=(--mask_embedding_sentence_use_pooler\
           --mask_embedding_sentence_delta \
           --mask_embedding_sentence \
           --mask_embedding_sentence_template $TEMPLATE )
;;
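
For reference, in the template syntax `_` encodes a space, `*sent_0*` the input sentence, `*mask*` the mask token, and `*cls*`/`*sep+*` the special tokens added by the tokenizer. A simplified sketch of how a template expands into the prompted text (illustrative only, not the exact parsing code in train.py):

```python
def render(template: str, sentence: str, mask_token: str = "<mask>") -> str:
    # Expand a PromptBERT-style template string into plain prompted text.
    # *cls* / *sep+* are dropped here because the tokenizer adds those
    # special tokens itself when the text is encoded.
    return (template
            .replace("*sent_0*", "\x00")      # protect the sentence slot
            .replace("*cls*", "")
            .replace("*sep+*", "")
            .replace("*mask*", mask_token)
            .replace("_", " ")                # '_' stands for a space
            .replace("\x00", sentence)
            .strip())

print(render("*cls*_This_sentence_:_'_*sent_0*_'_means*mask*.*sep+*",
             "A man is playing a guitar."))
# -> This sentence : ' A man is playing a guitar. ' means<mask>.
```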