TianduoWang / DiffAug

[EMNLP 2022] Differentiable Data Augmentation for Contrastive Sentence Representation Learning. https://arxiv.org/abs/2210.16536
MIT License

Question about the absence of unsupervised representation learning experiments #2

Open bigheiniu opened 1 year ago

bigheiniu commented 1 year ago

Hi Tianduo, I really appreciate your work on developing a learnable data augmentation for sentence representation learning. Your proposed method, DiffAug, has shown really strong performance in the semi-supervised and supervised settings.

However, I was wondering how DiffAug performs in the unsupervised setting.

TianduoWang commented 1 year ago

Hi Yichuan,

Thanks for your question!

In our preliminary experiments, we did try unsupervised learning objectives (e.g., masked language modeling), but the final performance was not satisfactory.

Regarding your question about whether it is possible to do contrastive learning twice (once for prefix-tuning, then again for joint tuning), I suggest reading this paper; the idea is quite relevant to yours. A rough sketch of such a two-stage schedule is below.
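
To make the two-stage idea concrete, here is a minimal, self-contained sketch. Everything in it is my own illustration rather than this repository's code: `ToyEncoder`, the additive `prefix` (a stand-in for real prefix-tuning, which would inject learned key/value prefixes into attention), and the NT-Xent-style in-batch loss are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for a sentence encoder operating on pre-computed features."""
    def __init__(self, dim=32):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, prefix=None):
        # The optional additive prefix plays the role of the learnable
        # augmentation; real prefix-tuning modifies attention instead.
        h = x if prefix is None else x + prefix
        return self.proj(h)

def nt_xent(z1, z2, t=0.05):
    """In-batch contrastive loss: (z1[i], z2[i]) are the positive pairs."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / t
    return F.cross_entropy(logits, torch.arange(z1.size(0)))

def run_stage(encoder, prefix, data, params, steps=100, lr=1e-2):
    opt = torch.optim.AdamW(params, lr=lr)
    for _ in range(steps):
        loss = nt_xent(encoder(data),                 # plain view
                       encoder(data, prefix=prefix))  # augmented view
        opt.zero_grad()
        loss.backward()
        opt.step()

dim = 32
encoder = ToyEncoder(dim)
prefix = nn.Parameter(0.01 * torch.randn(dim))
data = torch.randn(16, dim)

# Stage 1: contrastive prefix-tuning with the encoder frozen.
for p in encoder.parameters():
    p.requires_grad_(False)
run_stage(encoder, prefix, data, [prefix])

# Stage 2: joint contrastive tuning of encoder and prefix.
for p in encoder.parameters():
    p.requires_grad_(True)
run_stage(encoder, prefix, data, list(encoder.parameters()) + [prefix])
```

The key point is that the augmentation is differentiable, so the very same contrastive loss can update the prefix end to end in both stages.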

I believe it is interesting and worthwhile to explore whether we can train a data augmentation module (e.g., a prefix) with only unsupervised data. As we suggested in our paper, making positive pairs meaningfully different is a promising way to improve the performance of contrastive learning; the toy example below illustrates that point.
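
As a toy illustration of that last point (again my own sketch, not code from this repository): with a SimCSE-style dropout encoder, two forward passes over the same input give positive views that differ slightly, whereas a deterministic encoder gives identical, trivially easy positives.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A dropout-bearing encoder, in the spirit of SimCSE-style augmentation.
enc = nn.Sequential(nn.Linear(64, 64), nn.Dropout(p=0.1), nn.Linear(64, 64))
x = torch.randn(8, 64)

enc.train()   # dropout active: two stochastic views of the same input
v1, v2 = enc(x), enc(x)
print(F.cosine_similarity(v1, v2).mean())  # high but below 1: the views differ

enc.eval()    # dropout off: both views are identical
w1, w2 = enc(x), enc(x)
print(F.cosine_similarity(w1, w2).mean())  # ~1: degenerate positive pairs
```

A learned augmentation module, as in DiffAug, aims to push the positives further apart than dropout does while keeping them semantically equivalent.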