We take the checkpoints of these models every 10 steps during training and visualize the alignment and uniformity metrics in Figure 2, along with a simple data augmentation model “delete one word”. As is clearly shown, all models largely improve the uniformity. However, the alignment of the two special variants also degrades drastically, while our unsupervised SimCSE keeps a steady alignment, thanks to the use of dropout noise. On the other hand, although “delete one word” slightly improves the alignment, it has a smaller gain on the uniformity, and eventually underperforms unsupervised SimCSE.
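Since this passage leans on the alignment and uniformity metrics (from Wang and Isola, 2020, which SimCSE adopts), here is a minimal sketch of how the two can be computed over a batch of embeddings; the function names and batch-based estimation are my own framing, not the paper's code. Lower is better for both: alignment measures how close positive pairs stay, uniformity measures how evenly embeddings spread on the hypersphere.

```python
import torch

def alignment(x, y, alpha=2):
    # x, y: L2-normalized embeddings of positive pairs, shape (N, d).
    # Mean (squared, for alpha=2) distance between each positive pair.
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x, t=2):
    # x: L2-normalized embeddings, shape (N, d).
    # Log of the mean Gaussian potential over all distinct pairs.
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```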
The paper applies contrastive learning to sentence representation. It uses dropout as the data augmentation for constructing positive pairs: the same sentence is fed into the model twice, and because dropout is random, the two resulting representations differ while the actual semantics of the sentence stay unchanged.
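A minimal sketch of this training objective in PyTorch, assuming a Hugging Face `bert-base-uncased` encoder, [CLS] pooling, and in-batch negatives with a cross-entropy (InfoNCE-style) loss; the function name and the temperature value are illustrative, not taken from the paper's released code.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.train()  # keep dropout active: dropout itself is the "augmentation"

def simcse_loss(sentences, temperature=0.05):
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    # Encode the same batch twice; different dropout masks give two views.
    z1 = encoder(**batch).last_hidden_state[:, 0]  # [CLS] of pass 1
    z2 = encoder(**batch).last_hidden_state[:, 0]  # [CLS] of pass 2
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    # Cosine similarities: (i, i) are positives, (i, j != i) negatives.
    sim = z1 @ z2.t() / temperature
    labels = torch.arange(sim.size(0))
    return F.cross_entropy(sim, labels)
```

Calling `simcse_loss(batch_of_sentences)` and backpropagating is one training step; the key design choice is that no explicit text perturbation is needed, since two forward passes with independent dropout masks already produce semantically identical but numerically different views.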
Information
1 New things learned:
2 Knowledge gained from the Related Work section
In fact, some prior work already exists on fixing the collapse of BERT's embedding space. 1 2 3
3 Evaluation tasks; briefly describe any that are unfamiliar
4 Within your knowledge, which other tasks could be tried
Current contrastive learning constructs positive and negative pairs at the sample level, i.e., a sample and its perturbed version form a positive pair. For cross-lingual or cross-domain tasks, one could instead construct positives at the domain or language level and see whether domain- or language-related features can be learned. (An immature idea I have not thought through carefully.)
5 Good sentences