izhx / paper-reading


SimCSE: Simple Contrastive Learning of Sentence Embeddings #12

Open guopeiming opened 3 years ago

guopeiming commented 3 years ago



1 New things learned:

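  1. Two metrics for evaluating representations learned with contrastive learning: alignment and uniformity. Alignment measures how close the embeddings of positive pairs are; uniformity measures how far apart negatives are, i.e. how evenly the embeddings spread over the space. The two work well as tools for experimental analysis.
  2. Using dropout as data augmentation is simple and effective; see above for details.
  3. BERT's representations suffer from a collapse problem, known in English as representation degeneration or anisotropy. Most embeddings cluster in a narrow region of the vector space instead of spreading across the broad semantic space, which greatly limits their expressiveness.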
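To make the two metrics in item 1 concrete, here is a minimal NumPy sketch. It assumes the usual definitions on L2-normalized embeddings: alignment as the mean squared distance between positive pairs, and uniformity as the log of the mean Gaussian potential over all distinct pairs. The function names and the defaults `alpha=2` and `t=2` are choices made for this illustration, not code from the paper's repository.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so distances live on the hypersphere."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def alignment(x, y, alpha=2):
    """Mean distance**alpha between positive pairs (row i of x pairs with row i of y).

    Lower is better: positives sit close together.
    """
    x, y = l2_normalize(x), l2_normalize(y)
    return float(np.mean(np.linalg.norm(x - y, axis=1) ** alpha))

def uniformity(x, t=2):
    """Log of the mean Gaussian potential over all distinct pairs of rows.

    Lower is better: embeddings are spread more evenly.
    """
    x = l2_normalize(x)
    # Squared Euclidean distance between every pair of rows.
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(len(x), k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dist[upper]))))
```

A fully collapsed set of embeddings gives a uniformity of 0, the worst possible value, while well-spread embeddings give a negative value.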

2 What I learned from Related Work

There is in fact already some prior work on solving BERT's representation collapse problem. 1 2 3
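The collapse problem can be made concrete with a simple proxy: the average cosine similarity between pairs of embeddings. This is only a sketch of one common way to quantify anisotropy, not the measure used by any of the works cited above. The synthetic data and the shared offset of `10.0` are hypothetical, chosen to mimic vectors crowded into a narrow cone.

```python
import numpy as np

def mean_pairwise_cosine(emb):
    """Average cosine similarity over all distinct pairs of rows.

    Near 0 suggests directions are spread out; near 1 suggests a narrow cone.
    """
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T
    upper = np.triu_indices(len(emb), k=1)  # each unordered pair once
    return float(sim[upper].mean())

rng = np.random.default_rng(0)
isotropic = rng.normal(size=(200, 64))  # directions spread over the whole space
narrow_cone = isotropic + 10.0          # hypothetical shared offset dominates every vector

print(mean_pairwise_cosine(isotropic))
print(mean_pairwise_cosine(narrow_cone))
```

The second value comes out far higher than the first, which is the signature of the degenerate geometry described in the notes.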

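3 Experimental validation tasks; briefly describe any that are unfamiliar

  1. Besides the usual ablation studies, the visualization of alignment and uniformity introduced in Section 1 is quite nice.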

4 Other tasks, within your knowledge, where this could be tried


5 Good sentences

  1. We take the checkpoints of these models every 10 steps during training and visualize the alignment and uniformity metrics in Figure 2, along with a simple data augmentation model “delete one word”. As is clearly shown, all models largely improve the uniformity. However, the alignment of the two special variants also degrades drastically, while our unsupervised SimCSE keeps a steady alignment, thanks to the use of dropout noise. On the other hand, although “delete one word” slightly improves the alignment, it has a smaller gain on the uniformity, and eventually underperforms unsupervised SimCSE.
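The "dropout noise" mentioned in the quote can be sketched as follows: the same batch is encoded twice with independent dropout masks, and the two outputs for each sentence form a positive pair in an in-batch contrastive loss. This is a toy NumPy illustration, not the authors' implementation. The linear `encode` function is a hypothetical stand-in for BERT, and the dropout rate `p=0.1` and `temperature=0.05` are assumed values.

```python
import numpy as np

def encode(x, w, rng, p=0.1):
    """Toy encoder (hypothetical stand-in for BERT): inverted dropout, then a linear map."""
    mask = rng.random(x.shape) >= p          # fresh random mask on every call
    return (x * mask / (1.0 - p)) @ w

def info_nce(z1, z2, temperature=0.05):
    """In-batch contrastive loss: row i of z1 should match row i of z2."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature       # cosine similarities, scaled
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 32))   # hypothetical batch of 8 "sentence" feature vectors
w = rng.normal(size=(32, 16))  # hypothetical encoder weights

z1 = encode(x, w, rng)         # first pass, one dropout mask
z2 = encode(x, w, rng)         # second pass, a different dropout mask
loss = info_nce(z1, z2)
```

Because the two passes differ only in their dropout masks, the positives stay semantically identical, which matches the quote's observation that dropout noise keeps alignment steady while uniformity improves.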