princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

Unmatched description in the code and in the paper #150

Closed CSerxy closed 2 years ago

CSerxy commented 2 years ago

Hi,

I found that in your evaluation code (i.e., evaluation.py), the score is computed by the following line (Line 156):

`scores.append("%.2f" % (results[task]['dev']['spearman'][0] * 100))`

which means it reports the Spearman correlation on the dev sets of STS-B and SICK-R.
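For context, here is a minimal sketch of what that line does. The shape of the `results` dict and the numbers are illustrative assumptions, not real SentEval output:

```python
# Assumed shape of the results dict: each task maps to a 'dev' entry
# holding a (spearman_rho, p_value) tuple (illustrative, not real output).
results = {
    "STSBenchmark": {"dev": {"spearman": (0.8123, 1e-5)}},
    "SICKRelatedness": {"dev": {"spearman": (0.7991, 2e-5)}},
}

scores = []
for task in ["STSBenchmark", "SICKRelatedness"]:
    # Report the dev-set Spearman correlation, scaled to 0-100
    scores.append("%.2f" % (results[task]["dev"]["spearman"][0] * 100))

print(scores)  # ['81.23', '79.91']
```

So only the dev split's Spearman rho is kept; the p-value in position 1 is discarded.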

However, in Table 5 of your paper, you say the results are Spearman’s correlation in the “all” setting. So which one should I refer to?

The results returned from SentEval contain "ALL", "ALL (weighted average)", and "ALL (average)". I suppose the "all" setting you mentioned is "ALL", rather than "ALL (weighted average)" or "ALL (average)". Is that correct?

Many thanks!

gaotianyu1350 commented 2 years ago

Thanks for your interest in our work!

For your question: the “all” setting is in contrast to the “wmean” setting. Our code is a modified version of SentEval, so the key we use here is different; if you use the original SentEval, it should be “all”. And your understanding of the SentEval keys is correct.
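To make the distinction concrete, here is a small pure-Python sketch (no SentEval dependency; the subset names and scores are made up) of how an "all"-style score concatenates every subset's gold/predicted pairs before computing a single Spearman correlation, whereas a weighted-mean ("wmean") style averages per-subset correlations weighted by subset size:

```python
# Toy gold/predicted similarity scores for two made-up STS subsets
# (the data and subset names are illustrative assumptions).
subsets = {
    "headlines": ([1, 2, 3, 4], [1.1, 2.0, 2.9, 4.2]),
    "images": ([2, 1, 4, 3, 5], [2.2, 0.9, 4.1, 3.0, 5.5]),
}

def _ranks(v):
    """Average ranks (ties share the mean of their rank positions)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(v):
        j = i
        while j + 1 < len(v) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# "all"-style: concatenate every subset's pairs, then compute ONE correlation
gold = [g for gs, _ in subsets.values() for g in gs]
pred = [p for _, ps in subsets.values() for p in ps]
all_score = spearman(gold, pred)

# "wmean"-style: per-subset correlations, weighted by subset size
n_total = sum(len(gs) for gs, _ in subsets.values())
wmean = sum(len(gs) * spearman(gs, ps) for gs, ps in subsets.values()) / n_total

print("all  :", round(all_score, 4))  # below 1.0 once the subsets are pooled
print("wmean:", round(wmean, 4))      # 1.0: each subset is perfectly rank-correlated on its own
```

The two aggregations can disagree, as here: each subset is perfectly rank-correlated in isolation (wmean = 1.0), but pooling the pairs introduces ties and cross-subset comparisons, so the concatenated score dips below 1.0. This is why it matters which setting a reported number uses.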