princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

CLTrainer evaluate, char-level input #175

Closed bujiahao closed 2 years ago

bujiahao commented 2 years ago

In SimCSE/simcse/trainers.py, line 106 reads `sentences = [' '.join(s) for s in batch]`. During evaluation, this code splits a word (e.g. "eye") into characters (e.g. "e", "y", "e"). Why?

gaotianyu1350 commented 2 years ago

Hi,

This is the SentEval format. Here each `s` in `batch` is a list of words, e.g., ["I", "have", "a", "pen", "."]. This operation joins them back into a single sentence string, "I have a pen .".
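The difference can be shown with a minimal sketch. This is not the actual trainer code: the batch contents and the names `raw_batch`, `char_level`, and `tokenized` are made up for illustration. It assumes the char-level output in the question comes from passing plain strings where SentEval-style token lists are expected.

```python
# Hypothetical SentEval-style batch: each item is a list of word tokens.
batch = [["I", "have", "a", "pen", "."], ["eye"]]

# Same expression as in the issue: join the tokens of each sentence with spaces.
sentences = [" ".join(s) for s in batch]
print(sentences)  # ['I have a pen .', 'eye']

# If an item is a plain string rather than a token list, str.join iterates
# over its characters, which yields the char-level result from the question.
raw_batch = ["eye"]
char_level = [" ".join(s) for s in raw_batch]
print(char_level)  # ['e y e']

# Splitting raw strings into tokens first restores the expected behavior.
tokenized = [s.split() for s in ["I have a pen ."]]
print([" ".join(s) for s in tokenized])  # ['I have a pen .']
```

So if words come out as separate characters, it is worth checking whether the inputs were already strings instead of token lists.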

bujiahao commented 2 years ago

Thanks for the reply!
