princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

code for other data augmentations #219

Closed (ShaobinChen-AH closed this issue 1 year ago)

ShaobinChen-AH commented 1 year ago

Hi! I would like to reproduce all of the results. However, the code for the other data augmentations (e.g., crop, deletion) doesn't seem to be uploaded, right?

gaotianyu1350 commented 1 year ago

Hi,

Thanks for your interest in our work! We don't have them in the public repo, but those augmentations should be easy to implement. You can preprocess the data offline (pair each original sentence with its augmented version) and then feed the resulting parallel data to our repo.
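
For anyone landing here later, the preprocessing step described above could look roughly like the sketch below. It is not the authors' code. The helper names (`delete_words`, `crop`, `write_pairs`), the whitespace tokenization, the default rates, and the `sent0`/`sent1` column names are all assumptions made for illustration; check the repo's training script for the pair-file format it actually expects.

```python
import csv
import random


def delete_words(sentence, rate=0.1, rng=random):
    """Drop each whitespace-separated token independently with probability `rate`."""
    tokens = sentence.split()
    kept = [t for t in tokens if rng.random() >= rate]
    # Avoid returning an empty sentence: fall back to one random token.
    if not kept and tokens:
        kept = [rng.choice(tokens)]
    return " ".join(kept)


def crop(sentence, keep=0.9, rng=random):
    """Keep one random contiguous span covering roughly `keep` of the tokens."""
    tokens = sentence.split()
    if not tokens:
        return sentence
    span = max(1, round(len(tokens) * keep))
    start = rng.randint(0, len(tokens) - span)
    return " ".join(tokens[start:start + span])


def write_pairs(sentences, path, augment, seed=0):
    """Write (original, augmented) sentence pairs to a two-column CSV file.

    The header names are an assumption, not something confirmed in this thread.
    """
    rng = random.Random(seed)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["sent0", "sent1"])
        for s in sentences:
            writer.writerow([s, augment(s, rng=rng)])
```

Usage would be something like `write_pairs(sentences, "pairs_crop.csv", crop)`, followed by pointing the training script at the resulting file. The rates are placeholders, so set them to whatever configuration you are trying to reproduce.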