Closed skye95git closed 1 year ago
Hi,
If you want to train SimCSE on your own dataset, you can simply replace our training data with your own data in the same format. We already provide an example script in the README.
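As a rough illustration of "the same format", here is a minimal sketch for writing your own training files. It assumes the unsupervised file is plain text with one sentence per line, and that the supervised file is a CSV with the columns `sent0`, `sent1`, `hard_neg`; both layouts are assumptions based on the repo's example data, so check them against the README before training. The helper names are hypothetical.

```python
import csv
from pathlib import Path


def write_unsup_file(sentences, path):
    """Write one sentence per line (assumed unsupervised format)."""
    # Collapse internal whitespace/newlines so each sentence stays on one line,
    # and drop empty entries.
    cleaned = [" ".join(s.split()) for s in sentences if s.strip()]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return len(cleaned)


def write_sup_file(triples, path):
    """Write (sentence, positive, hard negative) rows as CSV (assumed supervised format)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Column names are an assumption; verify against the repo's example data.
        writer.writerow(["sent0", "sent1", "hard_neg"])
        writer.writerows(triples)
    return len(triples)
```

The resulting file path is what you would then pass as the training file in the example script.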
Hi, I guess you mean that we can prepare our data and use the shell script to train our own model. But I wonder how to use the installed module (pip install simcse) to train my own model.
Hi,
The pip package cannot be used to train your own model. To do that, you need to use this GitHub repo and follow the README.
Hi, can I train and evaluate SimCSE on my own datasets? I can train it on my dataset by setting "--train_file", but I don't know how to evaluate SimCSE on my test set. According to your source code, it seems that SimCSE can only be evaluated on some specific tasks.
Hi,
We use a modified version of SentEval for evaluation. To evaluate on your own file, you can modify the SentEval part of the code. If you want a HIT@N (retrieval-style) evaluation, you will have to implement your own evaluation protocol. This repo might be helpful for retrieval-style evaluation: https://github.com/castorini/pyserini
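For reference, a retrieval-style metric can be sketched in a few lines. This is not the repo's evaluation code; it is a minimal sketch that interprets HIT@N as the fraction of queries whose gold document appears among the N most cosine-similar corpus items. It assumes the query and corpus embeddings were already computed elsewhere (for example with your trained model) and are passed in as plain lists of floats. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Treat a zero vector as having no similarity to anything.
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hit_at_n(query_vecs, corpus_vecs, gold_indices, n=10):
    """Fraction of queries whose gold corpus index is ranked in the top n."""
    hits = 0
    for query, gold in zip(query_vecs, gold_indices):
        # Rank all corpus items by similarity to the query, best first.
        ranked = sorted(
            range(len(corpus_vecs)),
            key=lambda i: cosine(query, corpus_vecs[i]),
            reverse=True,
        )
        if gold in ranked[:n]:
            hits += 1
    return hits / len(query_vecs)
```

This brute-force ranking is fine for a small test set; for a large corpus an indexed search tool such as the one linked above would be the more practical choice.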
I'm working on a search task, and the pre-trained model I'm using is RoBERTa base. I would like to apply SimCSE on top of it. How can I use SimCSE on my own dataset?