Wang-Yufei opened this issue 2 years ago (status: Open)
Thanks, I need to update the links.
The datasets can be found here: https://huggingface.co/datasets/sentence-transformers/embedding-training-data
They are now in JSONL format rather than .tsv, so you might need to update that script to read JSONL.
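One way to bridge the gap without touching the training script is to convert each JSONL file back into a tab-separated file. The sketch below is a minimal, unofficial example using only the standard library. It assumes that each line of the JSONL file is a JSON array of strings (a pair or a triplet); lines with any other shape, such as JSON objects, are skipped, so check the files you download before relying on it. The function names `open_text`, `clean`, and `jsonl_to_tsv` are hypothetical helpers, not part of sentence-transformers.

```python
import gzip
import json


def open_text(path, mode):
    """Open a plain or gzip-compressed text file, chosen by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t", encoding="utf8")
    return open(path, mode, encoding="utf8")


def clean(text):
    """Collapse tabs, newlines, and repeated spaces so the TSV layout stays intact."""
    return " ".join(str(text).split())


def jsonl_to_tsv(jsonl_path, tsv_path):
    """Write one tab-separated row per JSON line and return the number of rows written."""
    rows = 0
    with open_text(jsonl_path, "r") as fin, open_text(tsv_path, "w") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Assumption: each record is a JSON array of at least two texts.
            # Anything else (for example a JSON object) is skipped.
            if not isinstance(record, list) or len(record) < 2:
                continue
            fout.write("\t".join(clean(t) for t in record) + "\n")
            rows += 1
    return rows
```

For example, `jsonl_to_tsv("pairs.jsonl.gz", "pairs.tsv.gz")` (placeholder file names) would write a gzip-compressed TSV with one pair or triplet per row. Whether the training script accepts that layout as-is depends on how it parses its input files, so compare the output against what the script expects.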
Hi, I want to train a Hugging Face model with MultipleNegativesRankingLoss, but the datasets listed in the table (https://www.sbert.net/examples/training/paraphrases/README.html#datasets) cannot be downloaded. Could you give me an example program that shows how to convert the original data files into the format required by the training script (https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/paraphrases/training.py)? Thanks a lot!