Open MotamedNia opened 4 years ago
An MSE loss between 2 and 4 sounds good.
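For reference, here is a minimal sketch of how an MSE between teacher and student embeddings can be computed. It is plain Python, not the library's own code. The function name `mean_squared_error` and the `scale` parameter are hypothetical; `scale` is only there because some evaluation scripts report the value multiplied by a constant, so check which convention your number follows before comparing.

```python
def mean_squared_error(teacher, student, scale=1.0):
    """Mean squared error over all dimensions of paired embedding vectors.

    teacher, student: lists of equal-length float vectors, where
    teacher[i] and student[i] belong to the same sentence pair.
    scale: hypothetical multiplier (assumption, e.g. 100 if the
    reported value is scaled); 1.0 gives the raw MSE.
    """
    total = 0.0
    count = 0
    for t_vec, s_vec in zip(teacher, student):
        for t, s in zip(t_vec, s_vec):
            total += (t - s) ** 2
            count += 1
    return scale * total / count


# Toy example: one sentence pair with 2-dimensional embeddings.
raw = mean_squared_error([[1.0, 2.0]], [[1.0, 4.0]])
print(raw)
```
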
One option is to use the translation evaluator: https://www.sbert.net/docs/package_reference/evaluation.html#sentence_transformers.evaluation.TranslationEvaluator
You pass it a list of parallel sentences (around 1k to 10k pairs) that were not seen during training. For each source sentence, it then tries to find the correct translation among the target sentences and prints an accuracy score.
Scores of 90 - 95% accuracy are quite good. They show that the vector spaces are well aligned for the two languages.
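To make the idea concrete, here is a rough sketch of the matching step described above, written in plain Python rather than with the library itself. It assumes the sentences are already embedded and that `src_emb[i]` and `trg_emb[i]` are translations of each other; the helper names `cosine` and `translation_accuracy` are made up for this illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translation_accuracy(src_emb, trg_emb):
    """Fraction of source sentences whose most similar target is the true translation."""
    correct = 0
    for i, src in enumerate(src_emb):
        # Index of the target embedding closest to this source embedding.
        best = max(range(len(trg_emb)), key=lambda j: cosine(src, trg_emb[j]))
        if best == i:
            correct += 1
    return correct / len(src_emb)


# Toy embeddings: the second source sentence is matched to the wrong target.
src = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
trg = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
print(translation_accuracy(src, trg))
```

With real data you would embed the held-out source and target sentences with the trained model first; the matching can also be run in the other direction (target to source) as a second check.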
I am very thankful you are considering my problem. Comprehensive solution :)
Hi, I want to use your pre-trained model for semantic search. I created a parallel dataset that contains academic paper titles in different languages, and I used the provided code to train a multilingual model. Now I want to know what MSE loss value is acceptable and shows that the model is trained well. I reached 3.xxx; should I continue the training process? And is there a method to evaluate the model? There is no STS dataset for my target language.
Thank you for your consideration,