UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Example training of quora_duplicate_questions : Test Dataset Evaluation #685

Closed: working12 closed this issue 3 years ago

working12 commented 3 years ago

The linked example script uses two splits: one for training (train pairs) and one for validation (dev_pairs). A typical pipeline also has a final test dataset, and I have a test pairs portion on which I need to run the final trained model.

So, if we want to evaluate on the test set once training has finished, how can we do it? Do we need to create an evaluator for the test dataset and append it to the evaluators, just like the 3 dev pairs evaluators in the example?

Thanks.

nreimers commented 3 years ago

Hi @working12. Yes, you can use the evaluator again on your test set. Alternatively, you can implement your own code to measure performance on the test set.

For an example that uses an evaluator class, see: https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/sts/training_stsbenchmark.py

working12 commented 3 years ago

@nreimers Thanks. You are the best.