Evaluation on the basis of Precision and Recall

UKPLab / sentence-transformers

State-of-the-Art Text Embeddings

https://www.sbert.net

Apache License 2.0

Evaluation on the basis of Precision and Recall #259

Open wadhwasahil opened 4 years ago

wadhwasahil commented 4 years ago

Hi,

My task is similar to sentence pair classification (similar to NLI training). However, I need to evaluate my model's performance using precision and recall. From reading your code, it looks like the encode function gives me one embedding per sentence. In addition to the embeddings, is there a way to get the probability that a pair of sentences is positive (or negative)?

Thanks

nreimers commented 4 years ago

Hi @wadhwasahil, this is currently not implemented.

The question is how you want to use it, i.e., in which setup. A common application is, for example, identifying duplicate questions in a large corpus of thousands or millions of questions.

In that case, you would usually need to find a threshold. Sentences/questions with a cosine similarity above the threshold would be considered duplicates. Then you can compute precision and recall.

It is currently not implemented in the code, but implementing it yourself is not that difficult.
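As a rough illustration of the thresholding approach described above, here is a minimal sketch in plain Python. It is not part of sentence-transformers: the helper names `cosine_similarity` and `precision_recall`, the toy embeddings, and the threshold of 0.5 are all assumptions made for the example. In practice the embeddings would come from the model's encode step, and the threshold would need to be chosen for your data.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def precision_recall(scores, labels, threshold):
    """Treat pairs scoring above `threshold` as positive, then count outcomes."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy stand-ins for sentence embeddings; real ones would come from the model.
emb_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
emb_b = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
labels = [1, 0, 0, 1]  # 1 = pair is a duplicate, 0 = it is not

scores = [cosine_similarity(u, v) for u, v in zip(emb_a, emb_b)]
precision, recall = precision_recall(scores, labels, threshold=0.5)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these toy pairs, one true duplicate is found, one non-duplicate is wrongly flagged, and one duplicate is missed, so both precision and recall come out at 0.5.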

faicalbounedjar commented 1 year ago

Can you provide more information, please, @nreimers? I am using a sentence transformer and want to evaluate the model after fine-tuning it. I don't want accuracy; I want precision, recall, and the F1 score. I am using the NatCat dataset. Thank you for your time.
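One possible way to get precision, recall, and F1 from similarity scores is to sweep candidate thresholds on a labeled development set and keep the one with the best F1. The sketch below assumes you already have one cosine similarity score and one binary label per sentence pair. The function names `prf_at_threshold` and `best_threshold` and the toy scores are hypothetical, not part of the library.

```python
def prf_at_threshold(scores, labels, threshold):
    """Precision, recall, and F1 when pairs with score >= threshold are positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def best_threshold(scores, labels):
    """Try every distinct score as a threshold and return the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: prf_at_threshold(scores, labels, t)[2])


# Toy development-set scores and labels (assumed values, for illustration only).
dev_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
dev_labels = [1, 1, 0, 1, 0]

best_t = best_threshold(dev_scores, dev_labels)
best_p, best_r, best_f1 = prf_at_threshold(dev_scores, dev_labels, best_t)
print(f"threshold={best_t} precision={best_p:.2f} recall={best_r:.2f} f1={best_f1:.2f}")
```

The threshold should be picked on development data and then applied unchanged to the test set, so that the reported scores are not tuned on the data they are measured on.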