klarna / product-page-dataset


How to compute the predictive accuracy? #20

Closed · yyzhuang1991 closed this issue 2 years ago

yyzhuang1991 commented 2 years ago

I wonder what function in the evaluation script is used to compute the predictive accuracy. Is it predict_on_test_set() or eval_on_test_set() in this script? If it is the first one, what top-N value did you use to report the predictive accuracy in Table 3 in the appendix?

Thanks in advance.

stefanmagureanu commented 2 years ago

Hello and thanks for reaching out! I believe this part returns the candidates and this part counts correct answers in top-N. In the paper we present only the top-1 results, as these are the most relevant in practice and therefore make for the best benchmark. Let us know if you need more info or if this doesn't answer your question! Best wishes, Stefan.
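
For illustration, here is a minimal sketch of that top-N counting. The names are hypothetical (`rank_candidates` and the `(candidates, correct)` page tuples are invented for this sketch); the actual evaluation script's interface differs:

```python
from typing import Callable, Iterable, Sequence


def top_n_accuracy(
    pages: Iterable[tuple[Sequence[str], str]],
    rank_candidates: Callable[[Sequence[str]], list[str]],
    n: int = 1,
) -> float:
    """Fraction of pages whose labeled element appears in the top-N candidates.

    `pages` yields (candidate_elements, correct_element) pairs and
    `rank_candidates` returns a page's candidates best-first. Both names
    are illustrative, not the real script's API.
    """
    pages = list(pages)
    hits = sum(
        correct in rank_candidates(candidates)[:n]
        for candidates, correct in pages
    )
    return hits / len(pages)
```

With `n=1` this reduces to the top-1 predictive accuracy reported in the paper.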

yyzhuang1991 commented 2 years ago

I see. That helps! Do you have any idea what the precision, recall, and F1 scores look like in this task?

stefanmagureanu commented 2 years ago

Precision and recall do not differ in this prediction/nomination task, so here accuracy = recall = precision: for each page there is exactly one correct answer (one labeled element per class), and the algorithm either gets it right or wrong. At the end, we measure how many answers it got right. In the classification task, we sample a set of elements and ask the classifiers to label them; there the F1 score is more representative. Does this make sense?
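
A quick way to see the equivalence, using invented toy numbers: with exactly one labeled element and one nomination per page, every wrong nomination is simultaneously a false positive and a false negative, so TP + FP = TP + FN = number of pages and all three metrics collapse to the same ratio:

```python
# Invented toy data: one nominated and one labeled element per page.
predictions = ["a", "b", "c", "d"]   # nominated element per page
ground_truth = ["a", "x", "c", "y"]  # labeled element per page

tp = sum(p == g for p, g in zip(predictions, ground_truth))  # correct nominations
fp = len(predictions) - tp   # each wrong nomination is a false positive...
fn = len(ground_truth) - tp  # ...and also a false negative (the label was missed)

print(tp / (tp + fp))         # precision: 0.5
print(tp / (tp + fn))         # recall:    0.5
print(tp / len(predictions))  # accuracy:  0.5
```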

yyzhuang1991 commented 2 years ago

That makes sense. Thanks.