Closed lihuiliullh closed 1 year ago
Hi, in general P@1 is calculated by judging whether the top-1 ranked prediction made by our model for one particular question is correct, whereas for Hits@5 we look at the top-5 ranked predictions. In our ConvRef dataset we have collected up to 4 reformulations for each intent. So what we mean by the sentence you underlined is that if any of these reformulations for one intent gets to the correct answer (i.e., the model's top-1 prediction is correct), we have P@1 = 1 (so our QA model in fact has several tries to get to the correct answer). I hope this clarifies it a bit more.
Best Regards, Magdalena
Thanks. Is it possible that reformulation1 and reformulation2 have the same top-1 prediction? For example, the top-1 prediction of reformulation1 is A, which is a wrong answer. If the top-1 prediction of reformulation2 is also A, will you choose a different one?
Yes, it is possible that the model predicts the same wrong top-1 answer for two reformulations. And yes, if further reformulations are available for the respective intent, we would issue another reformulation.
If no further reformulation is available, is the P@1 for this query 0?
Yes, if the top-1 prediction was wrong and no further reformulations are available, P@1 is 0.
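Just to make sure I understand the rule described above, here is a minimal sketch of it in Python. The `turns` structure (a list of top-1 prediction / gold answer pairs, one per initial question or reformulation) is hypothetical and not the actual CONVEX evaluation code — it only illustrates "P@1 = 1 if any reformulation's top-1 prediction is correct, else 0":

```python
def p_at_1(turns):
    """P@1 for one intent.

    `turns` is a list of (top1_prediction, gold_answer) pairs for the
    initial question and its (up to 4) reformulations — a hypothetical
    structure for illustration only. P@1 is 1 if the model's top-1
    prediction matches the gold answer on any turn, otherwise 0.
    """
    return 1 if any(pred == gold for pred, gold in turns) else 0

# Intent where the second reformulation finally reaches the gold answer B:
print(p_at_1([("A", "B"), ("A", "B"), ("B", "B")]))  # 1

# Intent where no reformulation succeeds:
print(p_at_1([("A", "B"), ("C", "B")]))  # 0
```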
Can you elaborate a little bit more on how to calculate the P@1 metric?
If P@1 is 0 when the correct answer is not found after five turns, why isn't P@1 the same as Hits@5?