ykdojo / personalized_search_challenge

Attempt on a Kaggle competition, Personalized Web Search Challenge, hosted by Yandex (http://www.kaggle.com/c/yandex-personalized-web-search-challenge)

Calculate the global means and skipped means using 2^rel instead of plain rel. #47

Open ykdojo opened 10 years ago

ykdojo commented 10 years ago

http://www.kaggle.com/c/yandex-personalized-web-search-challenge/details/evaluation
https://www.kaggle.com/wiki/NormalizedDiscountedCumulativeGain

In the nDCG calculation, the gain is computed from 2^relevance_rate instead of the plain relevance_rate.

So the actual relevance rates used for calculations are: 1, 2, 4 (= 2^0, 2^1, 2^2) instead of: 0, 1, 2

So, perhaps we should calculate the relevance means using (1,2,4) instead of (0,1,2).
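A rough sketch of the difference, not the repository's actual code: `mean_relevance` and the sample `labels` list are hypothetical names and data made up for illustration. Some nDCG definitions use 2^rel - 1 as the gain; that only shifts every mean by a constant 1, so the comparison between means is unchanged.

```python
def mean_relevance(labels, exponential=True):
    """Mean of 2**rel (or of plain rel) over relevance labels in {0, 1, 2}.

    Hypothetical helper for illustration; `labels` is assumed to be a
    non-empty list of integer relevance rates.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    if exponential:
        # Map 0, 1, 2 to 1, 2, 4 before averaging.
        values = [2 ** rel for rel in labels]
    else:
        values = list(labels)
    return sum(values) / len(values)


# Made-up sample: two irrelevant results, one relevant, one highly relevant.
labels = [0, 0, 1, 2]
plain_mean = mean_relevance(labels, exponential=False)  # (0 + 0 + 1 + 2) / 4
exp_mean = mean_relevance(labels)                       # (1 + 1 + 2 + 4) / 4
print(plain_mean, exp_mean)
```

With the exponential mapping, a single highly relevant result pulls the mean up more than it does with plain labels, which is closer to how the metric rewards it.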

Another thing to do is to calculate, for each rank, the ratio of each relevance rate. If we do this, the output will look like:

<rank 1> 2: 45% 1: 35% 0: 20%

<rank 2> ... (and so on)
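
A minimal sketch of how the per-rank ratios could be computed, assuming the data is available as `(rank, relevance_rate)` pairs. The function names and the sample observations are hypothetical, not taken from the repository.

```python
from collections import Counter, defaultdict


def relevance_ratio_by_rank(observations):
    """Return {rank: {rel: share}} from an iterable of (rank, rel) pairs.

    Hypothetical helper; relevance rates are assumed to be 0, 1, or 2.
    """
    counts = defaultdict(Counter)
    for rank, rel in observations:
        counts[rank][rel] += 1

    ratios = {}
    for rank, counter in counts.items():
        total = sum(counter.values())
        # Counter returns 0 for missing keys, so every rate gets an entry.
        ratios[rank] = {rel: counter[rel] / total for rel in (2, 1, 0)}
    return ratios


def format_ratios(ratios):
    """Render the ratios in the '<rank N> 2: x% 1: y% 0: z%' layout."""
    lines = []
    for rank in sorted(ratios):
        parts = " ".join(
            f"{rel}: {share:.0%}" for rel, share in ratios[rank].items()
        )
        lines.append(f"<rank {rank}> {parts}")
    return "\n".join(lines)


# Made-up observations, only to show the output shape.
observations = [(1, 2), (1, 2), (1, 1), (1, 0), (2, 0), (2, 1)]
print(format_ratios(relevance_ratio_by_rank(observations)))
```

The shares for each rank sum to 100% before rounding, so they can be compared directly across ranks.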