csmetrics / csmetrics.net

webapp for csmetrics.net
http://csmetrics.net/
Creative Commons Attribution 4.0 International

Geometric average on measured and predicted #63

Open wpzdm opened 2 years ago

wpzdm commented 2 years ago

Hi,

Could you share the motivation behind using a geometric average of the measured and predicted values? Why not just an arithmetic average?

I often feel the current geometric average gives counterintuitive results, usually because one of the measured and predicted values is small. Yes, an \epsilon is added to the original value, but it doesn't seem to help much. Even if the predicted value is not near zero but, say, 1/5 of the measured value, the result seems over-penalized. That is, the averaged value of univ_A can be lower than that of univ_B even when measured_univ_A is considerably larger than measured_univ_B!

Here is another way to look at this. Suppose we set \alpha (the weight) to 0.5, meaning the measured and predicted values are equally important. Under the geometric average, the rank is then determined by measured × predicted, which seems unintuitive.
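To spell out that step, here is a short derivation. It assumes the combined score is a weighted geometric mean with \epsilon added to both values. The symbols m (measured) and p (predicted), and the side of the product that \alpha applies to, are my assumptions and not taken from the site's code.

```latex
\[
  s(\alpha) = (m + \epsilon)^{\alpha}\,(p + \epsilon)^{1-\alpha}
  \quad\Longrightarrow\quad
  s\!\left(\tfrac{1}{2}\right) = \sqrt{(m + \epsilon)(p + \epsilon)}
\]
% The square root is strictly increasing, so ordering institutions by
% s(1/2) is the same as ordering them by the product (m + eps)(p + eps).
```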

It seems to me that the arithmetic average just works, but maybe I'm missing something.

Thanks, Abel

shinminjeong commented 2 years ago

Hi Abel @wpzdm, thank you for sharing your thoughts!

Our team also considered both the geometric and the arithmetic mean, and decided to use the geometric mean. This is because when numbers differ widely in scale, the arithmetic mean tends to ignore the smaller numbers and be driven only by the larger ones. The geometric mean is more equitable.
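A small sketch of that effect, using made-up values on very different scales (the numbers and function names are illustrative assumptions, not data or code from the site). It doubles only the smaller value and compares how much each mean moves.

```python
import math


def arithmetic_mean(a, b):
    """Unweighted arithmetic mean of two values."""
    return (a + b) / 2


def geometric_mean(a, b):
    """Unweighted geometric mean of two positive values."""
    return math.sqrt(a * b)


# Hypothetical values on very different scales.
large, small = 10000.0, 10.0

# Relative change in each mean when only the small value doubles.
arith_change = arithmetic_mean(large, 2 * small) / arithmetic_mean(large, small) - 1
geo_change = geometric_mean(large, 2 * small) / geometric_mean(large, small) - 1

print(f"arithmetic: {arith_change:.2%}")  # arithmetic: 0.10%
print(f"geometric:  {geo_change:.2%}")    # geometric:  41.42%
```

The arithmetic mean barely registers the doubling, while the geometric mean responds to the relative change in either value.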

Kind regards, Minjeong

wpzdm commented 2 years ago

Hi Minjeong,

Thanks for the clarification. Now I understand the motivation.

Maybe what's problematic for me is just that the geometric mean is too equitable. Continuing my idea above, let's set \alpha (the weight) to 0.5. Then univ_A with measured=3000 and predicted=3000 will be ranked much higher than univ_B with measured=5000 and predicted=1000, because the latter is less "equitable". But this seems to run counter to our initial intention: weight = 0.5 means the measured and predicted values are equally important, so univ_A and univ_B should tie. There seems to be no good way to resolve this with the geometric mean.
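For concreteness, a minimal sketch of this comparison. It assumes the site's score is a weighted geometric mean of the form measured^\alpha × predicted^(1-\alpha); the function names and the `eps` parameter are hypothetical, and at \alpha = 0.5 it makes no difference which side the weight applies to.

```python
def weighted_arithmetic(measured, predicted, alpha=0.5):
    """Weighted arithmetic mean; alpha is the weight on the measured value."""
    return alpha * measured + (1 - alpha) * predicted


def weighted_geometric(measured, predicted, alpha=0.5, eps=0.0):
    """Weighted geometric mean; eps stands in for the small offset (assumed form)."""
    return (measured + eps) ** alpha * (predicted + eps) ** (1 - alpha)


univ_a = (3000, 3000)  # (measured, predicted)
univ_b = (5000, 1000)

for name, (m, p) in (("univ_A", univ_a), ("univ_B", univ_b)):
    print(
        name,
        "arithmetic:", round(weighted_arithmetic(m, p), 1),
        "geometric:", round(weighted_geometric(m, p), 1),
    )
```

Both universities get an arithmetic score of 3000, while the geometric score is about 3000 for univ_A and about 2236 for univ_B.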

On the other hand, the arithmetic average doesn't have this problem. Moreover, if someone really thinks univ_B should rank lower, they could simply give more weight to the predicted value.

Anyway, I don't think we need to choose between the two. Why not just build both into the site? It should only take a few lines of code. How does that sound to you?
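Roughly what I have in mind, as a sketch only: `combine`, `rank`, the `method` switch and the `eps` default are all hypothetical names and values, not the site's actual code, and I'm assuming \alpha is the weight on the measured value.

```python
def combine(measured, predicted, alpha=0.5, method="geometric", eps=1e-9):
    """Combine the two values; alpha is the weight on the measured value (assumed)."""
    if method == "geometric":
        return (measured + eps) ** alpha * (predicted + eps) ** (1 - alpha)
    if method == "arithmetic":
        return alpha * measured + (1 - alpha) * predicted
    raise ValueError(f"unknown method: {method!r}")


def rank(institutions, **kwargs):
    """Return institution names sorted by combined score, highest first."""
    return sorted(
        institutions,
        key=lambda name: combine(*institutions[name], **kwargs),
        reverse=True,
    )


# name -> (measured, predicted), using the example from this thread.
institutions = {"univ_A": (3000, 3000), "univ_B": (5000, 1000)}

print(rank(institutions, method="geometric"))              # univ_A first
print(rank(institutions, method="arithmetic", alpha=0.8))  # univ_B first
print(rank(institutions, method="arithmetic", alpha=0.2))  # univ_A first
```

With the arithmetic option, the weight alone decides whether univ_B's lower predicted value pulls it below univ_A.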

Best, Abel