sinanshi / LPJmL_Calibration


willmott #18

Closed mfader closed 9 years ago

mfader commented 9 years ago

willmott is computed with a different formula than in Christoph's version. Why?

Christoph: willmott <- 1 - sum((p - o)^2) / sum((abs(p - mean(o)) + abs(o - mean(o)))^2)

Sinan: willmott <- 1 - sum(abs(p - o)) / sum((abs(p - mean(o)) + abs(o - mean(o))))
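
For reference, a minimal sketch of the two variants as R functions. The function names `willmott_d` and `willmott_d1` and the sample vectors are made up for illustration and are not taken from the repository; `p` is assumed to hold predictions and `o` observations, as in the formulas above.

```r
# Squared form (labelled "Christoph" above).
willmott_d <- function(p, o) {
  1 - sum((p - o)^2) / sum((abs(p - mean(o)) + abs(o - mean(o)))^2)
}

# Absolute-value form (labelled "Sinan" above).
willmott_d1 <- function(p, o) {
  1 - sum(abs(p - o)) / sum(abs(p - mean(o)) + abs(o - mean(o)))
}

# Hypothetical data: four exact predictions and one large miss.
o <- c(1, 2, 3, 4, 5)
p <- c(1, 2, 3, 4, 10)

d  <- willmott_d(p, o)   # 1 - 25/105
d1 <- willmott_d1(p, o)  # 1 - 5/17
print(c(d = d, d1 = d1))
```

Both functions return 1 for a perfect fit (`p` identical to `o`); they differ only in whether the deviations are squared before summing.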

sinanshi commented 9 years ago

I have realised that this is indeed a real issue. We went through this one in January. You can check your OT-Med email account.

http://climate.geog.udel.edu/~climate/publication_html/Pdf/WRM_IJOC_2012.pdf

I think the Willmott index performs badly when the sample size is extremely small, because squaring magnifies the error considerably. On the second page they compare these two equations, and they say that the non-squared version was adopted in the end.

mfader commented 9 years ago

Awesome! Then it means it is right for the current case, but would be wrong for the global runs. I will just add a comment on the willmott function and the init.r! That's it!

sinanshi commented 9 years ago

No, I think you should raise this issue with the LPJ community, especially Christoph, since he is the one using the old version. I have read these papers (Willmott, 1985), and I found that in our case, statistically, we should adopt the new version.

mfader commented 9 years ago

Yes, sure, but PIK runs at the global level. There the sample size should always be "good", so the error from the equation with the squares should be small.

sinanshi commented 9 years ago

No, the sample size is not the fundamental problem; the sensitivity to outliers is. Since we have far fewer samples, the outliers have a much more significant impact on the result.

"d1 also is much less sensitive to the shape of the error-frequency distribution and, as a consequence, to errors concentrated in outliers."

sinanshi commented 9 years ago

Squaring inflates the error distance too much, which is a bit aggressive.
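
A small sketch of what "concentrated in outliers" means for the numerator of the index. The error vector `e` is invented for illustration (four small errors and one outlier); it is not data from the calibration.

```r
# Hypothetical prediction errors (p - o): four small errors, one outlier.
e <- c(1, 1, 1, 1, 6)

# Share of the total error term contributed by the outlier alone.
share_abs <- abs(e[5]) / sum(abs(e))  # absolute errors: 6 / 10
share_sq  <- e[5]^2 / sum(e^2)        # squared errors: 36 / 40

print(c(absolute = share_abs, squared = share_sq))
```

With these numbers the single outlier accounts for 60% of the summed absolute errors but 90% of the summed squared errors, so in the squared form one bad point dominates the numerator.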

mfader commented 9 years ago

OK, but that is what I am saying too. That means, for our case, with a small sample size (= Med. countries), we should for sure use the equation without squares. For global runs like the ones done at PIK, since the sample size is large (mainly around 150 countries), using the equation with squares is not a big error. Do I get that right?

sinanshi commented 9 years ago

Right! However, it is necessary to change it for the global runs too. Otherwise we will have two metrics for the same system. We can keep it as it is for now. But to be prudent, we can put the question to the whole community. I have just read these two papers, and I didn't read anything else about this issue. I don't know whether Christoph's version (i.e. d) is a convention in vegetation modelling or whether they have some other statistical concerns.

mfader commented 9 years ago

Yes, good idea; it is a good question that I can raise when we discuss merging with the trunk. For now, we can stick with our version.

sinanshi commented 9 years ago

Yes