pySTEPS / pysteps

Python framework for short-term ensemble prediction systems.
https://pysteps.github.io/
BSD 3-Clause "New" or "Revised" License

Conditioning of rank histograms #34

Closed. loforest closed this issue 5 years ago

loforest commented 5 years ago

https://github.com/pySTEPS/pysteps/blob/40572e2465675c95fda689f9107739a4771967a8/pysteps/verification/ensscores.py#L167

The current implementation of rank histograms is not optimal when the threshold X_min is set to a high value (e.g. 10 mm/h). In such cases, the condition for ignoring observation-forecast pairs is not restrictive enough. This is especially visible when all but one of the M ensemble members are equal to 0. If the observation is 0, it is randomly assigned to one of the first M-1 bins. If the observation is larger than the only nonzero ensemble member (which occurs often), it is added to bin M+1. The probability of falling in bin M is therefore very low. In addition, the histogram is flat for all bins up to M-1 (due to the random assignment), which is a bit misleading. I wonder how much this random assignment also affects the rank histograms for lower values of the X_min threshold.
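
To make the effect concrete, here is a minimal sketch of a thresholded rank histogram with random tie-breaking. It is not the pysteps code at the linked commit: the function names, the censoring of sub-threshold values to 0, and the rule "keep a pair if the observation or any member reaches X_min" are assumptions made for illustration. In this sketch ties are broken uniformly over every position the observation could occupy, so censored observations spread over the first M bins; the exact range of bins used by pysteps may differ.

```python
import numpy as np


def rank_with_random_ties(members, obs, rng):
    """0-based rank of obs among members; ties are broken uniformly at random."""
    below = int(np.sum(members < obs))
    ties = int(np.sum(members == obs))
    # obs may take any of the ties + 1 positions among the tied members
    return below + int(rng.integers(0, ties + 1))


def conditional_rank_histogram(X_f, X_o, x_min, seed=0):
    """Counts per bin (M + 1 bins) for forecasts X_f (n, M) and observations X_o (n,).

    Hypothetical helper: values below x_min are censored to 0, and a pair is
    kept only if the observation or at least one member reaches x_min.
    """
    rng = np.random.default_rng(seed)
    X_f = np.where(X_f < x_min, 0.0, X_f)
    X_o = np.where(X_o < x_min, 0.0, X_o)
    keep = (X_o >= x_min) | np.any(X_f >= x_min, axis=1)

    n_members = X_f.shape[1]
    counts = np.zeros(n_members + 1, dtype=int)
    for members, obs in zip(X_f[keep], X_o[keep]):
        counts[rank_with_random_ties(members, obs, rng)] += 1
    return counts


# Synthetic case from the description: only one member exceeds the threshold.
M = 5
X_f = np.zeros((300, M))
X_f[:, 0] = 12.0
# 200 observations below the threshold, 100 above the single nonzero member
X_o = np.concatenate([np.zeros(200), np.full(100, 20.0)])
print(conditional_rank_histogram(X_f, X_o, x_min=10.0))
```

With this synthetic input, every observation above the nonzero member lands in the last bin, while the censored observations are scattered across the lower bins purely by the random tie-breaking, which is the flat and potentially misleading part of the histogram.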

pulkkins commented 5 years ago

The issue of missing values in bin M has been fixed in commit 8e5088c.