pySTEPS / pysteps

Python framework for short-term ensemble prediction systems.
https://pysteps.github.io/
BSD 3-Clause "New" or "Revised" License

CRPS calculation #362

Closed CRiesewijk closed 3 months ago

CRiesewijk commented 3 months ago

Hi,

For my master's thesis, I am investigating the potential added value of probabilistic forecasting in inventory management. I have produced a probabilistic forecast using kernel density estimation (KDE), and I want to measure the accuracy of the resulting pdf with the pysteps CRPS.

However, the CRPS function has this signature: pysteps.verification.probscores.CRPS(X_f, X_o)

X_f (array_like) – Array of shape (k,m,n,…) containing the values from an ensemble forecast of k members with shape (m,n,…).
X_o (array_like) – Array of shape (m,n,…) containing the observed values corresponding to the forecast.

However, I want to score the forecast at a single point, where I have one observed value and a forecast pdf at that point (given as quantities and their densities). I find it strange that the function does not accept the densities of the pdf.
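Note: one way to bridge a density forecast and an ensemble-style interface is to draw samples from the density and treat them as ensemble members. The sketch below is not the pysteps implementation; it is a plain-Python illustration of the sample-based identity CRPS = E|X - y| - 0.5 E|X - X'|. The function name `crps_from_samples` and the Gaussian stand-in for the KDE samples are assumptions made for this example. Going by the signature quoted above, the same samples could presumably be passed to pysteps as X_f with shape (k, 1) and X_o with shape (1,), but that is untested here.

```python
import random


def crps_from_samples(samples, observation):
    """Empirical CRPS of a sample-based forecast for one scalar observation.

    Hypothetical helper (not part of pysteps). Uses the identity
    CRPS = E|X - y| - 0.5 * E|X - X'|, where both expectations are taken
    over the empirical distribution of the samples.
    """
    k = len(samples)
    # Mean absolute error of the members against the observation
    mae = sum(abs(x - observation) for x in samples) / k
    # Mean absolute difference over all member pairs, computed from the
    # sorted samples: sum_ij |x_i - x_j| = 2 * sum_i (2i - k + 1) * x_(i)
    xs = sorted(samples)
    spread = 2.0 * sum((2 * i - k + 1) * x for i, x in enumerate(xs)) / (k * k)
    return mae - 0.5 * spread


random.seed(0)
# Assumption: Gaussian draws stand in for samples taken from the fitted KDE
samples = [random.gauss(100.0, 15.0) for _ in range(2000)]
score = crps_from_samples(samples, observation=110.0)
print(f"CRPS from {len(samples)} samples: {score:.3f}")
```

The estimate depends on the number of samples, so it is worth checking that the score is stable when the sample count is increased.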

Therefore, I wrote my own implementation:

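# CRPS for one observation against a forecast CDF tabulated on a quantity grid
import numpy as np
import pandas as pd

# Take the observation as a scalar (.item() raises if the date matches more than one row)
observed_value = synthetic_data_df.loc[synthetic_data_df['dates'] == date, 'quantity'].item()

# Observed CDF is a step function: 0 below the observation, 1 at and above it
observed_cdf = pd.DataFrame({
    'Quantity': cdf_df_forecast_Gaussian_1['Quantity'],
    'Observed_CDF': (cdf_df_forecast_Gaussian_1['Quantity'] >= observed_value).astype(float),
})

# Integrate the squared CDF difference over the grid with the trapezoidal rule
# (np.trapz is named np.trapezoid in NumPy >= 2.0)
squared_difference_Gaussian_1 = (cdf_df_forecast_Gaussian_1['CDF'] - observed_cdf['Observed_CDF']) ** 2
crps_Gaussian_1 = np.trapz(squared_difference_Gaussian_1, cdf_df_forecast_Gaussian_1['Quantity'])
crps_Gaussian_1_list.append(crps_Gaussian_1)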
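Note: a grid-based CRPS like the one above can be sanity-checked on a case with a known answer. For a Gaussian forecast N(mu, sigma) the CRPS has the closed form sigma * (z * (2 * Phi(z) - 1) + 2 * phi(z) - 1 / sqrt(pi)), with z = (y - mu) / sigma. The sketch below compares that value with the trapezoidal integral of the squared CDF difference. All function names, the grid, and the parameter values are assumptions chosen for illustration.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def crps_normal_closed_form(y, mu, sigma):
    """Closed-form CRPS of a Gaussian forecast for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = normal_cdf(y, mu, sigma)
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def crps_on_grid(grid, forecast_cdf, y):
    """Trapezoidal integral of (F(q) - 1{q >= y})^2 over the grid."""
    sq = [(f - (1.0 if q >= y else 0.0)) ** 2 for q, f in zip(grid, forecast_cdf)]
    return sum(
        0.5 * (sq[i] + sq[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )


# Illustrative values only: forecast N(100, 15), observation 110
mu, sigma, y = 100.0, 15.0, 110.0
n = 4001
lo, hi = mu - 8 * sigma, mu + 8 * sigma  # wide enough that the tails are negligible
grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
cdf_values = [normal_cdf(q, mu, sigma) for q in grid]

numeric = crps_on_grid(grid, cdf_values, y)
exact = crps_normal_closed_form(y, mu, sigma)
print(f"grid: {numeric:.4f}  closed form: {exact:.4f}")
```

The two values should agree up to a discretization error that comes mostly from the grid cell containing the observation, where the step function jumps. A coarse quantity grid therefore biases the score, and the grid must also span the full support of the forecast.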

I was wondering whether I am misinterpreting the CRPS function, and how it could work for my case. I hope you can help, since I think there must be an appropriate package for this.

dnerini commented 3 months ago

Hi @CRiesewijk, thanks for getting in touch. I'm not sure I fully understand, but in essence we have a minimal implementation of the CRPS that covers our use case (that is, ensemble forecasting). If you need a more extensive set of methods, I suggest looking into more specialized packages. For example, I would strongly recommend this excellent library maintained by @frazane: https://github.com/frazane/scoringrules.

Hope this helps. I'll close this issue; feel free to reopen it if you feel it needs more input from our side.