Energy Score - GitHub Issues

AlessandroFB commented 3 weeks ago

Hello Sir, very sorry to disturb you during this holiday period. I'm nearing the end of my internship, and the calculations for the Energy Score and Variogram Score were successful thanks to your help and the clarifications you provided. The image below is one example of a day-ahead forecast for one station; the same was done for 4 stations simultaneously.

So the forecast array for each day had shape (40, 24, 4), with 40 being the number of ensemble members, 24 the forecast horizon, and 4 the number of stations. The observation array for every single day of the test year (2016) had shape (24, 4).

My tutor had doubts concerning both scores. He was wondering whether we should have obtained one single score for every day (that is, the same Energy and Variogram score for all stations) or one Energy and Variogram score per station.

My work was to translate a benchmark model by van der Meer (2021), the Multivariate Probabilistic Ensemble (MuPEn), into Python. In his work, van der Meer (2021) had one Energy score and one Variogram score for each station. My tutor was wondering whether that was the correct way, or whether all stations should share the same score since we're forecasting for all stations simultaneously.

Could you please help me understand this better and/or point me to some works that might enlighten me?

Thank you for your understanding, and sorry again for bothering you during the holiday period.

Respectfully Yours,

Alessandro Fabiani Bigaunah

[Images: plot_2016-01-01_edflapossession, plot_2016-01-01_edfsaintandre, plot_2016-01-01_edfsaintleu, plot_2016-01-01_edfsaintpierre]

frazane commented 3 weeks ago

Hi Alessandro,

As written in the documentation, by default the energy score and the variogram score treat the second-to-last axis as the ensemble dimension and the last axis as the variable dimension, unless specified otherwise with m_axis and v_axis.

Let's consider concrete examples with the defaults.

If you want to compute one score for each station, then you need to pass an array of observations of shape (4, 24) and a forecasts array of shape (4, 40, 24), i.e. with the station axis moved to the front. You will obtain an array of shape (4,), which holds the score for each of the four stations.
If you want to compute a single score for all stations for a single day, then you need to rearrange your arrays and stack the station and time dimensions together. Then you pass an array of observations of shape (96,) (96 = 24 * 4) and a forecasts array of shape (40, 96). You will obtain an array of shape () (a scalar), which is the score for all stations and timesteps combined.
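To make the two layouts concrete, here is a minimal NumPy sketch of the reshaping, starting from the (40, 24, 4) forecasts and (24, 4) observations described in the question. The `energy_score` helper below is a hypothetical stand-in written for illustration (it uses the standard ensemble estimator with the ensemble on the second-to-last axis and the variables on the last axis); it is not the library's implementation, and the random data is synthetic.

```python
import numpy as np


def energy_score(obs, fct):
    """Illustrative ensemble energy score (hypothetical helper, not the library code).

    obs: (..., D) observations, fct: (..., M, D) ensemble forecasts,
    i.e. ensemble members on axis -2 and variables on axis -1.
    """
    m = fct.shape[-2]
    # Mean Euclidean distance between each member and the observation.
    err = np.linalg.norm(fct - obs[..., None, :], axis=-1).mean(axis=-1)
    # Mean Euclidean distance between all pairs of members.
    diffs = fct[..., :, None, :] - fct[..., None, :, :]
    spread = np.linalg.norm(diffs, axis=-1).sum(axis=(-2, -1)) / m**2
    return err - 0.5 * spread


# Synthetic data with the shapes from the question (one day).
rng = np.random.default_rng(0)
fct = rng.normal(size=(40, 24, 4))  # (members, horizon, stations)
obs = rng.normal(size=(24, 4))      # (horizon, stations)

# Approach 1: one score per station. Stations become a batch axis,
# the 24 hours are the "variables".
fct_per_station = np.transpose(fct, (2, 0, 1))  # (4, 40, 24)
obs_per_station = obs.T                         # (4, 24)
es_per_station = energy_score(obs_per_station, fct_per_station)

# Approach 2: one joint score. Hours and stations are stacked into
# a single 96-dimensional variable axis (same ordering for both arrays).
fct_joint = fct.reshape(40, -1)  # (40, 96)
obs_joint = obs.reshape(-1)      # (96,)
es_joint = energy_score(obs_joint, fct_joint)

print(es_per_station.shape)  # (4,)
print(np.ndim(es_joint))     # 0
```

The same two pairs of arrays could then be passed to the library's own score functions with the default axes; only the rearrangement differs between the two approaches.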

As for "which one is best", it's entirely up to you and what you are looking for! If you are interested in questions like "what is the chance that the GHI is above threshold X at all stations simultaneously", then you care about the multivariate structure between stations, and you would need to go with the second approach. If you only care about the performance of each station individually, then you can go with the first approach.

AlessandroFB commented 3 weeks ago

Hello Sir,

Thanks a lot for your response. Thanks again for all the help you provided.

Respectfully Yours,

Alessandro F. B

frazane / scoringrules

Energy Score #35