frazane / scoringrules

Scoring rules for probabilistic forecast evaluation and optimization.
https://frazane.github.io/scoringrules/
Apache License 2.0
40 stars, 6 forks

test_energy.py resulting in shape error #19

Closed AlessandroFB closed 5 months ago

AlessandroFB commented 5 months ago

I tried running test_energy.py since my attempts at scoring a multivariate forecast weren't conclusive. I ran the test code exactly as provided here, and it returned the following error:


```
ValueError                                Traceback (most recent call last)
Cell In[12], line 32
     29 elif backend == "jax":
     30     assert isinstance(res, jax.Array)
---> 32 x = test_energy_score(backend)
     33 print(x)
     34 print(x.shape)

Cell In[12], line 25, in test_energy_score(backend)
     22 obs = np.random.randn(N, N_VARS)
     23 fct = np.expand_dims(obs, axis=-2) + np.random.randn(N, ENSEMBLE_SIZE, N_VARS)
---> 25 res = energy_score(obs, fct, backend=backend)
     27 if backend in ["numpy", "numba"]:
     28     assert isinstance(res, np.ndarray)

File ~\anaconda3\Lib\site-packages\scoringrules\_energy.py:52, in energy_score(forecasts, observations, m_axis, v_axis, backend)
     20 r"""Compute the Energy Score for a finite multivariate ensemble.
     21
     22 The Energy Score is a multivariate scoring rule expressed as
    (...)
     49     The computed Energy Score.
     50 """
     51 backend = backend if backend is not None else backends._active
---> 52 forecasts, observations = multivariate_array_check(
     53     forecasts, observations, m_axis, v_axis, backend=backend
     54 )
     56 if backend == "numba":
     57     return energy._energy_score_gufunc(forecasts, observations)

File ~\anaconda3\Lib\site-packages\scoringrules\core\utils.py:31, in multivariate_array_check(fcts, obs, m_axis, v_axis, backend)
     29 m_axis = m_axis if m_axis >= 0 else fcts.ndim + m_axis
     30 v_axis = v_axis if v_axis >= 0 else fcts.ndim + v_axis
---> 31 _multivariate_shape_compatibility(fcts, obs, m_axis)
     32 return _multivariate_shape_permute(fcts, obs, m_axis, v_axis, backend=backend)

File ~\anaconda3\Lib\site-packages\scoringrules\core\utils.py:12, in _multivariate_shape_compatibility(fcts, obs, m_axis)
     10 o_shape_broadcast = o_shape[:m_axis] + (f_shape[m_axis],) + o_shape[m_axis:]
     11 if o_shape_broadcast != f_shape:
---> 12 raise ValueError(
     13     f"Forecasts shape {f_shape} and observations shape {o_shape} are not compatible for broadcasting!"
     14 )

ValueError: Forecasts shape (100, 3) and observations shape (100, 51, 3) are not compatible for broadcasting!
```

Please help. Thanks a lot.

frazane commented 5 months ago

Hi Alessandro, thanks for reaching out! I believe the problem here is that the version of scoringrules installed in your environment differs from the version on which you are testing.

In tests, observations come first, which is a breaking change introduced in the latest release https://github.com/frazane/scoringrules/releases/tag/v0.5.0. On the other hand, I can see that the files of your conda environment still have forecasts before observations.

Updating scoringrules in your environment to 0.5.0 should solve the issue. Hope this helps πŸ˜ƒ
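For anyone hitting the same error: the shape check that raises it expects the forecast array to equal the observation array with one extra ensemble dimension inserted at `m_axis`. A minimal NumPy sketch of that logic (paraphrased from the traceback, not the library's exact code) shows why swapping the argument order reproduces the error:

```python
import numpy as np

def shapes_compatible(obs, fct, m_axis=-2):
    """Return True if fct equals obs with an ensemble axis inserted at m_axis."""
    m_axis = m_axis if m_axis >= 0 else fct.ndim + m_axis
    expected = obs.shape[:m_axis] + (fct.shape[m_axis],) + obs.shape[m_axis:]
    return expected == fct.shape

obs = np.random.randn(100, 3)      # (samples, variables)
fct = np.random.randn(100, 51, 3)  # (samples, ensemble members, variables)

shapes_compatible(obs, fct)  # True: (observations, forecasts), as in >= 0.5.0
shapes_compatible(fct, obs)  # False: swapped arguments give the error above
```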

AlessandroFB commented 5 months ago

Hello again sir, thank you for your response. I've tried your solution and the code now works fine :) I can implement it in my work.

Thanks a lot for the help sir.

Good day to you :)

AlessandroFB commented 5 months ago

Hello again sir, and my apologies for bothering you. I've run into an issue with the energy and variogram scores. Despite having the correct version of scoringrules installed, I'm still running into shape issues, this time with my own code.

I've attached a picture of the error I'm receiving. Basically, my forecast and observation arrays have the same shape, but the error says they aren't compatible.

Can you help, please?

Thank you for your comprehension and have a good day.

Respectfully Yours,

Alessandro F.B

P.s: If you require any additional files or information, I'll be glad to share.

Screenshot 2024-06-05 110744

frazane commented 5 months ago

Hi Alessandro, you are not bothering at all; you're welcome to ask questions 😄

So the first thing I see is that the observations and the forecasts have the same number of dimensions, which should never be the case: forecasts always have one extra dimension, the ensemble dimension. May I ask what each of the dimensions represents in your arrays?

AlessandroFB commented 5 months ago

Hello again sir, thanks for your quick and kind response.

So the forecast array has shape (12, 24), with 12 being the number of possible outcomes / forecasts and 24 being each hour of the day, as it is a day-ahead multivariate forecast. The same can be said of the observations array, except it has only one row, i.e. shape (1, 24). I've attached one of my graphs as well; I don't know if it helps.

Screenshot 2024-06-05 112532

frazane commented 5 months ago

So if I understand correctly, you are working with a single variable, right? The energy score and the variogram score are multivariate metrics, so you should only use them for multiple variables (for example, if you had multiple stations instead of a single one and were interested in the covariance structure of the forecast). I suggest you read some of the references we provide in the documentation 👍

In your case you should instead compute the CRPS, passing the observation array of shape (24,) and the forecast array of shape (24, 12).
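To make the suggestion concrete, here is a minimal NumPy sketch of the standard energy-form ensemble CRPS estimator applied to those shapes. The `crps_ensemble` function below is a local illustration, not necessarily the exact estimator the library uses; in scoringrules itself you would call `scoringrules.crps_ensemble(obs, fct)` with the ensemble along the last axis:

```python
import numpy as np

def crps_ensemble(obs, fct):
    """Energy-form CRPS estimator.

    obs: observations of shape (...,); fct: ensemble of shape (..., M),
    with the M members along the last axis.
    """
    m = fct.shape[-1]
    # mean distance between each member and the observation
    e1 = np.abs(fct - obs[..., None]).mean(axis=-1)
    # mean pairwise distance between members (spread term)
    e2 = np.abs(fct[..., :, None] - fct[..., None, :]).sum(axis=(-2, -1)) / (2 * m**2)
    return e1 - e2

obs = np.random.randn(24)      # one observed value per hour of the day
fct = np.random.randn(24, 12)  # 12 ensemble members per hour

crps = crps_ensemble(obs, fct)  # shape (24,): one CRPS per lead time
daily_score = crps.mean()       # aggregate over the day
```

Averaging the 24 per-hour values gives a single daily score, which is the usual way to summarize a day-ahead forecast.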

AlessandroFB commented 5 months ago

Alright, I understand. I have two stations as of right now, but the code only processes them one by one. I'll read the documentation more thoroughly as well.

AlessandroFB commented 3 months ago

Hello sir, very sorry to disturb you during this holiday period. I'm nearing the end of my internship, and the calculations for the Energy Score and Variogram Score were successful. The image below is one example of a day-ahead forecast for one station; in fact, the same was done for 4 stations simultaneously.

So the forecast matrix for each day was of shape (40, 24, 4), with 40 being the number of ensemble members, 24 the horizon, and 4 the number of stations. The observation matrix for every single day of the test year (2016) was of shape (24, 4).

My tutor had doubts concerning both scores. He was wondering whether we should have obtained one single score for every day (that is, the same Energy and Variogram score for all stations) or one Energy and Variogram score per station.

My task was to translate to Python a benchmark model by van der Meer (2021), the Multivariate Probabilistic Ensemble (MuPEn). In his work, van der Meer (2021) had one Energy score and one Variogram score per station. My tutor was wondering whether that is the correct approach, or whether all stations should share the same score since we're forecasting for all stations simultaneously.
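For what it's worth, the per-station convention corresponds to treating the 24 lead times as the multivariate dimension and scoring each station separately, while a joint score would treat all 24 x 4 values of a day as one vector. Both are computable; which one to report depends on whether cross-station dependence matters. A minimal NumPy sketch of the two options, using the plain Monte-Carlo energy-score estimator and the shapes from this thread (an illustration, not the library's implementation):

```python
import numpy as np

def energy_score(obs, fct):
    """Monte-Carlo energy score. obs: (D,), fct: (M, D)."""
    m = fct.shape[0]
    # mean distance between each member and the observation
    e1 = np.linalg.norm(fct - obs, axis=-1).mean()
    # mean pairwise distance between members
    e2 = np.linalg.norm(fct[:, None, :] - fct[None, :, :], axis=-1).sum() / (2 * m**2)
    return e1 - e2

fct = np.random.randn(40, 24, 4)  # (members, lead times, stations)
obs = np.random.randn(24, 4)      # (lead times, stations)

# one energy score per station: 24 lead times are the variables
per_station = [energy_score(obs[:, s], fct[:, :, s]) for s in range(4)]

# one joint score per day: all 24 x 4 values form a single vector
joint = energy_score(obs.ravel(), fct.reshape(40, -1))
```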

Can you help me better understand this, and/or refer me to some works that might enlighten me on it, please?

Thank you for your understanding, and sorry again for bothering you during the holiday period.

Respectfully Yours,

Alessandro Fabiani Bigaunah

plot_2016-01-01_edflapossession