Open leschaf opened 2 years ago
Hi, the argument y_test_df is a pandas DataFrame panel with the columns unique_id, ds, y, and y_hat_naive2, so your y_test_df must include the Naive2 predictions to calculate the OWA.
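In case it helps, here is a minimal sketch of adding that column with plain pandas. It assumes the non-seasonal case, where Naive2 reduces to repeating the last observed training value; for seasonal data, Naive2 also involves a seasonal adjustment that this sketch skips. The helper name add_naive_forecast and the toy values are made up for illustration and are not part of ESRNN.

```python
import pandas as pd


def add_naive_forecast(y_train_df, y_test_df):
    """Return y_test_df with an extra y_hat_naive2 column.

    Assumption: non-seasonal data, so the benchmark forecast is simply the
    last observed training value of each series (hypothetical helper).
    """
    # Parse dates so sorting is chronological even if ds holds strings.
    train = y_train_df.assign(ds=pd.to_datetime(y_train_df["ds"]))
    last_obs = (
        train.sort_values(["unique_id", "ds"])
        .groupby("unique_id")["y"]
        .last()                      # most recent value per series
        .rename("y_hat_naive2")
        .reset_index()
    )
    # Broadcast the last value to every test row of the same series.
    return y_test_df.merge(last_obs, on="unique_id", how="left")


# Toy data (made-up values), stored newest-first like the table in this thread.
toy_train = pd.DataFrame({
    "unique_id": ["series_a"] * 3,
    "ds": ["2017-12-01", "2017-11-01", "2017-10-01"],
    "y": [30.0, 20.0, 10.0],
})
toy_test = pd.DataFrame({
    "unique_id": ["series_a"] * 2,
    "ds": ["2018-01-01", "2018-02-01"],
    "y": [32.0, 28.0],
})

toy_test_with_naive = add_naive_forecast(toy_train, toy_test)
print(toy_test_with_naive)
```

The sort is there because the training frame may be stored newest-first, in which case taking the last row without sorting would pick the oldest value instead of the most recent one.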
Thank you - that helped!
Any comment on the use of X_test_df?
Hi,
I was searching for OWA implementations to use the measure in one of my projects. I'm starting with the calculation for a single time series. The ESRNN lib provides this function:
final_owa, final_mase, final_smape = evaluate_prediction_owa(y_hat_df, y_train_df, X_test_df, y_test_df, naive2_seasonality=1)
When I look at the code (https://github.com/kdgutier/esrnn_torch/blob/master/ESRNN/utils_evaluation.py#L370-L400), X_test_df is not used at all in the function - is that correct?
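For reference, this is roughly how OWA can be computed by hand for a single series, following the M4 definition (the average of sMAPE and MASE, each taken relative to the Naive2 benchmark). It is a NumPy sketch under that assumption, not the ESRNN code; the function names and the toy numbers are my own.

```python
import numpy as np


def smape(y, y_hat):
    """Symmetric MAPE in percent, as defined for the M4 competition."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 200.0 * np.mean(np.abs(y - y_hat) / (np.abs(y) + np.abs(y_hat)))


def mase(y, y_hat, y_insample, seasonality=1):
    """MAE scaled by the in-sample MAE of the seasonal naive forecast.

    y_insample must be in chronological order (oldest first).
    """
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    y_insample = np.asarray(y_insample, dtype=float)
    scale = np.mean(np.abs(y_insample[seasonality:] - y_insample[:-seasonality]))
    return np.mean(np.abs(y - y_hat)) / scale


def owa(y, y_hat, y_naive2, y_insample, seasonality=1):
    """Overall weighted average of relative sMAPE and relative MASE."""
    rel_smape = smape(y, y_hat) / smape(y, y_naive2)
    rel_mase = (mase(y, y_hat, y_insample, seasonality)
                / mase(y, y_naive2, y_insample, seasonality))
    return 0.5 * (rel_smape + rel_mase)


# Toy numbers (made up): history, true future values, and a benchmark forecast.
history = [10.0, 20.0, 30.0]
actual = [32.0, 28.0]
naive2 = [30.0, 30.0]

# Scoring the benchmark against itself gives OWA = 1 by construction.
owa_of_benchmark = owa(actual, naive2, naive2, history)
# A perfect forecast gives OWA = 0.
owa_of_perfect = owa(actual, actual, naive2, history)
print(owa_of_benchmark, owa_of_perfect)
```

An OWA below 1 means the model beats the Naive2 benchmark on average across the two measures, which is why the benchmark forecasts are needed as an input.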
Also, I'm not sure about the input format for using the function.
Here is my input for y_train_df, which contains the historical target values:
unique_id | ds | y
-- | -- | --
00010_000030_BS824_2018-01-01 | 2017-12-01 | 1497400.0
00010_000030_BS824_2018-01-01 | 2017-11-01 | 1707420.0
00010_000030_BS824_2018-01-01 | 2017-10-01 | 1989485.0
00010_000030_BS824_2018-01-01 | 2017-09-01 | 1697800.0
00010_000030_BS824_2018-01-01 | 2017-08-01 | 1574400.0
00010_000030_BS824_2018-01-01 | 2017-07-01 | 1260556.0
00010_000030_BS824_2018-01-01 | 2017-06-01 | 1319198.0
00010_000030_BS824_2018-01-01 | 2017-05-01 | 1592793.0
00010_000030_BS824_2018-01-01 | 2017-04-01 | 1575775.0
00010_000030_BS824_2018-01-01 | 2017-03-01 | 1808200.0
00010_000030_BS824_2018-01-01 | 2017-02-01 | 1365519.0
00010_000030_BS824_2018-01-01 | 2017-01-01 | 1904000.0
00010_000030_BS824_2018-01-01 | 2016-12-01 | 1713520.0
00010_000030_BS824_2018-01-01 | 2016-11-01 | 1908281.0
00010_000030_BS824_2018-01-01 | 2016-10-01 | 1737900.0
00010_000030_BS824_2018-01-01 | 2016-09-01 | 2005440.0
00010_000030_BS824_2018-01-01 | 2016-08-01 | 1683500.0
00010_000030_BS824_2018-01-01 | 2016-07-01 | 1179682.0
00010_000030_BS824_2018-01-01 | 2016-06-01 | 1834500.0
00010_000030_BS824_2018-01-01 | 2016-05-01 | 1949500.0
00010_000030_BS824_2018-01-01 | 2016-04-01 | 1811450.0
00010_000030_BS824_2018-01-01 | 2016-03-01 | 2001200.0
00010_000030_BS824_2018-01-01 | 2016-02-01 | 1273837.0

Here is the y_hat_df, which contains my model predictions for the future values:
unique_id | ds | y_hat
-- | -- | --
00010_000030_BS824_2018-01-01 | 2018-01-01 | 1403634.0
00010_000030_BS824_2018-01-01 | 2018-02-01 | 1543464.0
00010_000030_BS824_2018-01-01 | 2018-03-01 | 1751357.0
00010_000030_BS824_2018-01-01 | 2018-04-01 | 1874214.0
00010_000030_BS824_2018-01-01 | 2018-05-01 | 1810092.0
00010_000030_BS824_2018-01-01 | 2018-06-01 | 1811571.0
00010_000030_BS824_2018-01-01 | 2018-07-01 | 1828860.0
00010_000030_BS824_2018-01-01 | 2018-08-01 | 1708163.0
00010_000030_BS824_2018-01-01 | 2018-09-01 | 1672521.0
00010_000030_BS824_2018-01-01 | 2018-10-01 | 1809456.0
00010_000030_BS824_2018-01-01 | 2018-11-01 | 1870753.0
00010_000030_BS824_2018-01-01 | 2018-12-01 | 1596886.0
00010_000030_BS824_2018-01-01 | 2019-01-01 | 1253630.0
00010_000030_BS824_2018-01-01 | 2019-02-01 | 1618861.0
00010_000030_BS824_2018-01-01 | 2019-03-01 | 1466855.0
00010_000030_BS824_2018-01-01 | 2019-04-01 | 1677125.0
00010_000030_BS824_2018-01-01 | 2019-05-01 | 1887335.0
00010_000030_BS824_2018-01-01 | 2019-06-01 | 1576052.0

And finally, here is my y_test_df, which contains the true future values with the same dates as in y_hat_df:
Upon calling evaluate_prediction_owa, I get the following error on this line:
y_hat_id = y_hat_panel[top_row:bottom_row].y_hat.to_numpy()
Any idea why that happens? What am I missing?
AttributeError Traceback (most recent call last)
~/projects/semco/semicon-forecast/src/a4_benchmark.py in
----> 1 evaluate_prediction_owa(y_hat_df, y_train_df,
2 None, y_test_df,
3 naive2_seasonality=12)
4
~/miniconda3/envs/semicon/lib/python3.8/site-packages/ESRNN/utils_evaluation.py in evaluate_prediction_owa(y_hat_df, y_train_df, X_test_df, y_test_df, naive2_seasonality)
390 y_insample = y_train_df.filter(['unique_id', 'ds', 'y'])
391
--> 392 model_owa, model_mase, model_smape = owa(y_panel, y_hat_panel,
393 y_naive2_panel, y_insample,
394 seasonality=naive2_seasonality)
~/miniconda3/envs/semicon/lib/python3.8/site-packages/ESRNN/utils_evaluation.py in owa(y_panel, y_hat_panel, y_naive2_panel, y_insample, seasonality)
350 total_mase = evaluate_panel(y_panel, y_hat_panel, mase,
351 y_insample, seasonality)
--> 352 total_mase_naive2 = evaluate_panel(y_panel, y_naive2_panel, mase,
353 y_insample, seasonality)
354 total_smape = evaluate_panel(y_panel, y_hat_panel, smape)
~/miniconda3/envs/semicon/lib/python3.8/site-packages/ESRNN/utils_evaluation.py in evaluate_panel(y_panel, y_hat_panel, metric, y_insample, seasonality)
316 top_row = np.asscalar(y_hat_panel['unique_id'].searchsorted(u_id, 'left'))
317 bottom_row = np.asscalar(y_hat_panel['unique_id'].searchsorted(u_id, 'right'))
--> 318 y_hat_id = y_hat_panel[top_row:bottom_row].y_hat.to_numpy()
319 assert len(y_id)==len(y_hat_id)
320
~/miniconda3/envs/semicon/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
5463 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5464 return self[name]
-> 5465 return object.__getattribute__(self, name)
5466
5467 def __setattr__(self, name: str, value) -> None:
AttributeError: 'DataFrame' object has no attribute 'y_hat'
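Note that the traceback fails in the second evaluate_panel call, the one that receives y_naive2_panel, not in the call that scores the model predictions. That is consistent with the reply in this thread: y_test_df needs a y_hat_naive2 column. A small pre-flight check along these lines would surface the problem before the library is called. The helper name missing_columns and the toy frame are hypothetical, and the required column list is taken from that reply rather than from the library's documentation.

```python
import pandas as pd

# Columns y_test_df is expected to carry, per the reply in this thread.
REQUIRED_TEST_COLUMNS = ["unique_id", "ds", "y", "y_hat_naive2"]


def missing_columns(df, required):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Toy test frame (made-up values) without the benchmark column.
toy_y_test = pd.DataFrame({
    "unique_id": ["series_a", "series_a"],
    "ds": ["2018-01-01", "2018-02-01"],
    "y": [32.0, 28.0],
})

missing = missing_columns(toy_y_test, REQUIRED_TEST_COLUMNS)
print(missing)  # ['y_hat_naive2']
```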