Closed — Jchen1628 closed this issue 5 years ago
I've also been grappling with understanding this.
What's confusing is the part where the cross validation results are sorted by horizon. Using a rolling window on the data sorted in this way means we are calculating the mean over w predictions, but with some values of w we will be taking the mean over multiple horizons.
Would really appreciate an intuitive explanation of what's going on there :)
For example, why do we not group by horizon and calculate the metric over all values for that horizon? This would compute the mean error over all cross-validation folds, for each horizon between 1 day (assuming daily resolution) and the chosen forecasting horizon.
Snippet:

```python
from fbprophet.diagnostics import cross_validation
from sklearn.metrics import mean_absolute_error as mae

# Make a 3-month forecast every 10 days, trained on 1 year of initial data
cv_df = cross_validation(model, horizon='92 days', period='10 days', initial='365 days')
cv_df['horizon'] = cv_df.ds - cv_df.cutoff
cv_df.sort_values(by='horizon', inplace=True)

# Calculate MAE over all folds at each horizon
cv_results = cv_df.groupby('horizon').apply(lambda x: mae(x.y, x.yhat))
```
Would be happy to hear why the method implemented by prophet is preferred over that.
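To check my understanding of the current behaviour, this is roughly the calculation I think the rolling window is doing, written out manually against the cv_df from the snippet above (just a sketch with a made-up `window_frac`; the real performance_metrics code may well differ):

```python
# Sketch only -- not the actual performance_metrics implementation.
window_frac = 0.1                                # stand-in for the rolling_window fraction
cv_sorted = cv_df.sort_values('horizon')
abs_err = (cv_sorted.y - cv_sorted.yhat).abs()   # per-row absolute error
w = max(int(window_frac * len(cv_sorted)), 1)
rolling_mae = abs_err.rolling(w).mean()          # each value averages w rows, and those
                                                 # w rows can span several horizons
```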
The way this is currently done for daily data is definitely a bit awkward. I have a comment here that explains in detail the current approach:
https://github.com/facebook/prophet/issues/839#issuecomment-462524968
This approach was taken with irregular data in mind, but with regular (in particular daily) data it definitely makes sense to first aggregate over day. I plan to make that change for the next version.
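For daily data, a rough sketch of what "aggregate over day first" could look like, reusing the cv_df from the snippet above (illustration only, not the exact code that will ship):

```python
# Collapse the cross-validation rows to one error value per daily horizon,
# then optionally smooth those per-horizon values instead of rolling over raw rows.
per_day_mae = (
    cv_df.assign(abs_err=(cv_df.y - cv_df.yhat).abs())
         .groupby('horizon')['abs_err']
         .mean()          # MAE at each daily horizon, over all folds
)
smoothed = per_day_mae.rolling(9, center=True, min_periods=1).mean()  # optional smoothing
```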
The improved approach has now been pushed to PyPI and CRAN.
Hi everyone,
I have already used Prophet successfully for several forecasting analyses, because it is easy to use and its approach is well comprehensible. Now, to evaluate each of my forecasts, I am using the provided cross_validation and performance_metrics functions. I have been experimenting with the rolling_window parameter, using different values within [0, 1] (0 computes the error record by record, and 1 averages the error across all records). But despite the description, it is not clear to me how e.g. the MAPE is derived with the default of 0.1 (10%). Could someone explain the idea behind it with a plausible example for MAPE?
Many thanks in advance!
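Edit: to make the question more concrete, here is what I guess might be happening with the default rolling_window=0.1 (hypothetical code expressing my reading of the docs, not something I know to be the actual implementation):

```python
from fbprophet.diagnostics import performance_metrics

# cv_df is a cross-validation dataframe like the one earlier in this thread
cv_df['horizon'] = cv_df.ds - cv_df.cutoff

# With rolling_window=0.1 I *think* each reported MAPE averages the per-row
# absolute percentage errors over roughly 10% of the rows, sorted by horizon.
df_p = performance_metrics(cv_df, metrics=['mape'], rolling_window=0.1)
print(df_p.head())

# My manual attempt at the same calculation:
cv_sorted = cv_df.sort_values('horizon')
ape = ((cv_sorted.y - cv_sorted.yhat) / cv_sorted.y).abs()
w = int(0.1 * len(cv_sorted))
manual_mape = ape.rolling(w).mean()   # is this roughly what the 10% default means?
```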