Closed — Jchen1628 closed this issue 5 years ago
I've also been grappling with understanding this.
What's confusing is the part where the cross validation results are sorted by horizon. Using a rolling window on the data sorted in this way means we are calculating the mean over w predictions, but with some values of w we will be taking the mean over multiple horizons.
Would really appreciate an intuitive explanation of what's going on there :)
For example, why do we not group by horizon and calculate the metric over all values for that horizon? This would compute the mean error over all cross-validation folds, for each horizon between 1 day (assuming daily resolution) and the chosen forecasting horizon.
Snippet:

```python
from fbprophet.diagnostics import cross_validation
from sklearn.metrics import mean_absolute_error as mae

# Make a 3-month forecast every 10 days, trained on 1 year of initial data
cv_df = cross_validation(model, horizon='92 days', period='10 days', initial='365 days')
cv_df['horizon'] = cv_df.ds - cv_df.cutoff
cv_df.sort_values(by='horizon', inplace=True)

# Calculate MAE over all folds at each horizon
cv_results = cv_df.groupby('horizon').apply(lambda x: mae(x.y, x.yhat))
```
Would be happy to hear why the method implemented by prophet is preferred over that.
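To check my understanding of the current behaviour, this is roughly the calculation I think the rolling window is doing, written out manually against the cv_df from the snippet above (just a sketch with a made-up `window_frac`; the real performance_metrics code may well differ):

```python
# Sketch only -- not the actual performance_metrics implementation.
window_frac = 0.1                                # stand-in for the rolling_window fraction
cv_sorted = cv_df.sort_values('horizon')
abs_err = (cv_sorted.y - cv_sorted.yhat).abs()   # per-row absolute error
w = max(int(window_frac * len(cv_sorted)), 1)
rolling_mae = abs_err.rolling(w).mean()          # each value averages w rows, and those
                                                 # w rows can span several horizons
```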
The way this is currently done for daily data is definitely a bit awkward. I have a comment here that explains in detail the current approach:
https://github.com/facebook/prophet/issues/839#issuecomment-462524968
This approach was taken with irregular data in mind, but with regular (in particular daily) data it definitely makes sense to first aggregate over day. I plan to make that change for the next version.
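For daily data, a rough sketch of what "aggregate over day first" could look like, reusing the cv_df from the snippet above (illustration only, not the exact code that will ship):

```python
# Collapse the cross-validation rows to one error value per daily horizon,
# then optionally smooth those per-horizon values instead of rolling over raw rows.
per_day_mae = (
    cv_df.assign(abs_err=(cv_df.y - cv_df.yhat).abs())
         .groupby('horizon')['abs_err']
         .mean()          # MAE at each daily horizon, over all folds
)
smoothed = per_day_mae.rolling(9, center=True, min_periods=1).mean()  # optional smoothing
```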
The improved approach has now been pushed to PyPI and CRAN.
Hi everyone,
I have already used Prophet successfully for several forecasting analyses, because it is easy to use and its approach is well comprehensible. Now, to evaluate each of my forecasts, I am using the provided cross_validation and performance_metrics functions. I have been experimenting with the rolling_window parameter, using different values within [0, 1] (0 computes the error record by record, and 1 averages the error across all records). But despite the description, it is not clear to me how e.g. the MAPE is derived with the default of 0.1 (10%). Could someone explain the idea behind it with a plausible example for MAPE?
Many thanks in advance!
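Edit: to make the question more concrete, here is what I guess might be happening with the default rolling_window=0.1 (hypothetical code expressing my reading of the docs, not something I know to be the actual implementation):

```python
from fbprophet.diagnostics import performance_metrics

# cv_df is a cross-validation dataframe like the one earlier in this thread
cv_df['horizon'] = cv_df.ds - cv_df.cutoff

# With rolling_window=0.1 I *think* each reported MAPE averages the per-row
# absolute percentage errors over roughly 10% of the rows, sorted by horizon.
df_p = performance_metrics(cv_df, metrics=['mape'], rolling_window=0.1)
print(df_p.head())

# My manual attempt at the same calculation:
cv_sorted = cv_df.sort_values('horizon')
ape = ((cv_sorted.y - cv_sorted.yhat) / cv_sorted.y).abs()
w = int(0.1 * len(cv_sorted))
manual_mape = ape.rolling(w).mean()   # is this roughly what the 10% default means?
```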