Closed antoinecarme closed 5 years ago
Nice document by Mathworks !
https://www.mathworks.com/help/econ/rolling-window-estimation-of-state-space-models.html
Forecast evaluation with Stata
Package 'forecastHybrid'
Nice R package that handles cross-validation for time series
https://cran.r-project.org/web/packages/forecastHybrid/forecastHybrid.pdf
Cross validation of time series data is more complicated than regular k-fold or leave-one-out cross validation of datasets without serial correlation, since observations x_t and x_{t+n} are not independent.
The cvts() function overcomes this obstacle using two methods:
1) rolling cross validation, where an initial training window is used along with a forecast horizon, and the training window grows by one observation each round until the training window and the forecast horizon capture the entire series, or
2) a non-rolling approach, where a fixed training length is used that is shifted forward by the forecast horizon after each iteration.
For the rolling approach, training points are heavily recycled, both for fitting the models and for generating forecast errors at each of the forecast horizons from 1:maxHorizon.
In contrast, the models fit with the non-rolling approach share less overlap, and the predicted forecast values are also only compared to the actual values once. The former approach is similar to leave-one-out cross validation while the latter resembles k-fold cross validation.
As a result, rolling cross validation requires far more iterations and takes longer to compute, but a disadvantage of the non-rolling approach is the greater variance and general instability of its cross-validated errors.
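To make the difference between the two schemes concrete, here is a minimal Python sketch that only generates the train/test index windows. It is not the cvts() implementation: the function names `rolling_splits` and `non_rolling_splits` and their parameters are hypothetical, and the exact window arithmetic is an assumption based on the description above.

```python
def rolling_splits(n_obs, initial_window, horizon):
    """Expanding window: the training set grows by one observation per round.

    Hypothetical helper for illustration, not part of forecastHybrid.
    """
    splits = []
    train_end = initial_window
    while train_end + horizon <= n_obs:
        train = list(range(0, train_end))
        test = list(range(train_end, train_end + horizon))
        splits.append((train, test))
        train_end += 1  # grow the training window by one observation
    return splits


def non_rolling_splits(n_obs, window_size, horizon):
    """Fixed-length training window, shifted forward by the horizon each round.

    Hypothetical helper for illustration, not part of forecastHybrid.
    """
    splits = []
    start = 0
    while start + window_size + horizon <= n_obs:
        train = list(range(start, start + window_size))
        test = list(range(start + window_size, start + window_size + horizon))
        splits.append((train, test))
        start += horizon  # each observation is forecast at most once
    return splits


if __name__ == "__main__":
    for name, fn in (("rolling", rolling_splits), ("non-rolling", non_rolling_splits)):
        print(name)
        for train, test in fn(10, 4, 2):
            print("  train:", train, "test:", test)
```

On a series of 10 observations with a window of 4 and a horizon of 2, the rolling scheme yields more splits with heavily overlapping test points, while the non-rolling scheme yields fewer splits whose test windows do not overlap.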
scikit-learn has a time series split cross-validator (TimeSeriesSplit), described in the scikit-learn user guide:
http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation
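For reference, a short sketch of how the scikit-learn splitter behaves on a toy series. The six-point array is made up for illustration; `TimeSeriesSplit` and its `n_splits` parameter come from `sklearn.model_selection`.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Six consecutive observations (toy data, for illustration only).
X = np.arange(6).reshape(-1, 1)

# Each split trains on an expanding prefix and tests on the points that follow.
tscv = TimeSeriesSplit(n_splits=3)
splits = [(train.tolist(), test.tolist()) for train, test in tscv.split(X)]

for train, test in splits:
    print("train:", train, "test:", test)
```

The training indices always precede the test indices, so no future observation leaks into the fit.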
Hey, Are you working on a solution for the cross-validation ? The facebook Prophet package has something implemented: https://github.com/facebook/prophet/blob/master/notebooks/diagnostics.ipynb
Hey @BenjaminLarrousse
Yet another implementation for this feature. Thanks for the feedback. I will look at it closer.
PyAF is designed to be a standalone product and cannot reuse existing time series forecasting software (PyAF existed well before facebook/prophet was made public).
Are you interested in implementing it ?
Cheers,
Antoine
Yes sure, I was thinking about a specific implementation into your package. But their code can help do that. I don't have much time right now to implement it but if I manage to find some free time, why not !
Cheers
Finished. See #105 for the selected implementation.
PyAF uses a simple separation of the total dataset into estimation/training and test/hold-out datasets (80% and 20% respectively by default, customizable).
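As a rough illustration of that kind of split, here is a minimal sketch of a chronological hold-out. The helper `holdout_split` and its `train_ratio` parameter are hypothetical names chosen for this example and are not PyAF's internal API.

```python
def holdout_split(series, train_ratio=0.8):
    """Split a series chronologically: first part for estimation, the rest held out.

    Hypothetical helper for illustration; no shuffling, so time order is preserved.
    """
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


if __name__ == "__main__":
    train, test = holdout_split(list(range(10)))
    print("train:", train)
    print("test:", test)
```

The model is fitted once on the first part and scored once on the last part, which is cheap but gives a single error estimate.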
Try to evaluate the impact of using cross-validation: gain in model quality/stability/accuracy versus practical aspects (CPU time and memory usage).
Use the "rolling forecasting origin" method described here :
https://www.otexts.org/fpp/2/5
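A minimal sketch of what a rolling-forecasting-origin evaluation could look like, assuming a plain Python list as the series and a naive "repeat the last value" forecaster as a stand-in for a real model. The names `naive_forecast` and `rolling_origin_mae` are hypothetical, and this is not the implementation selected in #105.

```python
def naive_forecast(train, horizon):
    """Repeat the last observed value; a placeholder for any real model."""
    return [train[-1]] * horizon


def rolling_origin_mae(series, initial_window, horizon=1, forecaster=naive_forecast):
    """Mean absolute error averaged over every forecast origin.

    Hypothetical helper: the model is refitted at each origin on all data up to
    that origin, then scored on the next `horizon` observations.
    """
    abs_errors = []
    for origin in range(initial_window, len(series) - horizon + 1):
        train = series[:origin]
        actual = series[origin:origin + horizon]
        predicted = forecaster(train, horizon)
        abs_errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    if not abs_errors:
        raise ValueError("series is too short for this window and horizon")
    return sum(abs_errors) / len(abs_errors)


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]
    print(rolling_origin_mae(series, initial_window=3, horizon=1))
    print(rolling_origin_mae(series, initial_window=3, horizon=2))
```

The cost grows with the number of origins, since the forecaster is called once per origin, which is the CPU-time trade-off mentioned above.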