h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Time Series forecasting: deep learning wrapper with lagging variables #14794

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

As a first step toward time series forecasting, add an R wrapper around deep learning that produces forecasts.

A common use of neural networks in forecasting is to take the prior-period values as columns (lagged variables) and train a deep learning model to detect the seasonality automatically. R's forecast package does this, and the wrapper would be conceptually the same: minimal preprocessing, then an out-of-the-box neural network model.
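The lagging step described above can be sketched in a few lines. This is only an illustration in plain Python, not the proposed R wrapper or any H2O API; the function name `make_lagged_rows` and the parameter `max_lag` are hypothetical.

```python
def make_lagged_rows(series, max_lag):
    """Turn a univariate series into (features, target) rows.

    The row for position t uses the max_lag values that precede
    series[t], oldest first, as features and series[t] as the target.
    The first max_lag observations have no complete history, so they
    produce no rows.
    """
    if max_lag < 1:
        raise ValueError("max_lag must be at least 1")
    rows = []
    for t in range(max_lag, len(series)):
        features = list(series[t - max_lag:t])
        rows.append((features, series[t]))
    return rows


# Illustrative values only (hypothetical data).
monthly = [10, 12, 15, 11, 13, 16, 12, 14]
rows = make_lagged_rows(monthly, max_lag=3)
for features, target in rows:
    print(features, "->", target)
```

Each row could then be handed to any supervised learner, with the lag columns as predictors and the current value as the response.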

The R forecast method runs multiple neural network models to smooth the output. Empirically, it is still almost always less stable than traditional time series models.
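A rough sketch of the smoothing idea follows, assuming it amounts to fitting the same model several times from different random starting weights and averaging the forecasts. To stay self-contained, a small linear model trained by stochastic gradient descent stands in for the neural network; all names here (`fit_linear_sgd`, `ensemble_forecast`) and the data are hypothetical.

```python
import random


def fit_linear_sgd(rows, seed, epochs=200, lr=0.05):
    """Fit y = bias + w . x by SGD, starting from seed-dependent random weights.

    A linear stand-in for a neural network, used only to show the averaging.
    """
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_features)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for features, target in rows:
            pred = bias + sum(w * x for w, x in zip(weights, features))
            err = pred - target
            weights = [w - lr * err * x for w, x in zip(weights, features)]
            bias -= lr * err
    return weights, bias


def predict(model, features):
    weights, bias = model
    return bias + sum(w * x for w, x in zip(weights, features))


def ensemble_forecast(rows, features, seeds):
    """Average the forecasts of one model per seed; also return each forecast."""
    preds = [predict(fit_linear_sgd(rows, seed), features) for seed in seeds]
    return sum(preds) / len(preds), preds


# Hypothetical series, already scaled to [0, 1], with two lagged features per row.
series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
rows = [([series[t - 2], series[t - 1]], series[t]) for t in range(2, len(series))]
mean_pred, preds = ensemble_forecast(rows, [0.8, 0.9], seeds=[1, 2, 3])
print(mean_pred, preds)
```

Averaging reduces the run-to-run variation caused by random initialization, but it does not address instability that comes from the model form itself.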

A full-featured wrapper would include a parameter for the maximum number of lagged terms (the periodicity). Common values would be 12 (monthly data), 52 (weekly data), and 365 (daily data), each covering one year.

Eventually having this available at the Java level would be ideal so Flow can utilize it as well.

Expected difficulty is in handling edge cases related to low counts, gaps in data, and irregular time series.
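One possible way to handle gaps, sketched under the assumption of daily data: place the observations on a regular daily grid, mark missing days explicitly, and build lagged rows only from windows with no missing value. The function names are hypothetical, and dropping incomplete windows is just one policy; imputing the gaps would be another.

```python
from datetime import date, timedelta


def regularize_daily(observations):
    """Place {date: value} observations on a daily grid, using None for gaps."""
    if not observations:
        return [], []
    start, end = min(observations), max(observations)
    dates, values = [], []
    current = start
    while current <= end:
        dates.append(current)
        values.append(observations.get(current))  # None when the day is missing
        current += timedelta(days=1)
    return dates, values


def complete_lagged_rows(values, max_lag):
    """Build (features, target) rows, skipping any window that contains a gap."""
    rows = []
    for t in range(max_lag, len(values)):
        window = values[t - max_lag:t + 1]
        if None not in window:
            rows.append((window[:-1], window[-1]))
    return rows


# Hypothetical observations with 3 January missing.
observations = {
    date(2024, 1, 1): 5.0,
    date(2024, 1, 2): 6.0,
    date(2024, 1, 4): 8.0,
    date(2024, 1, 5): 9.0,
    date(2024, 1, 6): 10.0,
}
dates, values = regularize_daily(observations)
rows = complete_lagged_rows(values, max_lag=2)
print(values)
print(rows)
```

With long lags, a single gap invalidates many windows, which is one reason low counts and irregular series are hard for this approach.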

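Conceptually, the train/validation split should be different for time series modeling: validation data should come after the training data in time rather than being sampled at random. However, the baseline (R:forecast) does no such backtesting.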
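A minimal sketch of a time-ordered backtest (often called rolling-origin evaluation) is shown below. It is one common scheme, not something the issue prescribes, and the function and parameter names are hypothetical.

```python
def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Return (train_indices, validation_indices) pairs in time order.

    Every validation window starts right after its training window ends,
    so the model is never validated on data older than what it was trained on.
    """
    if initial_train < 1 or horizon < 1 or step < 1:
        raise ValueError("initial_train, horizon and step must be positive")
    splits = []
    train_end = initial_train
    while train_end + horizon <= n_obs:
        train_idx = list(range(0, train_end))
        valid_idx = list(range(train_end, train_end + horizon))
        splits.append((train_idx, valid_idx))
        train_end += step
    return splits


splits = rolling_origin_splits(n_obs=10, initial_train=6, horizon=2, step=2)
for train_idx, valid_idx in splits:
    print("train", train_idx, "validate", valid_idx)
```

Here the training window grows with each split; a fixed-length sliding window is a common alternative when older data is less relevant.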

DinukaH2O commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-1836
Assignee: Mark Landry
Reporter: Mark Landry
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A