h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Add Time Series #9531

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

+Some References:+
* https://github.com/robjhyndman/forecast
* http://stackoverflow.com/questions/22140180/java-api-for-auto-regression-ar-arima-time-series-analysis/25922381#25922381
* http://rforge.net/rJava/index.html
* http://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/

+Possible Algos to add:+
* Simple Exponential Smoothing
* Double Exponential Smoothing
* Triple Exponential Smoothing
* AR Model
* MA Model
* ARMA
* ARIMA
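For reference, the simplest of the smoothing variants listed above can be sketched in a few lines of base R. This is only an illustration of the recurrence, not a proposed H2O implementation; `ses` is a hypothetical helper name, and seeding the level with the first observation is an assumption (other initializations are common).

```r
# Simple exponential smoothing:
#   s[1] = x[1],  s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
# `ses` is a hypothetical helper name, not an H2O function.
ses <- function(x, alpha) {
  stopifnot(is.numeric(x), length(x) > 0, alpha > 0, alpha <= 1)
  s <- numeric(length(x))
  s[1] <- x[1]                      # assumed initialization: first observation
  for (t in seq_along(x)[-1]) {
    s[t] <- alpha * x[t] + (1 - alpha) * s[t - 1]
  }
  s
}

x <- c(10, 12, 11, 13, 12, 14)
smoothed <- ses(x, alpha = 0.5)

# The forecast is flat: every future step equals the last smoothed level
forecast_next <- smoothed[length(smoothed)]
```

The double and triple variants add trend and seasonal terms to the same recurrence; base R's `HoltWinters()` covers them through its `beta` and `gamma` arguments.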

+Moving Averages:+
* Simple Moving Average
* Cumulative Moving Average
* Exponential Moving Average
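The three moving averages above differ only in how they weight past observations. A minimal base-R sketch follows, assuming a trailing (not centered) window for the simple average and an exponential average seeded with the first value; the helper names `sma`, `cma` and `ema` are hypothetical.

```r
x <- c(2, 4, 6, 8, 10)

# Simple moving average over a trailing window of k points
# (NA until the window has filled)
sma <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))

# Cumulative moving average: mean of all observations seen so far
cma <- function(x) cumsum(x) / seq_along(x)

# Exponential moving average with smoothing factor alpha,
# seeded with the first observation (an assumption)
ema <- function(x, alpha) {
  Reduce(function(prev, cur) alpha * cur + (1 - alpha) * prev,
         x, accumulate = TRUE)
}

sma3 <- sma(x, 3)
cma_all <- cma(x)
ema_half <- ema(x, 0.5)
```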

+Statistical Calculations:+
* ACF
* PACF
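As a point of comparison for what these calculations would return, base R already exposes both through `acf()` and `pacf()`. The sketch below runs them on a simulated AR(1) series; the series, seed and `lag.max` value are illustrative assumptions.

```r
set.seed(1)
# Simulated AR(1) series with coefficient 0.7 (illustrative data only)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 500))

# Autocorrelations at lags 0..10 (the first entry is lag 0)
acf_vals <- as.numeric(acf(x, lag.max = 10, plot = FALSE)$acf)

# Partial autocorrelations at lags 1..10 (there is no lag-0 entry)
pacf_vals <- as.numeric(pacf(x, lag.max = 10, plot = FALSE)$acf)
```

For an AR(p) process the PACF is expected to drop off after lag p while the ACF decays gradually, which is why the two are usually inspected together when choosing ARIMA orders.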

+Transformations:+
* Box Cox
* Log
* Root
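The three transformations above are closely related: the Box-Cox family reduces to the log at lambda = 0 and to a shifted, scaled square root at lambda = 0.5. A minimal sketch for strictly positive data follows; `box_cox` and `inv_box_cox` are hypothetical helper names (the `forecast` package referenced above ships its own `BoxCox()` and `InvBoxCox()`).

```r
# Box-Cox transform for strictly positive data:
#   lambda == 0 : log(x)
#   otherwise   : (x^lambda - 1) / lambda
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

# Inverse transform, needed to map predictions back to the original scale
inv_box_cox <- function(y, lambda) {
  if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
}

x <- c(1, 4, 9)
y_root <- box_cox(x, 0.5)   # root-like transform
y_log  <- box_cox(x, 0)     # plain log
```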

exalate-issue-sync[bot] commented 1 year ago

Arno Candel commented: A nice-to-have for the best user experience is sorting (PUBDEV-2592), so that the user can sort by a date column, etc.

We can start with one equidistant (and obviously pre-sorted) time series per row in a numeric frame, and just fit ARIMA models on that, and add columns with future-time predictions.
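To make the row-per-series idea concrete, here is a rough base-R sketch of the workflow described above: fit one ARIMA model per row and append the forecasts as new columns. The toy data, the fixed ARIMA(1,1,0) order, the `horizon` value and the column names are all assumptions for illustration; nothing here is an existing H2O API.

```r
# Toy numeric frame: each row is one equidistant, pre-sorted series
set.seed(42)
n_series <- 3
n_obs <- 40
series <- t(replicate(n_series, cumsum(rnorm(n_obs))))
colnames(series) <- paste0("t", seq_len(n_obs))

horizon <- 3  # number of future time steps to predict (assumed value)

# Fit an ARIMA(1,1,0) to one row and return its horizon-step-ahead forecast
forecast_row <- function(row) {
  fit <- arima(as.numeric(row), order = c(1, 1, 0), method = "ML")
  as.numeric(predict(fit, n.ahead = horizon)$pred)
}

# One row of forecasts per input series
preds <- matrix(
  unlist(lapply(seq_len(n_series), function(i) forecast_row(series[i, ]))),
  nrow = n_series, byrow = TRUE
)
colnames(preds) <- paste0("t", n_obs + seq_len(horizon))

# Original observations followed by the future-time prediction columns
result <- as.data.frame(cbind(series, preds))
```

Since every row is fitted independently, this step parallelizes trivially across rows, which is presumably what makes it a reasonable starting point for a distributed frame.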

exalate-issue-sync[bot] commented 1 year ago

Arno Candel commented: Mark's quick demo of how to use DL for TS:
{code}
install.packages("fpp")
library(fpp)

# Electricity data also available at http://robjhyndman.com/forecasting/data/
ts <- as.data.frame(as.numeric(elec))
dim(ts)       # 476 rows
plot(ts[,1])  # generally increasing, with seasonality
ts

# data setup to use neural networks
lagpad <- function(x, k) { c(rep(NA, k), x)[1:length(x)] }
lagging <- as.data.frame(matrix(0, nrow(ts), 12))
for (i in 1:12) { lagging[,i] <- lagpad(ts[,1], i) }
tsLagged <- cbind(ts, lagging, seq_len(nrow(ts)))
colnames(tsLagged) <- c("electricityUsage", paste0("l", 1:12), "monthNum")
tsLagged[6:18,]

library(h2o)
h <- h2o.init(nthreads = -1, max_mem_size = '8G')

# load data into cluster
tsHex <- as.h2o(tsLagged, destination_frame = 'ts.hex')

# run deep learning against the time series: all but final year
dl <- h2o.deeplearning(x = c(2:14), y = 1, training_frame = tsHex[1:464,],
                       model_id = "tsDL", epochs = 1000, hidden = c(50,50))
summary(dl)

# predict final year
dlP <- h2o.predict(dl, newdata = tsHex[465:476,])

# plot
plot(ts[1:464,1], type = 'l', main = "H2O Deep Learning")
points(as.data.frame(dlP)[,1], x = 465:476, type = 'p', col = "blue")

# quickly use forecast package to show what Arima will do
library(forecast)
myts <- ts(ts[,1], start = c(1950, 1), end = c(1989, 8), frequency = 12)
fit <- stl(myts, s.window = "period")
plot(fit)
autoArima <- auto.arima(window(myts, start = c(1950, 1), end = c(1989, 8)))
pAA <- forecast(autoArima, 12)
plot(pAA)
pAA$model$series
{code}
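The key preprocessing step in the demo above is the `lagpad` helper, which turns one series into a supervised-learning table of lagged columns. A tiny worked example on made-up numbers shows what it produces; the two-lag setup and the column names here are illustrative only.

```r
# Shift x forward by k positions, padding the start with NA
lagpad <- function(x, k) c(rep(NA, k), x)[seq_along(x)]

x <- c(5, 6, 7, 8, 9)

# One column per lag: l1 holds the previous value, l2 the one before that
lags <- sapply(1:2, function(k) lagpad(x, k))
colnames(lags) <- c("l1", "l2")
lagged <- data.frame(target = x, lags)

# Rows whose lags are all observed
usable <- complete.cases(lagged)
```

The first `k` rows of each lag column are `NA` by construction, so with 12 lags the first 12 training rows carry missing predictors; dropping them with `complete.cases()` is one option, leaving them for the model's missing-value handling is another.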

exalate-issue-sync[bot] commented 1 year ago

Arno Candel commented: Same for SP500
{code}
library(data.table)
ts <- fread("~/Desktop/sp500.csv")
ts <- ts[order(nrow(ts):1),]   # reverse the row order
nrow(ts)
ts
Ttrain <- 1:16000
Ttest <- 16001:16649
head(ts[Ttest,])
ts <- ts$`Adj Close`           # backticks are required: the column name contains a space
ts <- as.data.frame(ts)
plot(ts[,1], type = 'l', col = "black")
lines(ts[Ttrain,1], type = 'l', col = "blue")
lines(ts[Ttest,1], x = Ttest, type = 'l', col = "red")

# data setup to use neural networks
lagpad <- function(x, k) { c(rep(NA, k), x)[1:length(x)] }
lagging <- as.data.frame(matrix(0, nrow(ts), 365))
for (i in 1:365) { lagging[,i] <- lagpad(ts[,1], i) }
tsLagged <- cbind(ts, lagging, seq_len(nrow(ts)))
colnames(tsLagged) <- c("target", paste0("l", 1:365), "dayNum")
tsLagged

library(h2o)
h <- h2o.init(nthreads = -1, max_mem_size = '8G')

# load data into cluster
tsHex <- as.h2o(tsLagged, destination_frame = 'ts.hex')
train <- tsHex[Ttrain,]
test <- tsHex[Ttest,]

# run deep learning against the time series: all but the held-out test rows
dl <- h2o.deeplearning(x = c(2:ncol(tsHex)), y = 1, training_frame = train,
                       model_id = "tsDL", epochs = 100, hidden = c(50,50))
summary(dl)

# predict the held-out test rows
dlP <- h2o.predict(dl, newdata = test)

# plot
plot(ts[,1], type = 'l', col = "black", main = "H2O Deep Learning")
lines(as.data.frame(dlP)[,1], x = Ttest, type = 'l', col = "blue", lwd = 5)

# quickly use forecast package to show what Arima will do
# NOTE: start/end/frequency appear to be carried over from the monthly demo;
# with these arguments ts() treats the daily prices as monthly data and keeps
# only the first 795 values.
library(forecast)
myts <- ts(ts[,1], start = c(1950, 1), end = c(2016, 3), frequency = 12)
fit <- stl(myts, s.window = "period")
plot(fit)
autoArima <- auto.arima(window(myts, start = c(1950, 1), end = c(2013, 8)))
pAA <- forecast(autoArima, 12)
plot(pAA)
pAA$model$series
{code}

exalate-issue-sync[bot] commented 1 year ago

Jan Gorecki commented: Any chance of extending the scope to adaptive moving averages?

Especially FRAMA, described by John Ehlers in [FRactal Adaptive Moving Average technical paper|http://www.mesasoftware.com/papers/FRAMA.pdf].

An R implementation of FRAMA was published by Ilya Kipnis in the [DSTrading|https://github.com/IlyaKipnis/DSTrading] package and is described in his blog post [The Continuing Search For Robust Momentum Indicators: the Fractal Adaptive Moving Average|https://quantstrattrader.wordpress.com/2014/06/22/the-continuing-search-for-robust-momentum-indicators-the-fractal-adaptive-moving-average/]. The package includes other adaptive moving averages as well.

FRAMA, and the interface to Adaptive Moving Averages in general, would be a valuable feature for nonseasonal time series data.
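To give a feel for what such a feature would compute, below is a rough base-R sketch of the FRAMA recurrence as it is commonly summarized from the linked paper: estimate a fractal dimension from the price ranges of the two halves of a window versus the whole window, map it to a smoothing factor, and apply an exponential update. Treat it as an approximation under stated assumptions: `frama` is a hypothetical helper, it uses a single price series where the paper works with separate high and low series, and the pass-through warm-up and flat-window handling are my own choices. The DSTrading package remains the reference R implementation.

```r
# Sketch of a Fractal Adaptive Moving Average on a single price series.
# `frama` is a hypothetical helper, not the DSTrading or H2O API.
frama <- function(price, n = 16) {
  stopifnot(is.numeric(price), n %% 2 == 0, length(price) >= n)
  half <- n / 2
  out <- price                         # warm-up: first n - 1 values pass through
  for (t in n:length(price)) {
    recent <- price[(t - half + 1):t]        # newer half of the window
    older  <- price[(t - n + 1):(t - half)]  # older half of the window
    whole  <- price[(t - n + 1):t]
    n1 <- (max(recent) - min(recent)) / half
    n2 <- (max(older) - min(older)) / half
    n3 <- (max(whole) - min(whole)) / n
    alpha <- 1                         # flat window: nothing to smooth (assumption)
    if (n1 + n2 > 0 && n3 > 0) {
      d <- (log(n1 + n2) - log(n3)) / log(2)           # fractal dimension estimate
      alpha <- min(max(exp(-4.6 * (d - 1)), 0.01), 1)  # clamp to [0.01, 1]
    }
    out[t] <- alpha * price[t] + (1 - alpha) * out[t - 1]
  }
  out
}

trend  <- frama(as.numeric(1:50), n = 10)     # steady trend
zigzag <- frama(rep(c(0, 1), 25), n = 10)     # pure oscillation
```

On the steady trend the estimated dimension is low, the smoothing factor is clamped at 1 and the average tracks the price; on the oscillating series the dimension is close to 2, the factor drops to about 0.01 and the average barely moves. That adaptivity is the appeal for nonseasonal data.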

h2o-ops commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-2590
Assignee: New H2O Bugs
Reporter: Arno Candel
State: Open
Fix Version: N/A
Attachments: Available (Count: 3)
Development PRs: N/A

Attachments From Jira

Attachment Name: Arima.png
Attached By: Arno Candel
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-2590/Arima.png

Attachment Name: H2ODeepLearning.png
Attached By: Arno Candel
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-2590/H2ODeepLearning.png

Attachment Name: sp500.csv
Attached By: Arno Candel
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-2590/sp500.csv