business-science / modeltime.h2o

Forecasting with H2O AutoML. Use the H2O Automatic Machine Learning algorithm as a backend for Modeltime Time Series Forecasting.
https://business-science.github.io/modeltime.h2o/

Error in refit_tbl #20

Open ichsan2895 opened 3 years ago

ichsan2895 commented 3 years ago

I followed this tutorial:

https://www.business-science.io/code-tools/2021/03/15/introducing-modeltime-h2o.html

The following step throws an error:

refit_tbl %>%
  modeltime_forecast(
    new_data    = future_prepared_tbl,
    actual_data = data_prepared_tbl,
    keep_data   = TRUE
  )

Error: Problem with `filter()` input `..1`.
x object '.key' not found
i Input `..1` is `.model_desc == "ACTUAL" | .key == "prediction"`.

My software versions:

Windows 10 Education 64 bit
R = 3.6.3 (64 bit)
H2O = 3.32.0.1
modeltime = 0.5.1
modeltime.h2o = 0.1.1
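
For comparing environments, the versions listed above can be collected in one step. This is a minimal sketch using only base R; the package names are the ones from this report, and any package that is not installed is reported as NA rather than raising an error.

```r
# Packages named in the report above
pkgs <- c("h2o", "modeltime", "modeltime.h2o")

# Names of all packages installed in the current library paths
installed <- rownames(installed.packages())

# Look up each version; NA if the package is not installed
versions <- vapply(
  pkgs,
  function(p) {
    if (p %in% installed) as.character(packageVersion(p)) else NA_character_
  },
  character(1)
)

cat(R.version.string, "\n")
print(versions)
```

Pasting this output next to the error message makes it easier to see whether two machines are running the same releases.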

mdancho84 commented 3 years ago

I just ran the tutorial. I've increased the number of models slightly to improve results. It seems to run OK for me.

library(tidymodels)
library(modeltime.h2o)
library(tidyverse)
library(timetk)

data_tbl <- walmart_sales_weekly %>%
  select(id, Date, Weekly_Sales)

splits <- time_series_split(data_tbl, assess = "3 month", cumulative = TRUE)

recipe_spec <- recipe(Weekly_Sales ~ ., data = training(splits)) %>%
  step_timeseries_signature(Date) 

train_tbl <- training(splits) %>% bake(prep(recipe_spec), .)
test_tbl  <- testing(splits) %>% bake(prep(recipe_spec), .)

h2o.init(
  nthreads = -1,
  ip       = 'localhost',
  port     = 54321
)
#>  Connection successful!
#> 
#> R is connected to the H2O cluster: 
#>     H2O cluster uptime:         3 minutes 24 seconds 
#>     H2O cluster timezone:       America/New_York 
#>     H2O data parsing timezone:  UTC 
#>     H2O cluster version:        3.32.0.1 
#>     H2O cluster version age:    6 months and 18 days !!! 
#>     H2O cluster name:           H2O_started_from_R_mdancho_rvk435 
#>     H2O cluster total nodes:    1 
#>     H2O cluster total memory:   7.96 GB 
#>     H2O cluster total cores:    12 
#>     H2O cluster allowed cores:  12 
#>     H2O cluster healthy:        TRUE 
#>     H2O Connection ip:          localhost 
#>     H2O Connection port:        54321 
#>     H2O Connection proxy:       NA 
#>     H2O Internal Security:      FALSE 
#>     H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
#>     R Version:                  R version 4.0.2 (2020-06-22)
#> Warning in h2o.clusterInfo(): 
#> Your H2O cluster version is too old (6 months and 18 days)!
#> Please download and install the latest version from http://h2o.ai/download/

# Optional - Turn off progress indicators during training runs
h2o.no_progress()

model_spec <- automl_reg(mode = 'regression') %>%
  set_engine(
    engine                     = 'h2o',
    max_runtime_secs           = 15, 
    max_runtime_secs_per_model = 15,
    max_models                 = 10,
    nfolds                     = 5,
    exclude_algos              = c("DeepLearning"),
    verbosity                  = NULL,
    seed                       = 786
  ) 

model_fitted <- model_spec %>%
  fit(Weekly_Sales ~ ., data = train_tbl)

modeltime_tbl <- modeltime_table(
  model_fitted
) 

modeltime_tbl
#> # Modeltime Table
#> # A tibble: 1 x 3
#>   .model_id .model   .model_desc                 
#>       <int> <list>   <chr>                       
#> 1         1 <fit[+]> H2O AUTOML - STACKEDENSEMBLE

modeltime_tbl %>%
  modeltime_calibrate(test_tbl) %>%
  modeltime_forecast(
    new_data    = test_tbl,
    actual_data = data_tbl,
    keep_data   = TRUE
  ) %>%
  group_by(id) %>%
  plot_modeltime_forecast(
    .facet_ncol = 2, 
    .interactive = FALSE
  )


data_prepared_tbl <- bind_rows(train_tbl, test_tbl)

future_tbl <- data_prepared_tbl %>%
  group_by(id) %>%
  future_frame(.length_out = "1 year") %>%
  ungroup()
#> .date_var is missing. Using: Date

future_prepared_tbl <- bake(prep(recipe_spec), future_tbl)

refit_tbl <- modeltime_tbl %>%
  modeltime_refit(data_prepared_tbl)

refit_tbl %>%
  modeltime_forecast(
    new_data    = future_prepared_tbl,
    actual_data = data_prepared_tbl,
    keep_data   = TRUE
  ) %>%
  group_by(id) %>%
  plot_modeltime_forecast(
    .facet_ncol  = 2,
    .interactive = FALSE
  )
#> Converting to H2OFrame...
#> Warning: Expecting the following names to be in the data frame: .conf_hi, .conf_lo. 
#> Proceeding with '.conf_interval_show = FALSE' to visualize the forecast without confidence intervals.
#> Alternatively, try using `modeltime_calibrate()` before forecasting to add confidence intervals.

Created on 2021-04-27 by the reprex package (v1.0.0)

Session Info

``` r
> devtools::session_info()
─ Session info ──────────────────────────────────────────────
 setting  value
 version  R version 4.0.2 (2020-06-22)
 os       OS X 11.2.3
 system   x86_64, darwin17.0
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2021-04-27

─ Packages ──────────────────────────────────────────────────
 package       * version    date       lib source
 assertthat      0.2.1      2019-03-21 [1] CRAN (R 4.0.2)
 backports       1.2.1      2020-12-09 [1] CRAN (R 4.0.2)
 bit             4.0.4      2020-08-04 [1] CRAN (R 4.0.2)
 bit64           4.0.5      2020-08-30 [1] CRAN (R 4.0.2)
 bitops          1.0-6      2013-08-17 [1] CRAN (R 4.0.2)
 broom         * 0.7.5      2021-02-19 [1] CRAN (R 4.0.2)
 bslib           0.2.4      2021-01-25 [1] CRAN (R 4.0.2)
 cachem          1.0.4      2021-02-13 [1] CRAN (R 4.0.2)
 callr           3.5.1      2020-10-13 [1] CRAN (R 4.0.2)
 cellranger      1.1.0      2016-07-27 [1] CRAN (R 4.0.2)
 class           7.3-18     2021-01-24 [1] CRAN (R 4.0.2)
 cli             2.3.1      2021-02-23 [1] CRAN (R 4.0.2)
 clipr           0.7.1      2020-10-08 [1] CRAN (R 4.0.2)
 codetools       0.2-18     2020-11-04 [1] CRAN (R 4.0.2)
 colorspace      2.0-0      2020-11-11 [1] CRAN (R 4.0.2)
 crayon          1.4.1      2021-02-08 [1] CRAN (R 4.0.2)
 curl            4.3        2019-12-02 [1] CRAN (R 4.0.1)
 data.table      1.14.0     2021-02-21 [1] CRAN (R 4.0.2)
 DBI             1.1.1      2021-01-15 [1] CRAN (R 4.0.2)
 dbplyr          2.1.0      2021-02-03 [1] CRAN (R 4.0.2)
 desc            1.3.0      2021-03-05 [1] CRAN (R 4.0.2)
 devtools        2.3.2      2020-09-18 [1] CRAN (R 4.0.2)
 dials         * 0.0.9.9000 2020-10-13 [1] Github (tidymodels/dials@2b79300)
 DiceDesign      1.9        2021-02-13 [1] CRAN (R 4.0.2)
 digest          0.6.27     2020-10-24 [1] CRAN (R 4.0.2)
 dplyr         * 1.0.5      2021-03-05 [1] CRAN (R 4.0.2)
 ellipsis        0.3.1      2020-05-15 [1] CRAN (R 4.0.2)
 evaluate        0.14       2019-05-28 [1] CRAN (R 4.0.1)
 fansi           0.4.2      2021-01-15 [1] CRAN (R 4.0.2)
 farver          2.1.0      2021-02-28 [1] CRAN (R 4.0.2)
 fastmap         1.1.0      2021-01-25 [1] CRAN (R 4.0.2)
 forcats       * 0.5.1      2021-01-27 [1] CRAN (R 4.0.2)
 foreach         1.5.1      2020-10-15 [1] CRAN (R 4.0.2)
 fs              1.5.0      2020-07-31 [1] CRAN (R 4.0.2)
 furrr           0.2.2      2021-01-29 [1] CRAN (R 4.0.2)
 future          1.21.0     2020-12-10 [1] CRAN (R 4.0.2)
 generics        0.1.0      2020-10-31 [1] CRAN (R 4.0.2)
 ggplot2       * 3.3.3      2020-12-30 [1] CRAN (R 4.0.2)
 glmnet        * 4.1-1      2021-02-21 [1] CRAN (R 4.0.2)
 globals         0.14.0     2020-11-22 [1] CRAN (R 4.0.2)
 glue            1.4.2      2020-08-27 [1] CRAN (R 4.0.2)
 gower           0.2.2      2020-06-23 [1] CRAN (R 4.0.2)
 GPfit           1.0-8      2019-02-08 [1] CRAN (R 4.0.2)
 gtable          0.3.0      2019-03-25 [1] CRAN (R 4.0.2)
 h2o           * 3.32.0.1   2020-10-17 [1] CRAN (R 4.0.2)
 hardhat         0.1.5      2020-11-09 [1] CRAN (R 4.0.2)
 haven           2.3.1      2020-06-01 [1] CRAN (R 4.0.2)
 highr           0.8        2019-03-20 [1] CRAN (R 4.0.2)
 hms             1.0.0      2021-01-13 [1] CRAN (R 4.0.2)
 htmltools       0.5.1.1    2021-01-22 [1] CRAN (R 4.0.2)
 httr            1.4.2      2020-07-20 [1] CRAN (R 4.0.2)
 igraph          1.2.6      2020-10-06 [1] CRAN (R 4.0.2)
 infer         * 0.5.4      2021-01-13 [1] CRAN (R 4.0.2)
 ipred           0.9-11     2021-03-12 [1] CRAN (R 4.0.2)
 iterators       1.0.13     2020-10-15 [1] CRAN (R 4.0.2)
 job             0.1        2021-04-27 [1] Github (lindeloev/job@f687bf9)
 jquerylib       0.1.3      2020-12-17 [1] CRAN (R 4.0.2)
 jsonlite        1.7.2      2020-12-09 [1] CRAN (R 4.0.2)
 kknn          * 1.3.1      2016-03-26 [1] CRAN (R 4.0.2)
 knitr           1.31       2021-01-27 [1] CRAN (R 4.0.2)
 labeling        0.4.2      2020-10-20 [1] CRAN (R 4.0.2)
 lattice         0.20-41    2020-04-02 [1] CRAN (R 4.0.2)
 lava            1.6.9      2021-03-11 [1] CRAN (R 4.0.2)
 lhs             1.1.1      2020-10-05 [1] CRAN (R 4.0.2)
 lifecycle       1.0.0      2021-02-15 [1] CRAN (R 4.0.2)
 listenv         0.8.0      2019-12-05 [1] CRAN (R 4.0.2)
 lubridate     * 1.7.10     2021-02-26 [1] CRAN (R 4.0.2)
 magrittr        2.0.1      2020-11-17 [1] CRAN (R 4.0.2)
 MASS            7.3-53.1   2021-02-12 [1] CRAN (R 4.0.2)
 Matrix        * 1.3-2      2021-01-06 [1] CRAN (R 4.0.2)
 memoise         2.0.0      2021-01-26 [1] CRAN (R 4.0.2)
 modeldata     * 0.1.0      2020-10-22 [1] CRAN (R 4.0.2)
 modelr          0.1.8      2020-05-19 [1] CRAN (R 4.0.2)
 modeltime     * 0.5.1.9000 2021-04-15 [1] local
 modeltime.h2o * 0.1.1.9000 2021-04-05 [1] local
 munsell         0.5.0      2018-06-12 [1] CRAN (R 4.0.2)
 nnet            7.3-15     2021-01-24 [1] CRAN (R 4.0.2)
 parallelly      1.24.0     2021-03-14 [1] CRAN (R 4.0.2)
 parsnip       * 0.1.5      2021-01-19 [1] CRAN (R 4.0.2)
 pillar          1.5.1      2021-03-05 [1] CRAN (R 4.0.2)
 pkgbuild        1.2.0      2020-12-15 [1] CRAN (R 4.0.2)
 pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 4.0.2)
 pkgload         1.2.0      2021-02-23 [1] CRAN (R 4.0.2)
 plyr            1.8.6      2020-03-03 [1] CRAN (R 4.0.2)
 prettyunits     1.1.1      2020-01-24 [1] CRAN (R 4.0.2)
 pROC            1.17.0.1   2021-01-13 [1] CRAN (R 4.0.2)
 processx        3.4.5      2020-11-30 [1] CRAN (R 4.0.2)
 prodlim         2019.11.13 2019-11-17 [1] CRAN (R 4.0.2)
 progressr       0.7.0      2020-12-11 [1] CRAN (R 4.0.2)
 ps              1.6.0      2021-02-28 [1] CRAN (R 4.0.2)
 purrr         * 0.3.4      2020-04-17 [1] CRAN (R 4.0.2)
 R6              2.5.0      2020-10-28 [1] CRAN (R 4.0.2)
 Rcpp            1.0.6      2021-01-15 [1] CRAN (R 4.0.2)
 RcppParallel    5.0.3      2021-02-24 [1] CRAN (R 4.0.2)
 RCurl           1.98-1.2   2020-04-18 [1] CRAN (R 4.0.2)
 readr         * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)
 readxl          1.3.1      2019-03-13 [1] CRAN (R 4.0.2)
 recipes       * 0.1.15     2020-11-11 [1] CRAN (R 4.0.2)
 remotes         2.2.0      2020-07-21 [1] CRAN (R 4.0.2)
 reprex          1.0.0      2021-01-27 [1] CRAN (R 4.0.2)
 rlang         * 0.4.10     2020-12-30 [1] CRAN (R 4.0.2)
 rmarkdown       2.7        2021-02-19 [1] CRAN (R 4.0.2)
 rpart         * 4.1-15     2019-04-12 [1] CRAN (R 4.0.2)
 rprojroot       2.0.2      2020-11-15 [1] CRAN (R 4.0.2)
 rsample       * 0.0.9      2021-02-17 [1] CRAN (R 4.0.2)
 rstudioapi      0.13       2020-11-12 [1] CRAN (R 4.0.2)
 rvest           1.0.0      2021-03-09 [1] CRAN (R 4.0.2)
 sass            0.3.1      2021-01-24 [1] CRAN (R 4.0.2)
 scales        * 1.1.1      2020-05-11 [1] CRAN (R 4.0.2)
 sessioninfo     1.1.1      2018-11-05 [1] CRAN (R 4.0.2)
 shape           1.4.5      2020-09-13 [1] CRAN (R 4.0.2)
 slider          0.1.5      2020-07-21 [1] CRAN (R 4.0.2)
 StanHeaders     2.21.0-7   2020-12-17 [1] CRAN (R 4.0.2)
 stringi         1.5.3      2020-09-09 [1] CRAN (R 4.0.2)
 stringr       * 1.4.0      2019-02-10 [1] CRAN (R 4.0.2)
 styler          1.3.2      2020-02-23 [1] CRAN (R 4.0.2)
 survival        3.2-9      2021-03-14 [1] CRAN (R 4.0.2)
 testthat        3.0.2      2021-02-14 [1] CRAN (R 4.0.2)
 tibble        * 3.1.0      2021-02-25 [1] CRAN (R 4.0.2)
 tidymodels    * 0.1.2      2020-11-22 [1] CRAN (R 4.0.2)
 tidyr         * 1.1.3      2021-03-03 [1] CRAN (R 4.0.2)
 tidyselect      1.1.0      2020-05-11 [1] CRAN (R 4.0.2)
 tidyverse     * 1.3.0      2019-11-21 [1] CRAN (R 4.0.2)
 timeDate        3043.102   2018-02-21 [1] CRAN (R 4.0.2)
 timetk        * 2.6.1      2021-02-18 [1] local
 tune          * 0.1.3      2021-02-28 [1] CRAN (R 4.0.2)
 usethis         2.0.1      2021-02-10 [1] CRAN (R 4.0.2)
 utf8            1.2.1      2021-03-12 [1] CRAN (R 4.0.2)
 vctrs         * 0.3.6.9000 2021-02-19 [1] Github (r-lib/vctrs@9af59e9)
 warp            0.2.0      2020-10-21 [1] CRAN (R 4.0.2)
 withr           2.4.1      2021-01-26 [1] CRAN (R 4.0.2)
 workflows     * 0.2.2      2021-03-10 [1] CRAN (R 4.0.2)
 workflowsets  * 0.0.1      2021-03-18 [1] CRAN (R 4.0.2)
 xfun            0.22       2021-03-11 [1] CRAN (R 4.0.2)
 xml2            1.3.2      2020-04-23 [1] CRAN (R 4.0.2)
 xts             0.12.1     2020-09-09 [1] CRAN (R 4.0.2)
 yaml            2.2.1      2020-02-01 [1] CRAN (R 4.0.2)
 yardstick     * 0.0.8      2021-03-28 [1] CRAN (R 4.0.2)
 zoo             1.8-9      2021-03-09 [1] CRAN (R 4.0.2)

[1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
```

ichsan2895 commented 3 years ago

Hello, I recently found where the problem is.

In this example, we must select just 3 columns:

data_tbl <- My_DF %>% select(id, Date, Value)

If we don't select these 3 columns (so that all columns in My_DF end up in data_tbl), the same error as before appears:

Error: Problem with `filter()` input `..1`.
x object '.key' not found
i Input `..1` is `.model_desc == "ACTUAL" | .key == "prediction"`.

To reproduce this with the tutorial data, note that

data_tbl <- walmart_sales_weekly %>% select(id, Date, Weekly_Sales)

fails with this error if we remove %>% select(id, Date, Weekly_Sales).

I don't know whether this is a bug or a feature.
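
One assumption worth checking (nothing in this thread confirms it) is whether the extra columns kept in the training data also exist in the future data passed as new_data. The sketch below uses a hypothetical base-R helper, missing_predictors(), which is not part of modeltime or timetk, together with made-up toy data frames. It lists the predictor columns that are present in the training data but absent from the new data.

```r
# Hypothetical helper (not part of modeltime or timetk):
# returns predictor columns found in `train` but missing from `new_data`
missing_predictors <- function(train, new_data, outcome) {
  predictors <- setdiff(names(train), outcome)
  setdiff(predictors, names(new_data))
}

# Toy stand-ins for the prepared training data and the future frame
train_toy <- data.frame(
  id           = c("1_1", "1_1"),
  Date         = as.Date(c("2021-01-01", "2021-01-08")),
  Weekly_Sales = c(100, 120),
  Temperature  = c(40.5, 38.2)  # extra column that select() would have dropped
)
future_toy <- data.frame(
  id   = "1_1",
  Date = as.Date("2021-01-15")
)

missing_predictors(train_toy, future_toy, outcome = "Weekly_Sales")
#> [1] "Temperature"
```

On the objects from the reprex above, the equivalent call would be missing_predictors(data_prepared_tbl, future_prepared_tbl, "Weekly_Sales"). A non-empty result would be a lead for further investigation, not a confirmed explanation of the '.key' error.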

mdancho84 commented 3 years ago

Will need to look into this.