business-science / modeltime

Modeltime unlocks time series forecast models and machine learning in one framework
https://business-science.github.io/modeltime/

Error in `add_model()`: ! `spec` must have a known mode. #205

Closed forecastingEDs closed 2 years ago

forecastingEDs commented 2 years ago

Hello @mdancho84

The modeltime package is showing a new error. The script below follows the [Walmart sales data example](https://business-science.github.io/modeltime.resample/articles/panel-data.html), but uses my own data. Can you please help?

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
remove.packages("modeltime", lib="~/R/win-library/4.1") 
install.packages("modeltime", dependencies = TRUE)
remotes::install_github("business-science/modeltime", dependencies = TRUE)

Enter one or more numbers, or an empty line to skip updates: 2: CRAN packages only

Load the following R packages

library(keras) 
library(ForecastTB) 
library(ggplot2)
library(zoo)
library(forecast)
library(lmtest)
library(urca)
library(stats)
library(nnfor)
library(forecastHybrid)
library(pastecs)
library(forecastML)
library(Rcpp)
library(modeltime.ensemble)
library(tidymodels)
library(modeltime)
library(lubridate)
library(tidyverse)
library(tidymodels)
library(modeltime.resample)
library(timetk)
library(tidyquant)
library(modeltime.h2o)
library(yardstick)
library(reshape)
library(plotly)
library(xgboost)
library(rsample)
library(targets)
library(modeltime.gluonts)
library(tidymodels)
library(modeltime)
library(modeltime.resample)
library(timetk)
library(tidyverse)
library(tidyquant)
library(LiblineaR)
library(parsnip)
library(ranger)
library(kknn)
library(readxl)
library(lifecycle)
library(skimr) 

data_tbl <- atends_temperature_calendar %>%
  select(id, Date, attendences, average_temperature, min, max) %>%
  set_names(c("id", "date", "value","temperature", "tempemin", "tempemax"))
data_tbl

Full = Training + Forecast Datasets

full_data_tbl <- atends_temperature_calendar %>%
  select(id, Date, attendences, average_temperature, min, max) %>%
  set_names(c("id", "date", "value","temperature", "tempemin", "tempemax")) %>%

  # Apply Group-wise Time Series Manipulations
  group_by(id) %>%
  future_frame(
    .date_var   = date,
    .length_out = "7 days",
    .bind_data  = TRUE
  ) %>%
  ungroup() %>%

  # Consolidate IDs
  mutate(id = fct_drop(id))

Training Data

data_prepared_tbl <- full_data_tbl %>%
  filter(!is.na(value))

Forecast Data

future_tbl <- full_data_tbl %>%
  filter(is.na(value))

Data Splitting ----

Now we set aside the future data (we would only need that later when we make forecast)

And focus on training data

* 4.1 Panel Data Splitting ----

Split the dataset into analysis/assessment sets

emergency_tscv <- data_prepared_tbl %>%
  time_series_cv(
    date_var    = date, 
    assess      = "7 days",
    skip        = "60 days",
    cumulative  = TRUE,
    slice_limit = 5
  )

emergency_tscv

recipe_spec <- recipe(value ~ ., 
                      data = training(emergency_tscv$splits[[1]])) %>%
  step_timeseries_signature(date) %>%
  step_rm(matches("(.iso$)|(.xts$)|(day)|(hour)|(minute)|(second)|(am.pm)")) %>%
  step_mutate(data = factor(value, ordered = TRUE)) %>%
  step_dummy(all_nominal(), one_hot = TRUE)

Up to this pre-processing step the script runs normally, but when I train the machine learning models it reports the following error:

Radial Basis Function Support Vector Machine Model 1: SVM_rbf ----

wflw_fit_svm_rbf <- workflow() %>%
  add_model(
    svm_rbf() %>% set_engine("kernlab") 
  ) %>%
  add_recipe(recipe_spec %>% step_rm(date)) %>%
  fit(training(emergency_tscv$splits[[1]]))

Error in `add_model()`:
! `spec` must have a known mode.
ℹ Set the mode of `spec` by using `parsnip::set_mode()` or by setting the mode directly in the parsnip specification function.
Run `rlang::last_error()` to see where the error occurred.

The error does not occur for the other models; it only occurs for the ML models: Random Forest, XGBoost, SVM_linear, and SVM_rbf.
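One way to check which specifications are affected is to inspect the `mode` field of each parsnip specification before adding it to a workflow. The sketch below is illustrative only and assumes a recent parsnip version; the `specs` and `modes` names are hypothetical. The four ML specifications should report an unknown mode until one is set, whereas modeltime specifications such as `exp_smoothing()` and `naive_reg()` appear to default to regression, which would explain why they still run.

```r
library(parsnip)

# Specifications that support both regression and classification
specs <- list(
  svm_rbf     = svm_rbf(),
  svm_linear  = svm_linear(),
  boost_tree  = boost_tree(),
  rand_forest = rand_forest()
)

# Extract the mode stored in each specification
modes <- vapply(specs, function(spec) spec$mode, character(1))
modes
```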

Until a few days ago, the ML algorithms were working normally with this code:

Model 2: Xgboost ----

wflw_fit_xgboost <- workflow() %>%
  add_model(
    boost_tree() %>% set_engine("xgboost") 
  ) %>%
  add_recipe(recipe_spec %>% step_rm(date)) %>%
  fit(training(emergency_tscv$splits[[1]]))

Statistical models run normally:

Model 10: ets ----

wflw_fit_ets <- workflow() %>% 
  add_model(
    exp_smoothing() %>%
      set_engine(engine = "ets")) %>%
  add_recipe(recipe_spec) %>%
  fit(training(emergency_tscv$splits[[1]]))

---- NAIVE ----

Model 11: Naive ----

wflw_fit_naive <- workflow() %>% 
  add_model(
    naive_reg() %>%
      set_engine(engine = "naive")) %>%
  add_recipe(recipe_spec) %>%
  fit(training(emergency_tscv$splits[[1]]))

Model 12: sNaive ----

wflw_fit_SNAIVE <- workflow() %>% 
  add_model(
    naive_reg() %>%
      set_engine(engine = "snaive")) %>%
  add_recipe(recipe_spec) %>%
  fit(training(emergency_tscv$splits[[1]]))
mdancho84 commented 2 years ago

This is due to the update to workflows, which now requires you to specify the mode, e.g. mode = "regression".

Models like:

  • svm_rbf()
  • boost_tree()

allow both regression and classification.

The solution is to set their mode = "regression".
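
As a minimal sketch of that fix (not necessarily the exact code the maintainer had in mind), the mode can be set either directly in the specification function or afterwards with `parsnip::set_mode()`, the two options the error message names. The `svm_spec` and `xgb_spec` names are hypothetical; the resulting specifications would then be passed to `add_model()` as in the original script.

```r
library(parsnip)

# Option 1: set the mode directly in the specification function
svm_spec <- svm_rbf(mode = "regression") %>%
  set_engine("kernlab")

# Option 2: set the mode afterwards with set_mode()
xgb_spec <- boost_tree() %>%
  set_mode("regression") %>%
  set_engine("xgboost")

# Both specifications now carry a known mode
svm_spec$mode
xgb_spec$mode
```

The same change should apply to the other affected specifications mentioned in the thread, such as the Random Forest and linear SVM models.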

forecastingEDs commented 2 years ago

This is due to the update to workflows, which now requires you to specify the mode = "regression"

Models like:

  • svm_rbf()
  • boost_tree()

Allow both regression & classification.

Solution is to change their mode = "regression"

Dear @mdancho84

You solved my problem. Thank you very much for the information!