business-science / modeltime

Modeltime unlocks time series forecast models and machine learning in one framework
https://business-science.github.io/modeltime/

Auto.arima fit function #11

Closed MislavSag closed 4 years ago

MislavSag commented 4 years ago

I was inspecting your package today. In the first example, you use the auto.arima model:

# Model 1: auto_arima ----
model_fit_arima_no_boost <- arima_reg() %>%
    set_engine(engine = "auto_arima") %>%
    fit(value ~ date, data = training(splits))

I don't understand why the fit function takes a formula when the auto.arima function from the forecast package has no formula input. It only takes a univariate series (y) argument. It's confusing to me how the formula is converted into the arguments of the underlying function.

mdancho84 commented 4 years ago

Hi @MislavSag ,

The date column is a requirement of modeltime functions to standardize the input & output for time-based models. The formula in fit(value ~ date) has less to do with auto.arima and more to do with the tidymodels interface, which is formula-based. It also helps internally to standardize the output of modeltime models, which includes the training data along with the residuals and the time-based dependency column (date in this example). Structurally, I found this important, so I made the design decision to require it. It also keeps things consistent with models like prophet_reg(), which require a date column.

So in summary, you are correct in that Arima models are sequential and have no relationship to a time column. However, from a design perspective, it's important to have a structure that is consistent across all modeltime objects, and the date column is needed for that.
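To make the conversion concrete, here is a minimal base-R sketch of how a formula such as value ~ date can be split into a univariate series plus a date index. It is only an illustration and not modeltime's actual internal code; the `toy` data frame and the variable names `y` and `idx` are made up for the example.

```r
# Toy monthly series (made-up data, for illustration only)
toy <- data.frame(
  date  = seq(as.Date("2020-01-01"), by = "month", length.out = 6),
  value = c(10, 12, 11, 13, 15, 14)
)

# Resolve the formula against the data, as formula interfaces typically do
mf <- model.frame(value ~ date, data = toy)

# Left-hand side: the univariate series a function like
# forecast::auto.arima() expects as its y argument
y <- model.response(mf)

# Right-hand side: the date column, kept so that fitted values and
# residuals can be reported against time
idx <- mf[["date"]]
```

In this sketch only `y` would be handed to the ARIMA routine, while `idx` travels alongside it so the output can be reported against dates.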

Hope this helps.

-Matt

MislavSag commented 4 years ago

To me, the formula y ~ date makes the date column look like a dummy variable. So for every model I have to include at least y ~ date, but for some models the formula can also include dates as factors and other covariates?
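As far as I can tell from the arima_reg() documentation, extra terms such as y ~ date + month.lbl are passed on as external regressors, while date and date-time columns are not used as regressors. A rough base-R sketch of that split follows. It is an assumption about the general idea, not the package's real implementation, and the column names `promo` and `month_lbl` are hypothetical.

```r
# Toy data with two made-up covariates
toy <- data.frame(
  date  = seq(as.Date("2020-01-01"), by = "month", length.out = 6),
  value = c(10, 12, 11, 13, 15, 14),
  promo = c(0, 1, 0, 0, 1, 0)
)
toy$month_lbl <- factor(format(toy$date, "%m"))  # date expressed as a factor

mf <- model.frame(value ~ date + month_lbl + promo, data = toy)
predictors <- mf[, -1, drop = FALSE]  # drop the outcome column

# Separate the time index from the remaining predictors
is_date <- vapply(predictors, inherits, logical(1),
                  what = c("Date", "POSIXct"))
index <- predictors[[which(is_date)]]         # the date column
xreg  <- predictors[, !is_date, drop = FALSE] # candidate external regressors
names(xreg)
```

Read this way, date acts as a required index rather than a predictor, and only the non-date terms behave like covariates.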

mdancho84 commented 4 years ago

Closing this - I cannot change the formula convention without breaking the package.