tidyverts / fable

Tidy time series forecasting
https://fable.tidyverts.org
GNU General Public License v3.0

Support for formula dot (.) syntax #195

Closed: wkdavis closed this issue 4 years ago

wkdavis commented 5 years ago

Does fable support the use of the dot (.) akin to its use in formula() to include all other variables? Here is an example that returns an error.

library(dplyr)
library(tsibble)
library(fabletools)
library(fable)
library(lubridate)
library(fpp2)

insurance %>% 
  mutate(Exreg1 = rnorm(n = nrow(.))) %>% 
  model(
    arima = ARIMA(Quotes ~ pdq(d = 0) + .)
  )

# A mable: 1 x 1
#  arima       
#  <model>     
# 1 <NULL model>
# Warning message:
# 1 error encountered for arima
# [1] '.' in formula and no 'data' argument

Based on the error message it looks like the function has some idea of what I am trying to do, in that it recognizes the . and is looking for data to which the . can be applied. However, the models in fable don't have a data argument. I looked at the source code for fabletools::model() and fabletools::estimate() to see if/where the formula and data are brought together, but I couldn't find a solution.

mitchelloharawild commented 5 years ago

fable does not currently support the . interface for model building. At this point I'm not planning on adding support for ., and am instead opting for more of a recipes-style approach to changing the model specification.

This is partly because . is also used as the placeholder in %>%, and the difference between . in the formula and . in the model specification could be confusing. Further, . has some surprising behaviour with the na.action used when variables are removed with -.

wkdavis commented 5 years ago

Thank you, that makes sense. Is the recipes interface supported at this time? Essentially, I am looking to include all variables in a dataframe in the model without typing each one explicitly.

mitchelloharawild commented 5 years ago

Not supported yet, and I'm not sure when it will be added. For now you can definitely add all variables from a data frame by constructing the formula programmatically.
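As a minimal sketch of constructing the formula programmatically, base R's reformulate() can assemble the right-hand side from column names. The data frame `dat`, its values, and the index column name `Month` are hypothetical stand-ins for the insurance data with the extra regressor. Passing the resulting formula object to ARIMA() is an assumption that is not confirmed in this thread, so that step is left commented out.

```r
# Hypothetical stand-in for the insurance data plus one extra regressor.
dat <- data.frame(
  Month     = 1:4,
  Quotes    = c(12.9, 15.1, 14.2, 13.6),
  TV.advert = c(7.2, 9.4, 8.7, 8.0),
  Exreg1    = c(0.3, -1.1, 0.5, 0.9)
)

# Every column except the response and the time index becomes a regressor.
regressors <- setdiff(names(dat), c("Quotes", "Month"))

# Combine the pdq() special with the regressors into a single formula.
fml <- reformulate(c("pdq(d = 0)", regressors), response = "Quotes")
print(fml)

# Assumption (may depend on the fable/fabletools version): the formula object
# can then be supplied in place of a literal formula, for example
# dat_tsibble %>% model(arima = ARIMA(fml))
# where dat_tsibble is a hypothetical tsibble version of `dat`.
```

Building the term list with setdiff() keeps the response and index out of the right-hand side, which is the part that . would otherwise have handled.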