mdancho84 opened 3 years ago
The tidymodels team is working on a new R package, multilevelmod, which covers linear mixed-effects models from lmer and other associated R packages. The formula structure is similar to mgcv::gam in that there are modifiers that get passed through a formula interface. This is a good one to check out.
linear_reg() engines: https://github.com/tidymodels/multilevelmod/blob/master/R/linear_reg_data.R

Hi @mdancho84,
I am not sure we need this, because we just use common formulas plus functions inside the formula, and our users can call those functions exactly as they would in mgcv::gam(). In principle I do not see it as necessary, unless we are going to add an engine that requires the style of formulation these packages use.
Hi @BenWynne-Morris,
I need you to run some checks on the predict() function for GAMs, comparing it with what a linear regression gives, to see whether GAMs can somehow produce confidence intervals. As the code below shows, linear regression produces confidence intervals while the GAM does not. We need to know if there is any way to get them.
predict(model_fit_gam, newdata = training(splits), type = 'response', interval = "confidence")
predict(model_fit_lm, newdata = training(splits), type = 'response', interval = "confidence")
We also need to compare the following calls and see whether there is something similar for GAMs (or an explanation of why the results differ so much between gam and lm):
predict(model_fit_gam, newdata = training(splits), type = 'response', interval = "prediction")
predict(model_fit_lm, newdata = training(splits), type = 'response', interval = "prediction")
Regards,
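For reference: predict.gam() from mgcv does not accept an interval argument, but it does return standard errors when se.fit = TRUE, and those can be turned into approximate confidence intervals on the link scale. A minimal sketch, assuming the mgcv fit from above and a simple +/- 2 * SE normal approximation:

# predict.gam() has no `interval` argument; request standard errors instead.
pred_gam <- predict(model_fit_gam, newdata = training(splits), type = "link", se.fit = TRUE)

# Build approximate bounds on the link scale, then back-transform with the
# inverse link stored in the fitted family (log link here, so exp()).
gam_conf <- tibble::tibble(
    .pred  = model_fit_gam$family$linkinv(pred_gam$fit),
    .lower = model_fit_gam$family$linkinv(pred_gam$fit - 2 * pred_gam$se.fit),
    .upper = model_fit_gam$family$linkinv(pred_gam$fit + 2 * pred_gam$se.fit)
)

True prediction intervals (accounting for observation noise, like lm's interval = "prediction") are not available this way and would likely need simulation from the model's coefficient posterior.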
Thinking through this:
A gam model for regression looks like this:
library(mgcv)

model_fit_gam <- gam(
    formula = value ~ s(date_month, k = 12) + s(date_num) + s(lag_24) + s(date_num, date_month),
    family  = Gamma(link = "log"),
    method  = "REML",
    data    = training(splits)
)
Phase 1 - Get this working.
model_fit_gam <- gam_mod(mode = "regression") %>%
    set_engine("gam", family = Gamma(link = "log"), method = "REML") %>%
    fit(value ~ s(date_month, k = 12) + s(date_num) + s(lag_24) + s(date_num, date_month), data = training(splits))
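Getting the Phase 1 spec to work means registering the model with parsnip first. Below is a minimal sketch of that registration, modeled on the multilevelmod linear_reg() engine files linked above. The gam_mod name, the constructor, and the defaults here are assumptions, not an existing API:

library(parsnip)

# Hypothetical user-facing constructor for the model specification.
gam_mod <- function(mode = "regression") {
    new_model_spec(
        "gam_mod",
        args     = list(),
        eng_args = NULL,
        mode     = mode,
        method   = NULL,
        engine   = NULL
    )
}

# Register the model, its mode, and an mgcv-backed "gam" engine.
set_new_model("gam_mod")
set_model_mode("gam_mod", mode = "regression")
set_model_engine("gam_mod", mode = "regression", eng = "gam")
set_dependency("gam_mod", eng = "gam", pkg = "mgcv")

# Fit module: pass the user's formula and data straight through to mgcv::gam().
set_fit(
    model = "gam_mod",
    eng   = "gam",
    mode  = "regression",
    value = list(
        interface = "formula",
        protect   = c("formula", "data"),
        func      = c(pkg = "mgcv", fun = "gam"),
        defaults  = list()
    )
)

# Prediction module: numeric predictions via predict(..., type = "response").
set_pred(
    model = "gam_mod",
    eng   = "gam",
    mode  = "regression",
    type  = "numeric",
    value = list(
        pre  = NULL,
        post = NULL,
        func = c(fun = "predict"),
        args = list(object = quote(object$fit), newdata = quote(new_data), type = "response")
    )
)

With that in place, engine-specific arguments such as family and method are passed through set_engine() to mgcv::gam(), as in the Phase 1 code above.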
There are 2 interfaces: formula and recipes.
Definitely a Phase 2 item, but this might be useful. It would take some serious thinking about how we'd want to implement this for GAMs.
library(tidymodels)
library(gamsnip)
library(modeltime)
library(tidyverse)
library(timetk)
m750_extended <- m750 %>%
    group_by(id) %>%
    future_frame(.length_out = 24, .bind_data = TRUE) %>%
    mutate(lag_24 = lag(value, 24)) %>%
    ungroup()
m750_train <- m750_extended %>% drop_na()
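The splits object used by training(splits) above is not created in this snippet; a minimal sketch of how it could be set up, assuming a 24-observation holdout to mirror the forecast horizon:

# Hypothetical train/test split; the 24-month assessment window is an assumption.
splits <- timetk::time_series_split(
    m750_train,
    date_var   = date,
    assess     = 24,
    cumulative = TRUE
)

training(splits)
testing(splits)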
recipe_spec <- recipe(value ~ date + lag_24, m750_train) %>%
    step_mutate(date_num = as.numeric(date)) %>%
    step_mutate(date_mon = lubridate::month(date)) %>%
    step_rm(date) %>%
    step_interact(terms = ~ date_num * date_mon)
recipe_spec %>% prep() %>% juice()
#> # A tibble: 282 x 5
#> lag_24 value date_num date_mon date_num_x_date_mon
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 6370 7030 8035 1 8035
#> 2 6430 7170 8066 2 16132
#> 3 6520 7150 8095 3 24285
#> 4 6580 7180 8126 4 32504
#> 5 6620 7140 8156 5 40780
#> 6 6690 7100 8187 6 49122
#> 7 6000 6490 8217 7 57519
#> 8 5450 6060 8248 8 65984
#> 9 6480 6870 8279 9 74511
#> 10 6820 6880 8309 10 83090
#> # … with 272 more rows
# Possible new step_gam_* functions
recipe_spec_gam <- recipe_spec %>%
    step_gam_smooth(date_mon, k = 12) %>%
    step_gam_smooth(lag_24, date_num, date_num_x_date_mon, method = "REML")
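Until something like step_gam_smooth() exists, one workaround is to let the recipe build the columns and pass the smooth terms through the model formula override in workflows. A minimal sketch, assuming the hypothetical gam_mod() spec and "gam" engine sketched above are registered:

library(workflows)

# The recipe supplies date_num, date_mon, lag_24, and the interaction column;
# the smooths are declared via the formula override in add_model().
wflw_gam <- workflow() %>%
    add_recipe(recipe_spec) %>%
    add_model(
        gam_mod(mode = "regression") %>%
            set_engine("gam", family = Gamma(link = "log"), method = "REML"),
        formula = value ~ s(date_mon, k = 12) + s(date_num) + s(lag_24) + s(date_num_x_date_mon)
    )

wflw_gam_fit <- fit(wflw_gam, data = training(splits))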
Develop gen_additive_mod() algorithm with modes "regression" and "classification":
- parsnip::linear_reg() as an example.
- multilevelmod::linear_reg() as an example: linear_reg() engines: https://github.com/tidymodels/multilevelmod/blob/master/R/linear_reg_data.R

parsnip and workflows interfaces.

See business-science/modeltime#71 for Discussion and Basic Example.