Closed cimentadaj closed 4 years ago
It looks like you are using this in combination with workflows / parsnip in https://github.com/cimentadaj/tidyflow/issues/15
hardhat's `mold()` function just uses the basic `model.matrix()` infrastructure with a few tweaks. It won't handle "special" formula syntax like multilevel formulas because there are just too many types to support and it would rely on too many other packages.
The correct way to pass the formula through when you are using it with workflows is to pass it through `add_model(formula = )`. Then you use `add_formula()` or `add_recipe()` to only specify terms.
It is currently easiest to do this with `add_recipe()` because `add_formula()` will try to expand all factors to dummy variables for multilevel models unless you specify `indicators = "none"` in the blueprint. This will get more intuitive with `add_variables(y = Reaction, x = c(Days, Subject))` in https://github.com/tidymodels/workflows/issues/34
library(hardhat)
library(lme4)
library(multilevelmod)
library(workflows)
library(recipes)
data(sleepstudy, package = "lme4")
mixed_model_spec <- linear_reg() %>% set_engine("lmer")
# //////////////////////////////////////////////////////////////////////////////
# Tell hardhat not to expand factors into dummy variables
bp <- default_formula_blueprint(indicators = "none")
# We just use `add_formula()` to specify terms, then `add_model()` contains
# the real "model formula"
wf <- workflow() %>%
  add_formula(Reaction ~ Days + Subject, blueprint = bp) %>%
  add_model(mixed_model_spec, formula = Reaction ~ Days + (Days | Subject))
fit(wf, sleepstudy)
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Formula
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Reaction ~ Days + Subject
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: data
#> REML criterion at convergence: 1743.628
#> Random effects:
#>  Groups   Name        Std.Dev. Corr
#>  Subject  (Intercept) 24.741
#>           Days         5.922   0.07
#>  Residual             25.592
#> Number of obs: 180, groups:  Subject, 18
#> Fixed Effects:
#> (Intercept)         Days
#>      251.41        10.47
# //////////////////////////////////////////////////////////////////////////////
# 0-step recipe to just specify terms
rec <- recipe(Reaction ~ Days + Subject, sleepstudy)
wf2 <- workflow() %>%
  add_recipe(rec) %>%
  add_model(mixed_model_spec, formula = Reaction ~ Days + (Days | Subject))
fit(wf2, sleepstudy)
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> 0 Recipe Steps
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: data
#> REML criterion at convergence: 1743.628
#> Random effects:
#>  Groups   Name        Std.Dev. Corr
#>  Subject  (Intercept) 24.741
#>           Days         5.922   0.07
#>  Residual             25.592
#> Number of obs: 180, groups:  Subject, 18
#> Fixed Effects:
#> (Intercept)         Days
#>      251.41        10.47
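To see what `indicators = "none"` changes at the preprocessing level, the two blueprints can be compared directly with `mold()`. This is only a small sketch, assuming a hardhat version where `indicators` accepts the string `"none"` (as used in the example above); the object names `expanded` and `kept` are made up for illustration.

```r
library(hardhat)

data(sleepstudy, package = "lme4")

# Default formula blueprint: the factor `Subject` is expanded into dummy columns
expanded <- mold(Reaction ~ Days + Subject, sleepstudy)

# indicators = "none": `Subject` is passed through as a single factor column,
# which is what lmer needs for the grouping variable
kept <- mold(
  Reaction ~ Days + Subject,
  sleepstudy,
  blueprint = default_formula_blueprint(indicators = "none")
)

names(expanded$predictors)
names(kept$predictors)
```

Only the second set of predictors still contains `Subject` as a factor, so it is the one that can be handed to a formula such as `Reaction ~ Days + (Days | Subject)`.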
`add_variables()` is in the development version of workflows. Otherwise I don't think there is much else to do here.
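For reference, here is a sketch of the same workflow written with `add_variables()`. It assumes a workflows version in which the function is available with the argument names `outcomes` and `predictors`, which differ from the `y` / `x` spelling proposed above; the name `wf3` is only for illustration.

```r
library(parsnip)
library(multilevelmod)
library(workflows)

data(sleepstudy, package = "lme4")

mixed_model_spec <- linear_reg() %>% set_engine("lmer")

# `add_variables()` only selects columns (no dummy expansion), and
# `add_model()` still carries the real "model formula"
wf3 <- workflow() %>%
  add_variables(outcomes = Reaction, predictors = c(Days, Subject)) %>%
  add_model(mixed_model_spec, formula = Reaction ~ Days + (Days | Subject))

wf3_fit <- fit(wf3, sleepstudy)
wf3_fit
```

Because the variables are selected rather than run through a formula, no blueprint tweak is needed to keep `Subject` as a factor.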
Hi!

Thanks for `hardhat` and your valuable work for the community! I'm interested in integrating `hardhat` with multilevel models but I'm not sure the current behavior of `hardhat` is expected. I've read the vignette and I see that it supports formulas such as `cbind(y1, y2) ~ x1` appropriately but it doesn't know how to handle multilevel syntax. It creates new variables for `(Days | Subject)` when in fact they should be normal predictors. See reprex:

`lme4` has support for extracting multilevel notation, so perhaps it could be integrated via `lme4`. For example: `lme4::findbars(Reaction ~ Days + (Days | Subject))`.
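A minimal sketch of what that extraction could look like, using `lme4::findbars()` and `lme4::nobars()` plus base R. This is not something `hardhat` does; the names `f`, `bars`, `fixed`, and `terms_only` are hypothetical and only show how a terms-only formula might be derived from a multilevel one.

```r
library(lme4)

f <- Reaction ~ Days + (Days | Subject)

# Random-effect terms, returned as a list of unevaluated calls
bars <- findbars(f)

# The same formula with the random-effect terms removed
fixed <- nobars(f)

# Every variable mentioned anywhere in the formula, response first
vars <- all.vars(f)

# A plain formula that lists all variables as ordinary terms
terms_only <- reformulate(vars[-1], response = vars[1])

bars
fixed
terms_only
```

The `terms_only` formula is the kind of "terms only" specification that a preprocessor needs, while the original `f` is what would be passed to the model.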