Closed meenakshi-kushwaha closed 3 years ago
One thing to note is that the first argument needs to be either a model or a workflow, as the error message says. However, something is not working right even when that is done correctly:
library(tidymodels)
library(multilevelmod)
data(sleepstudy, package = "lme4")
set.seed(345)
sleep_folds <- vfold_cv(sleepstudy, group = Subject, v = 3)
sleep_folds
#> # 3-fold cross-validation
#> # A tibble: 3 x 2
#> splits id
#> <list> <chr>
#> 1 <split [120/60]> Fold1
#> 2 <split [120/60]> Fold2
#> 3 <split [120/60]> Fold3
mixed_model_spec <- linear_reg() %>% set_engine("lmer")
mixed_model_wf <- workflow() %>%
  add_model(mixed_model_spec, formula = Reaction ~ Days + (Days | Subject)) %>%
  add_variables(outcomes = Reaction, predictors = c(Days, Subject))
## workflow will fit one time just fine
fit(mixed_model_wf, sleepstudy)
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Variables
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Outcomes: Reaction
#> Predictors: c(Days, Subject)
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: data
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
## workflow will *not* fit to resamples
fit_resamples(mixed_model_wf, sleep_folds)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#> flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#> splice
#>
#> Attaching package: 'vctrs'
#> The following object is masked from 'package:tibble':
#>
#> data_frame
#> The following object is masked from 'package:dplyr':
#>
#> data_frame
#> Loading required package: Matrix
#>
#> Attaching package: 'Matrix'
#> The following objects are masked from 'package:tidyr':
#>
#> expand, pack, unpack
#> x Fold1: preprocessor 1/1, model 1/1 (predictions): Error in if (remove_intercept...
#> x Fold2: preprocessor 1/1, model 1/1 (predictions): Error in if (remove_intercept...
#> x Fold3: preprocessor 1/1, model 1/1 (predictions): Error in if (remove_intercept...
#> Warning: All models failed. See the `.notes` column.
#> Error in glue_data(.x = NULL, ..., .sep = .sep, .envir = .envir, .open = .open, : Expecting '}'
Created on 2020-12-04 by the reprex package (v0.3.0.9001)
I walked through this today, and the problem happens at prediction time within tune::fit_resamples() (try debugonce(predict_model)) at this point:
Browse[2]> predict(model, x_vals, type = type_iter)
Error in if (remove_intercept & any(grepl("Intercept", names(new_data)))) { :
argument is of length zero
Backtrace:
1. tune::fit_resamples(mixed_model_wf, sleep_folds)
21. tune:::safely_iterate(...) R/grid_code_paths.R:344:2
27. tune:::fn(...) R/grid_code_paths.R:414:4
36. tune::predict_model(split, workflow, iter_grid, metrics, iter_submodels) R/grid_code_paths.R:282:6
38. parsnip::predict.model_fit(model, x_vals, type = type_iter)
40. parsnip::predict_numeric.model_fit(...)
41. parsnip::prepare_data(object, new_data)
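The "argument is of length zero" message is plain base R behavior and can be reproduced without any modeling packages. The sketch below assumes that remove_intercept was NULL when parsnip::prepare_data() ran (the backtrace does not show its value, so this is a guess), and the isTRUE() guard at the end is only a hypothetical defensive pattern, not the fix that was applied in the package.

```r
# Assumption: the flag was never set for this engine, so it is NULL.
remove_intercept <- NULL
new_data <- data.frame(Days = 1, Subject = "308")

# NULL combined with a logical via `&` yields a zero-length logical vector.
cond <- remove_intercept & any(grepl("Intercept", names(new_data)))
length(cond)

# `if` needs a length-one condition, so a zero-length one raises the error.
res <- tryCatch(
  if (cond) "drop intercept" else "keep intercept",
  error = function(e) conditionMessage(e)
)
res

# Hypothetical guard: isTRUE() maps NULL to FALSE, so the branch is skipped.
guarded <- isTRUE(remove_intercept) &&
  any(grepl("Intercept", names(new_data)))
guarded
```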
And actually, this model seems to have a hard time predicting at all:
library(tidymodels)
library(multilevelmod)
data(sleepstudy, package = "lme4")
set.seed(345)
sleep_folds <- vfold_cv(sleepstudy, group = Subject, v = 3)
sleep_folds
#> # 3-fold cross-validation
#> # A tibble: 3 x 2
#> splits id
#> <list> <chr>
#> 1 <split [120/60]> Fold1
#> 2 <split [120/60]> Fold2
#> 3 <split [120/60]> Fold3
mixed_model_spec <- linear_reg() %>% set_engine("lmer")
mixed_model_wf <- workflow() %>%
  add_model(mixed_model_spec, formula = Reaction ~ Days + (Days | Subject)) %>%
  add_variables(outcomes = Reaction, predictors = c(Days, Subject))
## workflow will fit one time just fine
mixed_fit <- fit(mixed_model_wf, sleepstudy)
mixed_fit
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Variables
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Outcomes: Reaction
#> Predictors: c(Days, Subject)
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: data
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
## will *not* predict
predict(mixed_fit, sleepstudy[2,])
#> Error in if (remove_intercept & any(grepl("Intercept", names(new_data)))) {: argument is of length zero
Created on 2020-12-08 by the reprex package (v0.3.0.9001)
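Since the backtrace above stops in parsnip::prepare_data(), one way to check that the failure sits in the wrapper rather than in the model is to fit the same formula with lme4 directly and predict from that. This is only a diagnostic sketch: the subject label "999" is made up for illustration, and allow.new.levels = TRUE is the lme4 option for predicting on grouping levels that were not in the training data.

```r
library(lme4)

data(sleepstudy, package = "lme4")

# Same formula as the workflow, fit without parsnip or workflows.
direct_fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Prediction for a subject that was part of the fit.
p_seen <- predict(direct_fit, newdata = sleepstudy[2, ])
p_seen

# Prediction for a made-up, unseen subject: only the fixed effects are used.
new_subject <- data.frame(Days = 1, Subject = factor("999"))
p_new <- predict(direct_fit, newdata = new_subject, allow.new.levels = TRUE)
p_new
```

If both calls return a number, the lmer fit itself can predict, which points back at the data preparation step in the wrapper.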
With the current development version of multilevelmod, this does work. 🎉
library(tidymodels)
library(multilevelmod)
data(sleepstudy, package = "lme4")
set.seed(345)
sleep_folds <- group_vfold_cv(sleepstudy, group = Subject, v = 3)
sleep_folds
#> # Group 3-fold cross-validation
#> # A tibble: 3 x 2
#> splits id
#> <list> <chr>
#> 1 <split [120/60]> Resample1
#> 2 <split [120/60]> Resample2
#> 3 <split [120/60]> Resample3
mixed_model_spec <- linear_reg() %>% set_engine("lmer")
mixed_model_wf <- workflow() %>%
  add_model(mixed_model_spec, formula = Reaction ~ Days + (Days | Subject)) %>%
  add_variables(outcomes = Reaction, predictors = c(Days, Subject))
fit(mixed_model_wf, sleepstudy)
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Variables
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Outcomes: Reaction
#> Predictors: c(Days, Subject)
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: Reaction ~ Days + (Days | Subject)
#> Data: data
#> REML criterion at convergence: 1743.628
#> Random effects:
#> Groups Name Std.Dev. Corr
#> Subject (Intercept) 24.741
#> Days 5.922 0.07
#> Residual 25.592
#> Number of obs: 180, groups: Subject, 18
#> Fixed Effects:
#> (Intercept) Days
#> 251.41 10.47
fit_resamples(mixed_model_wf, sleep_folds)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#> flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#> splice
#>
#> Attaching package: 'vctrs'
#> The following object is masked from 'package:tibble':
#>
#> data_frame
#> The following object is masked from 'package:dplyr':
#>
#> data_frame
#> Loading required package: Matrix
#>
#> Attaching package: 'Matrix'
#> The following objects are masked from 'package:tidyr':
#>
#> expand, pack, unpack
#> # Resampling results
#> # Group 3-fold cross-validation
#> # A tibble: 3 x 4
#> splits id .metrics .notes
#> <list> <chr> <list> <list>
#> 1 <split [120/60]> Resample1 <tibble [2 × 4]> <tibble [0 × 1]>
#> 2 <split [120/60]> Resample2 <tibble [2 × 4]> <tibble [0 × 1]>
#> 3 <split [120/60]> Resample3 <tibble [2 × 4]> <tibble [0 × 1]>
Created on 2020-12-10 by the reprex package (v0.3.0.9001)
You can install this via:
devtools::install_github("tidymodels/multilevelmod")
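Note that the working example builds its folds with group_vfold_cv() rather than vfold_cv(), and the printed header changes from "3-fold cross-validation" to "Group 3-fold cross-validation". A rough way to confirm that grouped folds keep every Subject entirely on one side of each split is sketched below; the helper variable overlap is just for illustration.

```r
library(rsample)

data(sleepstudy, package = "lme4")
set.seed(345)
sleep_folds <- group_vfold_cv(sleepstudy, group = Subject, v = 3)

# For each split, count subjects that appear in both the analysis set
# and the assessment set. Grouped folds should give zero everywhere.
overlap <- vapply(
  sleep_folds$splits,
  function(s) {
    in_analysis <- as.character(analysis(s)$Subject)
    in_assessment <- as.character(assessment(s)$Subject)
    length(intersect(in_analysis, in_assessment))
  },
  integer(1)
)
overlap
```

With no shared subjects, each assessment set contains only subjects the model has not seen, so the resampled metrics describe performance on new subjects.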
Hello, I am trying to perform k-fold cross-validation on an lmer (linear mixed effects) model using this package and the tidymodels approach, but I keep getting the following error from the fit_resamples() function.
"Error: The first argument to [fit_resamples()] should be either a model or workflow."
Following is a reprex