tidymodels / multilevelmod

Parsnip wrappers for mixed-level and hierarchical models
https://multilevelmod.tidymodels.org/

Bug when there is a mismatch between the number of labels in the training and testing data sets. #38

Closed EmilHvitfeldt closed 2 years ago

EmilHvitfeldt commented 2 years ago

I found this problem in https://stackoverflow.com/questions/72349462/r-using-a-lmer-model-in-fit-resamples-fails-with-error-assigned-data-facto and condensed it further to the following:

@hfrick:

That seems weird. multilevelmod::reformat_lme_pred_data() breaks because there are no factor levels. I quickly checked whether it makes a difference if you turn manufacturer and model into factors before you do anything else, but that’s not it.

library(tidyverse)
library(tidymodels)
library(multilevelmod)

data(mpg, package = "ggplot2")

lmm_model <- linear_reg() %>%
  set_engine("lmer")

lmm_workflow <- workflow() %>%
  add_variables(outcomes = cty,
                predictors = c(year, manufacturer, model)) %>%
  add_model(lmm_model, formula = cty ~ year + (1|manufacturer/model))

# A simple fit works
lmm_fit <- fit(lmm_workflow, mpg)

predict(lmm_fit, new_data = mpg)
#> Error:
#> ! Assigned data `factor(lvl[1], levels = lvl)` must be compatible with existing data.
#> ✖ Existing data has 234 rows.
#> ✖ Assigned data has 0 rows.
#> ℹ Only vectors of size 1 are recycled.
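
The expression in the error message, factor(lvl[1], levels = lvl), evaluates to a zero-length factor whenever lvl is NULL. The sketch below reproduces only that mechanism in base R. The data frame `dat` and the way `lvl` becomes NULL here (calling levels() on a character vector) are assumptions for illustration, not a description of the multilevelmod internals, and base R words the failure differently from the tibble message above.

```r
# Assumption: lvl is NULL, e.g. because levels() was called on something
# that carries no factor levels (here, a plain character vector).
lvl <- levels(c("audi", "ford"))
is.null(lvl)

# Indexing NULL gives NULL, and factor(NULL, levels = NULL) has length zero.
empty_factor <- factor(lvl[1], levels = lvl)
length(empty_factor)

# Hypothetical 234-row data frame standing in for the prediction data.
dat <- data.frame(cty = seq_len(234))

# Assigning the zero-length factor as a column fails, because only
# length-1 or full-length vectors can fill a 234-row column.
res <- try(dat$manufacturer <- empty_factor, silent = TRUE)
inherits(res, "try-error")
```

If this is indeed the failure path, it would be consistent with the observation that converting manufacturer and model to factors beforehand does not help, since the empty levels would come from a different lookup.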

a-difabio commented 2 years ago

It looks to me like the only way to make an lmer model predict on new factor levels is to specify allow.new.levels = TRUE in the predict() call. However, this is not possible when calling predict() on a workflow object.

# Predicting with the underlying lme4 fit works; see ?lme4::predict.merMod
predict(lmm_fit %>% extract_fit_engine(), newdata = mpg, allow.new.levels = TRUE)

# Passing the argument through the workflow's predict() does not
predict(lmm_fit, new_data = mpg, allow.new.levels = TRUE)
#> Error: The ellipses are not used to pass args to the model function's predict function. These arguments cannot be used: `allow.new.levels`
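
For reference, here is a minimal sketch of what allow.new.levels does in lme4 itself, independent of parsnip and workflows. It uses the sleepstudy data shipped with lme4; the level name "new_subject" and the object names are made up for illustration.

```r
library(lme4)

# Random-intercept model on the sleepstudy data that ships with lme4.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Hypothetical new data with a Subject level not seen during fitting.
new_subject <- data.frame(Days = 0:2, Subject = factor("new_subject"))

# With the default allow.new.levels = FALSE, predict() raises an error.
res_default <- try(predict(fit, newdata = new_subject), silent = TRUE)
inherits(res_default, "try-error")

# With allow.new.levels = TRUE, unseen levels get a random effect of zero,
# so the result matches the population-level prediction (re.form = NA).
pred_new <- predict(fit, newdata = new_subject, allow.new.levels = TRUE)
pred_pop <- predict(fit, newdata = new_subject, re.form = NA)
all.equal(unname(pred_new), unname(pred_pop))
```

Note that the reprex in this issue predicts on the training data, so no unseen levels are involved there.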

a-difabio commented 2 years ago

I think my last message was wrong: I just noticed that allow.new.levels = TRUE is already set by default in the definition of the make_lme4_linear_reg() function, so it is probably not the source of the error.

hfrick commented 2 years ago

Closed in #41.

github-actions[bot] commented 2 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.