rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/
340 stars 30 forks source link

Passing a model to emmeans() made from passing a formula to that model generates an error, whereas a model with the formula hand-coded does not. #495

Closed ProfessorPeregrine closed 2 weeks ago

ProfessorPeregrine commented 2 weeks ago

When I create a new model (in this case a gls) I'd like to pass in a formula from a previous model (in this case lm). It seems like I could use formula(old_model) and if I do, the gls function works perfectly. But if I pass in that new gls model to emmeans() it fails with this error: Error in call$model[[2]] : object of type 'symbol' is not subsettable. On the other hand, if I create that new gls model by hand-typing the formula, that model works fine with emmeans().

Here is code to replicate the issue:

require(emmeans)
require(nlme)

data<-structure(list(percent = c(2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 
                                 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 
                                 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 
                                 2, 2, 1, 1, 1, 1, 1, 1, 1, 1),
                     cycle = c(4, 4, 4, 4, 4, 4, 4, 
                               4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
                               3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 
                               1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
                     density = c(21, 
                                 20, 21, 21, 22, 20, 20, 19, 18, 19, 19, 19, 19, 18, 18, 19, 20, 
                                 22, 22, 21, 22, 22, 20, 21, 20, 20, 19, 19, 19, 19, 19, 19, 23, 
                                 25, 24, 24, 26, 23, 24, 24, 23, 22, 22, 23, 21, 24, 22, 22, 16, 
                                 17, 16, 15, 17, 16, 16, 16, 26, 27, 26, 25, 25, 26, 26, 26), 
                     run = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 4L, 4L, 4L, 4L, 4L, 
                             4L, 4L, 4L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 3L, 3L, 3L, 3L, 
                             3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 2L, 2L, 2L, 
                             2L, 2L, 2L, 2L, 2L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 
                             1L, 1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
                                                                                           -64L))

formula<-"density~percent*cycle" #selected by user
effect<-"percent*cycle" #selected by user#
data$cycle<-factor(data$cycle)
data$percent<-factor(data$percent)

aov_mod <-lm(formula = formula(formula),data = data)

#later in the app
data$group<-interaction(data[c(1,2)])
gls_formula<-formula(aov_mod)
model<-gls(gls_formula, data = data, weights = varIdent(form = ~1 | group)) #using extracted formula#
model2<-gls(density~percent*cycle, data = data, weights = varIdent(form = ~1 | group)) #using hard coded formula#
emmeans(model,
        formula(paste("pairwise ~",gsub(pattern = ":",replacement = "*",x = effect))),
        adjust = "tukey",
        type = "response",
        data=data)
emmeans(model2,
        formula(paste("pairwise ~",gsub(pattern = ":",replacement = "*",x = effect))),
        adjust = "tukey",
        type = "response",
        data=data)

Expected behavior

emmeans() should take a valid model regardless of whether that model was created with a hand-typed or extracted formula.

Additional context

I posted this on StackOverflow:https://stackoverflow.com/questions/78664364/why-do-i-need-to-hardcode-the-formula-in-emmeans-instead-of-dynamically-extrac

The responder mentioned this code chunk for a clue on what might be happening: https://github.com/rvlenth/emmeans/blob/48ab8009c1363c3fd22293b49f1841e61bb789ed/R/helpers.R#L343-L348

Thank you very much for a great package! I have the workaround, but I hope this helps make the package better!

rvlenth commented 2 weeks ago

The problem is not there; for one thing, data is not NULL; and you can check it:

> all.vars(eval(model$call[[2]]))
[1] "density" "percent" "cycle"  

But there is a similar place -- https://github.com/rvlenth/emmeans/blob/48ab8009c1363c3fd22293b49f1841e61bb789ed/R/helpers.R#L419

... where an extra eval is needed. With that change, which I will push up soon, it works correctly.

Thanks for reporting this.

rvlenth commented 2 weeks ago

Closing as I think this is resolved. BTW, the bug is actually in the Satterthwaite code, so if you had used mode = "asymp" or mode = "appx-satt", you wouldn't see this bug.