Closed TylerSagendorf closed 9 months ago
Still, you could have provided a reproducible toy example... But here's one:
trt = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), class = "factor", levels = c("1",
"2", "3", "4"))
resp = structure(c(1, 29, 8, 38, 19, 25, 14, 14, 28, 23, 29, 13, 28,
13, 37, 24, 30, 15, 13, 18, 34, 12, 24, 15, 16, 25, 22, 26, 28,
22, 17, 31, 26, 21, 26, 54, 27, 39, 22, 32, 7, 55, 31, 33, 12,
30, 42, 16), dim = c(12L, 4L))
library(nnet)
mod = multinom(resp ~ trt)
# for comparison...
example(multinom) # produces fitted model 'bwt.mu', output not shown
The existing code relies on the lev
member:
> bwt.mu$lev
[1] "0" "1"
> mod$lev
NULL
What we see is that this member is NULL when the response is not a factor. I made a simple change in the code so that if $lev
is NULL
, we use a numeric sequence of the right length. Then everything works:
> ref_grid(mod)
'emmGrid' object with variables:
trt = 1, 2, 3, 4
resp = multivariate response levels: 1, 2, 3, 4
The next update will have this fix. In the meantime, you can get it to work by sourcing the following function into your workspace:
emm_basis.multinom = with(asNamespace("emmeans"),
function(object, trms, xlev, grid,
mode = c("prob", "latent"), ...) {
mode = match.arg(mode)
bhat = t(coef(object))
V = .my.vcov(object, ...)
# NOTE: entries in vcov(object) come out in same order as
# in as.numeric(bhat), even though latter has been transposed
k = ifelse(is.matrix(coef(object)), ncol(bhat), 1)
m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
X = model.matrix(trms, m, contrasts.arg = object$contrasts)
# recenter for latent predictions
pat = (rbind(0, diag(k + 1, k)) - 1) / (k + 1)
X = kronecker(pat, X)
nbasis = estimability::all.estble
nbasis = kronecker(rep(1,k), nbasis)
misc = list(tran = "log", inv.lbl = "e^y")
dfargs = list(df = object$edf)
dffun = function(k, dfargs) dfargs$df
if(is.null(ylevs <- object$lev))
ylevs = seq_len(k + 1)
ylevs = list(class = ylevs)
if (is.null(ylevs)) ylevs = list(class = seq_len(k))
names(ylevs) = as.character.default(eval(object$call$formula, environment(trms))[[2]])
misc$ylevs = ylevs
if (mode == "prob")
misc$postGridHook = .multinom.postGrid
list(X = X, bhat = as.numeric(bhat), nbasis = nbasis, V = V,
dffun = dffun, dfargs = dfargs, misc = misc)
})
PS I guess this might still be flaky with your example, if done literally, because it uses the LHS of the model formula to name the multivariate response. I suggest you don't do that, and instead save the response as a matrix.
I apologize for not providing a simple example to illustrate the issue. Thank you for taking the time to answer my question!
I sourced the code you provided and it worked fine. If I use the toy data, for example, I specify that I want estimated probabilities (default mode
) for each combination of trt
and resp
.
> (em <- emmeans(mod, ~ trt | resp, mode = "prob"))
resp = 1:
trt prob SE df lower.CL upper.CL
1 0.142 0.0214 12 0.0957 0.189
2 0.255 0.0243 12 0.2024 0.308
3 0.207 0.0246 12 0.1531 0.260
4 0.213 0.0234 12 0.1620 0.264
resp = 2:
trt prob SE df lower.CL upper.CL
1 0.292 0.0278 12 0.2315 0.353
2 0.215 0.0229 12 0.1650 0.265
3 0.240 0.0259 12 0.1833 0.296
4 0.167 0.0214 12 0.1207 0.214
resp = 3:
trt prob SE df lower.CL upper.CL
1 0.236 0.0260 12 0.1793 0.293
2 0.237 0.0237 12 0.1851 0.288
3 0.273 0.0271 12 0.2141 0.332
4 0.331 0.0269 12 0.2724 0.390
resp = 4:
trt prob SE df lower.CL upper.CL
1 0.330 0.0288 12 0.2669 0.392
2 0.293 0.0254 12 0.2375 0.348
3 0.280 0.0273 12 0.2210 0.340
4 0.289 0.0259 12 0.2320 0.345
Confidence level used: 0.95
If I were then interested in testing whether trt
2, 3, or 4 were different from trt
1, for example, I could set up treatment vs. control comparisons with contrast
. However, these results are unexpected, since it calculates differences between probabilities (e.g., 0.255 - 0.142 = 0.113).
> contrast(em, method = "trt.vs.ctrl")
resp = 1:
contrast estimate SE df t.ratio p.value
trt2 - trt1 0.113130 0.0324 12 3.492 0.0121
trt3 - trt1 0.064320 0.0326 12 1.974 0.1743
trt4 - trt1 0.070792 0.0317 12 2.231 0.1142
.
.
.
This is not an issue when I set up contrasts for a logistic regression model made with glm
, as it reports an odds.ratio
column. In order to get results in terms of odds ratios, is the following approach what you would recommend?
> em2 <- emmeans(mod, ~ trt | resp, mode = "prob")
> em2 <- regrid(em2, transform = "logit")
> contrast(em2, method = "trt.vs.ctrl", type = "response")
resp = 1:
contrast odds.ratio SE df null t.ratio p.value
trt2 / trt1 2.068 0.4485 12 1 3.348 0.0157
trt3 / trt1 1.570 0.3620 12 1 1.955 0.1796
trt4 / trt1 1.632 0.3658 12 1 2.186 0.1232
.
.
.
P value adjustment: dunnettx method for 3 tests
Tests are performed on the log odds ratio scale
NOTE: if resp
was only two columns, I could perform logistic regression with glm(..., family = binomial)
and compare it to the multinom
equivalent. The results are identical if I use the regrid
approach demonstrated above, aside from the degrees of freedom, confidence intervals, and associated p-values (CIs are much more narrow for multinom
models), which I assume has some theoretical underpinnings that I am yet unfamiliar with.
Yes. Regridding to the logit scale is exactly what I'd recommend if you want odds ratios.
Russ
Closing this issue as completed
Is it possible to use
emmeans
withnnet::multinom
when the LHS of the formula passed tomultinom
is a matrix of counts? For example:I currently find myself in this situation, though I can not show the data I am working with. It appears that this situation is not supported, based on what I could find in
vignette("models", "emmeans")
. If I'm not mistaken, are there plans to support this in the future?