Is there support for `nnet::multinom` when the LHS of the formula is a matrix of counts?

TylerSagendorf commented 10 months ago

Is it possible to use emmeans with nnet::multinom when the LHS of the formula passed to multinom is a matrix of counts? For example:

library(nnet)
library(emmeans)

# col1, col2, and col3 are columns of counts
# more specifically, this is compositional data, so the response columns are correlated.
fit1 <- multinom(cbind(col1, col2, col3, col4) ~ (x1 + x2 + x3)^3, ...)

emmeans(fit1, ...) # this fails

I currently find myself in this situation, though I can not show the data I am working with. It appears that this situation is not supported, based on what I could find in vignette("models", "emmeans"). If I'm not mistaken, are there plans to support this in the future?

rvlenth commented 10 months ago

Still, you could have provided a reproducible toy example... But here's one:

trt = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), class = "factor", levels = c("1", 
"2", "3", "4"))
resp = structure(c(1, 29, 8, 38, 19, 25, 14, 14, 28, 23, 29, 13, 28, 
13, 37, 24, 30, 15, 13, 18, 34, 12, 24, 15, 16, 25, 22, 26, 28, 
22, 17, 31, 26, 21, 26, 54, 27, 39, 22, 32, 7, 55, 31, 33, 12, 
30, 42, 16), dim = c(12L, 4L))

library(nnet)
mod = multinom(resp ~ trt)

# for comparison...
example(multinom)   # produces fitted model 'bwt.mu', output not shown

The existing code relies on the lev member:

> bwt.mu$lev
[1] "0" "1"
> mod$lev
NULL

What we see is that this member is NULL when the response is not a factor. I made a simple change in the code so that if $lev is NULL, we use a numeric sequence of the right length. Then everything works:

> ref_grid(mod)
'emmGrid' object with variables:
    trt = 1, 2, 3, 4
    resp = multivariate response levels: 1, 2, 3, 4

The next update will have this fix. In the meantime, you can get it to work by sourcing the following function into your workspace:

emm_basis.multinom = with(asNamespace("emmeans"),
        function(object, trms, xlev, grid, 
                              mode = c("prob", "latent"), ...) {
    mode = match.arg(mode)
    bhat = t(coef(object))
    V = .my.vcov(object, ...)
    # NOTE: entries in vcov(object) come out in same order as
    # in as.numeric(bhat), even though latter has been transposed
    k = ifelse(is.matrix(coef(object)), ncol(bhat), 1)
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    # recenter for latent predictions
    pat = (rbind(0, diag(k + 1, k)) - 1) / (k + 1)
    X = kronecker(pat, X)
    nbasis = estimability::all.estble
    nbasis = kronecker(rep(1,k), nbasis)
    misc = list(tran = "log", inv.lbl = "e^y")
    dfargs = list(df = object$edf)
    dffun = function(k, dfargs) dfargs$df
    if(is.null(ylevs <- object$lev))
        ylevs = seq_len(k + 1)
    ylevs = list(class = ylevs)
    if (is.null(ylevs)) ylevs = list(class = seq_len(k))
    names(ylevs) = as.character.default(eval(object$call$formula, environment(trms))[[2]])
    misc$ylevs = ylevs
    if (mode == "prob")
        misc$postGridHook = .multinom.postGrid
    list(X = X, bhat = as.numeric(bhat), nbasis = nbasis, V = V, 
         dffun = dffun, dfargs = dfargs, misc = misc)
})

rvlenth commented 10 months ago

PS I guess this might still be flaky with your example, if done literally, because it uses the LHS of the model formula to name the multivariate response. I suggest you don't do that, and instead save the response as a matrix.

TylerSagendorf commented 10 months ago

I apologize for not providing a simple example to illustrate the issue. Thank you for taking the time to answer my question!

I sourced the code you provided and it worked fine. If I use the toy data, for example, I specify that I want estimated probabilities (default mode) for each combination of trt and resp.

> (em <- emmeans(mod, ~ trt | resp, mode = "prob"))
resp = 1:
 trt  prob     SE df lower.CL upper.CL
 1   0.142 0.0214 12   0.0957    0.189
 2   0.255 0.0243 12   0.2024    0.308
 3   0.207 0.0246 12   0.1531    0.260
 4   0.213 0.0234 12   0.1620    0.264

resp = 2:
 trt  prob     SE df lower.CL upper.CL
 1   0.292 0.0278 12   0.2315    0.353
 2   0.215 0.0229 12   0.1650    0.265
 3   0.240 0.0259 12   0.1833    0.296
 4   0.167 0.0214 12   0.1207    0.214

resp = 3:
 trt  prob     SE df lower.CL upper.CL
 1   0.236 0.0260 12   0.1793    0.293
 2   0.237 0.0237 12   0.1851    0.288
 3   0.273 0.0271 12   0.2141    0.332
 4   0.331 0.0269 12   0.2724    0.390

resp = 4:
 trt  prob     SE df lower.CL upper.CL
 1   0.330 0.0288 12   0.2669    0.392
 2   0.293 0.0254 12   0.2375    0.348
 3   0.280 0.0273 12   0.2210    0.340
 4   0.289 0.0259 12   0.2320    0.345

Confidence level used: 0.95

If I were then interested in testing whether trt 2, 3, or 4 were different from trt 1, for example, I could set up treatment vs. control comparisons with contrast. However, these results are unexpected, since it calculates differences between probabilities (e.g., 0.255 - 0.142 = 0.113).

> contrast(em, method = "trt.vs.ctrl")
resp = 1:
 contrast     estimate     SE df t.ratio p.value
 trt2 - trt1  0.113130 0.0324 12   3.492  0.0121
 trt3 - trt1  0.064320 0.0326 12   1.974  0.1743
 trt4 - trt1  0.070792 0.0317 12   2.231  0.1142
.
.
.

This is not an issue when I set up contrasts for a logistic regression model made with glm, as it reports an odds.ratio column. In order to get results in terms of odds ratios, is the following approach what you would recommend?

> em2 <- emmeans(mod, ~ trt | resp, mode = "prob")
> em2 <- regrid(em2, transform = "logit")
> contrast(em2, method = "trt.vs.ctrl", type = "response")
resp = 1:
 contrast    odds.ratio     SE df null t.ratio p.value
 trt2 / trt1      2.068 0.4485 12    1   3.348  0.0157
 trt3 / trt1      1.570 0.3620 12    1   1.955  0.1796
 trt4 / trt1      1.632 0.3658 12    1   2.186  0.1232
.
.
.
P value adjustment: dunnettx method for 3 tests 
Tests are performed on the log odds ratio scale

NOTE: if resp was only two columns, I could perform logistic regression with glm(..., family = binomial) and compare it to the multinom equivalent. The results are identical if I use the regrid approach demonstrated above, aside from the degrees of freedom, confidence intervals, and associated p-values (CIs are much more narrow for multinom models), which I assume has some theoretical underpinnings that I am yet unfamiliar with.

rvlenth commented 10 months ago

Yes. Regridding to the logit scale is exactly what I'd recommend if you want odds ratios.

Russ

rvlenth commented 9 months ago

Closing this issue as completed

rvlenth / emmeans

Is there support for `nnet::multinom` when the LHS of the formula is a matrix of counts? #439