rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/
341 stars 30 forks source link

Emmeans for multiple imputed multinom models #405

Closed samuelsaari closed 1 year ago

samuelsaari commented 1 year ago

It is possible to get predicted values for models done with:

  1. base lm function
  2. multiple imputed results from lm function
  3. nnet's multinom function
  4. But it seems like a support for multiple imputed nnet multinom is not supported.

If this is the case, would like to make a feature request. If this is possible, would love to hear how to do that.

Below some examples to make the point clearer

library(emmeans)
library(lavaan)
library(mice)
library(tidyverse)
library(nnet)

imp_data <- mice(nhanes,m=3,maxit=1) # adjust methods

at_age <- list(age=1:3)
# 2 lm with imputations
(m2 <- with(imp_data, lm(bmi ~age )))
(e2 <- emmeans(m2,~age,at=at_age))

# 3 multinom with original data
(m3 <- multinom(bmi ~age, data = nhanes))
(e3 <- emmeans(m3,~age,at=at_age))

#4  multinom with imputations
(m4 <- with(imp_data, multinom(bmi ~age)))
(e4 <- emmeans(m4,~age,at=at_age))
class(m4)
# [1] "mira"   "matrix"

(1), 2 and 3 work just fine, but e4 produces the error:

Error in UseMethod("recover_data") : 
  no applicable method for 'recover_data' applied to an object of class "c('multinom', 'nnet')"
Error in (function (object, at, cov.reduce = mean, cov.keep = get_emm_option("cov.keep"),  : 
  Perhaps a 'data' or 'params' argument is needed

My guess is that while mira objectets are supported, emmeans would not be able to read them, if the underlying model is nnet's multinom.

But happy to hear if I am just missing something. In any case, this would be a nice feature!

rvlenth commented 1 year ago

Hmmm, well it works for me, after a fashion:

> (e4 <- emmeans(m4,~age,at=at_age))
 age   prob            SE df lower.CL upper.CL
   1 0.0625 1.9952581e-10 30   0.0625   0.0625
   2 0.0625           NaN 30      NaN      NaN
   3 0.0625           NaN 30      NaN      NaN

Results are averaged over the levels of: bmi 
Confidence level used: 0.95 
Warning messages:
1: In .qf.non0(object@V, x) : Negative variance estimate obtained!
2: In .qf.non0(object@V, x) : Negative variance estimate obtained!

I get similar warnings with m3 as well, because the estimates are so ill-conditioned.

Anyway, right now I don't see that there's an issue to address. I think I have to leave the code that produces the above warnings in place, as we did obtain negative variance estimates, even though they are basically zero.

rvlenth commented 1 year ago

Am closing this, as I have not seen a reason yo keep it open.

samuelsaari commented 1 year ago

This seems to be due to the fact that the HPC singulary container where I use R(studio) has an older version of R and emmeans and cannot be updated to a later version than 1-7-5 before an update on the server. Could make this work on Rstudio cloud - and on my personal computer after reinstalling everything R related.

Will return to this either if needed but seems that this was an update issue.

samuelsaari commented 11 months ago

UPDATE: Now that the singularity container has been updated to R 4.3.0 (emmeans version 1.8.6), that MVE above works.

However my actual code does not work (same error message as above), unless I run the MVE above first! Altough I cannot locate exactly why the problem arises, it seems to do with commands of the same name from different packages. Loading the packages in different order or using the conflicted package to make sure that there are no command conflictsin my own code did not help.

My hypothesis is that somehow running those emmeans commands in the MVE first before running my own code will make R take priority for the dependency packages that emmeans is using under the hood.

Cannot reproduce the issue here as that would involve hundreds of lines of code. Would still appreciate to hear your best guesses why this might be happening and checking that emmeans excplicitly uses commands from specific packages if there are any dependencies [ eg. if using filter from dplyr, is it decleared as dplyr::filter as opposed to just 'filter' as stats::filter is also availabel]. The conflicted package or a similar workflow might be of great help here, too.

rvlenth commented 11 months ago

FYI, I am traveling, and it will be some time before I can look at this.

samuelsaari commented 11 months ago

OK, cool.

For your reference, here are the packages I load and masked commands. Hope it will be of help to you.

I suspect that the issue is either with dplyr::filter, purrr::map or an object test in emmeans as this is shown after loading mice and devtools:

The following object is masked from ‘package:emmeans’:

    test

Here are are the packages loaded:


> lapply(libraries_cran, require, character = TRUE)
Loading required package: remotes
Loading required package: ftable
Loading required package: fastDummies
Loading required package: table1

Attaching package: ‘table1’

The following objects are masked from ‘package:base’:

    units, units<-

Loading required package: latex2exp
Loading required package: interactions
Loading required package: TraMineR

TraMineR stable version 2.2-7 (Built: 2023-06-14)
Website: http://traminer.unige.ch
Please type 'citation("TraMineR")' for citation information.

Loading required package: marginaleffects
Loading required package: ggpattern
Loading required package: semTools

###############################################################################
This is semTools 0.5-6
All users of R (or SEM) are invited to submit functions or ideas for functions.
###############################################################################

Attaching package: ‘semTools’

The following object is masked from ‘package:readr’:

    clipboard

Loading required package: semTable
Loading required package: DiagrammeRsvg
Loading required package: rsvg
Linking to librsvg 2.42.7
Loading required package: labelled
Loading required package: psych

Attaching package: ‘psych’

The following objects are masked from ‘package:semTools’:

    reliability, skew

The following objects are masked from ‘package:ggplot2’:

    %+%, alpha

The following object is masked from ‘package:lavaan’:

    cor2cov

Loading required package: psychTools
Loading required package: corrplot
corrplot 0.92 loaded
Loading required package: xtable

Attaching package: ‘xtable’

The following objects are masked from ‘package:table1’:

    label, label<-

Loading required package: doParallel
Loading required package: foreach

Attaching package: ‘foreach’

The following objects are masked from ‘package:purrr’:

    accumulate, when

Loading required package: iterators
Loading required package: parallel
Loading required package: paletteer
Loading required package: extrafont
Registering fonts with R
Loading required package: ggpubr
Loading required package: grid
Loading required package: OpenRepGrid
------------------------------------------------
 OpenRepGrid Version 0.1.12
 Tools for the analysis of repertory grid data
 For an introduction visit: www.openrepgrid.org
 CAUTION: The package is in alpha phase.
          Design changes may still occur.
------------------------------------------------

Attaching package: ‘OpenRepGrid’

The following object is masked from ‘package:psych’:

    distance

The following object is masked from ‘package:purrr’:

    map

Loading required package: car
Loading required package: carData

Attaching package: ‘car’

The following object is masked from ‘package:psych’:

    logit

The following object is masked from ‘package:dplyr’:

    recode

The following object is masked from ‘package:purrr’:

    some

Loading required package: haven
Loading required package: devtools
Loading required package: usethis

Attaching package: ‘usethis’

The following object is masked from ‘package:remotes’:

    git_credentials

Attaching package: ‘devtools’

The following objects are masked from ‘package:remotes’:

    dev_package_deps, install_bioc, install_bitbucket,
    install_cran, install_deps, install_dev, install_git,
    install_github, install_gitlab, install_local,
    install_svn, install_url, install_version,
    update_packages

The following object is masked from ‘package:emmeans’:

    test

Loading required package: flexmix
Loading required package: lattice
Loading required package: tictoc

Attaching package: ‘tictoc’

The following object is masked from ‘package:OpenRepGrid’:

    shift

Loading required package: furrr
Loading required package: future

Attaching package: ‘future’

The following object is masked from ‘package:OpenRepGrid’:

    cluster

Loading required package: snow

Attaching package: ‘snow’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall,
    clusterEvalQ, clusterExport, clusterMap, clusterSplit,
    makeCluster, parApply, parCapply, parLapply, parRapply,
    parSapply, splitIndices, stopCluster

Loading required package: doSNOW
Loading required package: tidyLPA
You can use the function citation('tidyLPA') to create a citation for the use of {tidyLPA}.
Mplus is not installed. Use only package = 'mclust' when calling estimate_profiles().
Loading required package: khroma
Loading required package: glue
Loading required package: microbenchmark
Loading required package: jtools

Attaching package: ‘jtools’

The following object is masked from ‘package:tidyLPA’:

    get_data

The following object is masked from ‘package:OpenRepGrid’:

    center

The following objects are masked from ‘package:interactions’:

    cat_plot, interact_plot, johnson_neyman,
    probe_interaction, sim_slopes

Loading required package: showtext
Loading required package: sysfonts
Loading required package: showtextdb

Attaching package: ‘showtextdb’

The following object is masked from ‘package:extrafont’:

    font_install

Loading required package: kableExtra

Attaching package: ‘kableExtra’

The following object is masked from ‘package:dplyr’:

    group_rows

Loading required package: lavaanPlot
Loading required package: miceadds
* miceadds 3.13-12 (2022-05-30 15:14:07)
Loading required package: mitools
Loading required package: conflicted
rvlenth commented 11 months ago

This is too complicated for me to understand. But I think it all reduces to the usual need to have compatible versions of all packages involved in the desired computations. Make sure that you have updated all the following packages:

> tools::package_dependencies("emmeans", recursive = TRUE)
$emmeans
[1] "estimability" "graphics"     "methods"      "numDeriv"     "stats"        "utils"       
[7] "mvtnorm" 
samuelsaari commented 11 months ago

Yes, I know it is complicated.

I boils down to: A) The MVE that this thread starts with works know so in a way the issue is fixed B) I cannot make the code that I use for my research to work. It should be just a more complicated version of the MVE above: after some data wrangling a nnet::multinom and then predicting values with emmeans::emmeans C) However, if I run the MVE above and only after that my research code, everything works magically.

In my previous post, I have tried to hypothesize why C works and not B, but cannot be sure.

The dependencies should be up-to date, but with a singularity container of a supercomputer that I am using, updating packages reliably is sometimes tricky.

Unless you know where to take it from here, the issue can be closed for now and will get back if I gain any further insights.

rvlenth commented 11 months ago

What does "MVE" mean?

samuelsaari commented 11 months ago

Sorry. MWE, minimal working example or the code snippet I have in the beginning of this thread. https://en.wikipedia.org/wiki/Minimal_reproducible_example https://stackoverflow.com/help/minimal-reproducible-example