R package emmeans: Estimated marginal means

Website

https://rvlenth.github.io/emmeans/

Features

Estimated marginal means (EMMs, also known as least-squares means in the context of traditional regression models) are derived by using a model to make predictions over a regular grid of predictor combinations (called a reference grid). These predictions may then be averaged (typically with equal weights) over one or more of the predictors. Such marginally averaged predictions are useful for describing the results of fitting a model, particularly for presenting the effects of factors. The emmeans package can easily produce these results, as well as various graphs of them (interaction-style plots and side-by-side intervals).
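To make the definition concrete, here is a minimal sketch in base R that builds a reference grid by hand and averages its predictions. It deliberately does not use emmeans. The warpbreaks dataset, the model formula, and the names fit, grid, and emm_tension are illustrative choices, not part of the package.

```r
# Fit a two-factor model to the built-in warpbreaks data (illustrative choice).
fit <- lm(breaks ~ wool * tension, data = warpbreaks)

# Reference grid: every combination of the factor levels.
grid <- expand.grid(
  wool    = levels(warpbreaks$wool),
  tension = levels(warpbreaks$tension)
)

# Model predictions at each point of the grid.
grid$prediction <- predict(fit, newdata = grid)

# EMMs for tension: average the grid predictions over wool, with equal weights.
emm_tension <- tapply(grid$prediction, grid$tension, mean)
print(emm_tension)
```

Because these data are balanced and the model includes the interaction, the hand-computed values coincide with the raw means of breaks for each tension level. With emmeans installed, emmeans(fit, ~ tension) should report the same estimates, along with standard errors, confidence limits, and annotations.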

Model support

Versions and installation

remotes::install_github("rvlenth/emmeans", dependencies = TRUE, build_vignettes = TRUE)

Omitting the build_vignettes argument saves some time if you don't need the vignettes locally. They are always available online, either for the latest CRAN version or, perhaps more up to date, on the emmeans site.

Note: For the latest release notes on this development version, see the NEWS file.

Rounding

For its summary output, emmeans uses an optimal-digits algorithm that rounds results to about the number of digits that are useful, relative to the estimates' confidence limits. This avoids cluttering the output, but it differs from other R results, which typically show more digits. If this annoys you, there is an option (opt.digits = FALSE) that disables the optimal-digits routine.
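As a brief sketch of how that option can be switched, the following assumes emmeans is installed and uses its emm_options() and get_emm_option() functions. The warpbreaks model is only an illustrative example.

```r
library(emmeans)  # assumes the emmeans package is installed

# Illustrative model on a built-in dataset.
fit <- lm(breaks ~ wool * tension, data = warpbreaks)
emm <- emmeans(fit, ~ tension)

print(emm)                        # default display: optimal-digits rounding

emm_options(opt.digits = FALSE)   # disable the optimal-digits routine
print(emm)                        # display now follows R's general digits setting

emm_options(opt.digits = TRUE)    # restore the default behaviour
get_emm_option("opt.digits")      # check the current setting
```

The setting applies to the current R session, so restoring it afterwards keeps later output consistent with the package defaults.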

"Tidiness" can be dangerous

I see more and more users who are in a terrible hurry to get results. They develop a "workflow" where they plan out several steps at once and pipe them together. That's useful when you don't have to think about what happens in those steps; but when you're doing statistics, you should be thinking! Most functions in the emmeans package yield results that are accompanied by annotations such as transformations involved, P-value adjustments made, the families for those adjustments, etc. If you just pipe the results into some more code, you never see those annotations.

Please slow down! Look at the actual results from each emmeans package function without any post-processing -- none at all. That way, you'll see the annotated summaries. Statistics is pretty hard stuff. Don't make it harder by blindfolding yourself.

Supersession plan

The developer of emmeans continues to maintain it and occasionally adds new features. However, none of us is immortal; and neither is software. I have thought of trying to find a co-maintainer who could carry the ball once I am gone or have lost interest, but the flip side of that is that the codebase is not getting less messy as time goes on -- why impose that on someone else? So my thought now is that if, at some point, enough active R developers want the capabilities of emmeans but I am no longer in the picture, they should feel free to supersede it with some other package that does it better. All of the code is publicly available on GitHub, so just take what is useful and replace what is not.

Note: emmeans supersedes the package lsmeans. The latter is just a front end for emmeans, and in fact, the lsmeans() function itself is part of emmeans.