MUCollective / multiverse

R package for creating explorable multiverse analysis
https://mucollective.github.io/multiverse/
GNU General Public License v3.0

create statistical summaries of multiverses (e.g. a Bayesian hierarchical model, spec curve p value) #46

Open mjskay opened 5 years ago

mjskay commented 5 years ago

Some options:

To be resolved: how does this square philosophically with the whole idea of a multiverse in the first place? Need to be very careful about this. If a multiverse is about acknowledging ontological uncertainty (and about having conversations about it through the literature), how does reducing it back down to a single estimate (or p value) square with that? @dragice thoughts?

ntaback commented 5 years ago

This is from the Steegen et al. paper (pg. 710). As part of one of the vignettes we can average the p-values for different parameters and maybe add a confidence interval to the average.

The multiverse analysis does not produce a single value summarizing the evidential value of the data, nor does it imply a threshold for an effect to reach to be declared robustly significant. Nevertheless, one might try to summarize the multiverse analysis more formally. One reasonable first step is to simply average the p values in the multiverse, in this case averaging all the numbers displayed in Figure 1 or 2. This mean value can be considered as the p value of a hypothetical preregistered study with conditions chosen at random among the possibilities in the multiverse and seems like a fair measurement in a setting where all of the possible data processing choices seem plausible (as in the example presented here, where the different options are drawn from other papers in the relevant literature).
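
A rough sketch of what that vignette step could look like (not an existing package function): it assumes a hypothetical data frame `spec_results` with one row per universe and a numeric `p.value` column extracted from the fitted models, averages the p values as in the quote, and attaches a simple bootstrap interval for the mean.

```r
# Average the per-universe p values (Steegen et al.'s "reasonable first step")
# and attach a percentile bootstrap interval for the mean.
# Assumes `spec_results` has one row per universe and a numeric `p.value` column.
summarise_pvalues <- function(spec_results, n_boot = 1000) {
  p <- spec_results$p.value
  boot_means <- replicate(n_boot, mean(sample(p, replace = TRUE)))
  data.frame(
    mean_p      = mean(p),
    ci_low      = unname(quantile(boot_means, 0.025)),
    ci_high     = unname(quantile(boot_means, 0.975)),
    n_universes = length(p)
  )
}
```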

ntaback commented 5 years ago

The quote about hierarchical models is (same page as above):

In a more complete analysis, the multiverse of data sets could be crossed with the multiverse of models to further reveal the multiverse of statistical results ... this motivates encompassing analyses of multiple predictors, interactions, or outcomes in a hierarchical model so as to reduce problems of multiple comparisons (Gelman, Hill, & Yajima, 2012).

Constructing this type of vignette could be interesting.
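
One hedged sketch of what such a hierarchical summary might look like, not something the package currently does: a meta-analytic partial-pooling model over per-universe estimates fit with brms. The data frame `spec_results` with hypothetical columns `estimate`, `std.error`, and `universe` stands in for results extracted from each universe.

```r
# Partially pool per-universe estimates with a hierarchical (meta-analytic
# style) model: known per-universe standard errors, random intercept per
# universe, population-level intercept as the pooled effect.
library(brms)

fit <- brm(
  estimate | se(std.error) ~ 1 + (1 | universe),
  data   = spec_results,
  family = gaussian()
)
summary(fit)  # the population-level intercept is the pooled effect
```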

mjskay commented 5 years ago

I'm adding p-boxes to this list: probability boxes, i.e. lower and upper bounds on a CDF. Perhaps the "envelope" method could be used to summarize posterior distributions from Bayesian models, for example (see this paper: https://hal.inria.fr/hal-01518666).

[image attachment]
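
A minimal sketch of the envelope idea, under the assumption that `draws` is a hypothetical data frame with a `universe` id and a column `theta` of posterior draws for the parameter of interest: evaluate each universe's empirical CDF on a common grid, then take pointwise lower and upper bounds to form the p-box.

```r
# Pointwise envelope (p-box) over per-universe posterior CDFs.
pbox <- function(draws, n_grid = 200) {
  grid <- seq(min(draws$theta), max(draws$theta), length.out = n_grid)
  # One empirical CDF per universe, evaluated on the common grid
  cdf_mat <- sapply(split(draws$theta, draws$universe),
                    function(th) ecdf(th)(grid))
  data.frame(
    x     = grid,
    lower = apply(cdf_mat, 1, min),  # lower envelope of the CDFs
    upper = apply(cdf_mat, 1, max)   # upper envelope of the CDFs
  )
}
```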

mjskay commented 5 years ago

And another computational approach to multiverse summaries I'm adding to the list: https://journals.sagepub.com/doi/10.1177/0049124115610347

(NB I haven't read this yet)

mjskay commented 4 years ago

Also, if we're thinking about model comparison, we need to be careful to use valid metrics. My understanding is you'd have to limit comparison to models fit on the same data (so no multiverses with outlier removal), and then you'd want a metric that allows valid comparison when variables have been transformed (like scaled CRPS: https://t.co/JfOFMVi8Pz)
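
A hedged sketch of the simpler, unscaled version of this idea (not the scaled CRPS from the linked paper, and not a package API): score each universe's posterior predictive draws against the shared observations with scoringRules, assuming a hypothetical list `pp_draws` of (n_obs × n_draws) matrices, one per universe, and a common outcome vector `y`.

```r
# Mean sample-based CRPS per universe; only valid if every universe was fit
# to the same observations `y`.
library(scoringRules)

universe_crps <- sapply(pp_draws, function(draws) {
  mean(crps_sample(y = y, dat = draws))
})

# Crude normalization so universes with transformed outcomes are roughly
# comparable; the scaled CRPS in the linked paper is a different, more
# principled construction.
universe_crps / sd(y)
```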