ShanaScogin / BayesPostEst

An R package to generate and plot postestimation quantities after estimating Bayesian regression models using MCMC
https://shanascogin.github.io/BayesPostEst/
GNU General Public License v3.0

long-term goals: mcmcAveProb, mcmcObsProb, mcmcFD for multilevel models #28

Open jkarreth opened 5 years ago

jkarreth commented 5 years ago

write function to easily incorporate level-2 groupings

ShanaScogin commented 4 years ago

@jkarreth are there good readings on this you'd recommend? I've been wondering how to do the observed-value (OV) method in particular for logit MLMs (or even whether the OV method is possible for a population average, etc.)

jkarreth commented 4 years ago

The intuition for the OV approach is the same as what we do so far, except that we need to make predictions for groups and then average them.
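A minimal sketch of that idea, written in plain Python for illustration rather than as package code. It assumes a logit model with varying intercepts only, and every name in it (`observed_value_prob`, the `"beta"`/`"u"` layout of a draw, the toy data) is hypothetical. Weighting groups equally when averaging is also an assumption; averaging over all observations at once would be a defensible alternative.

```python
import math


def inv_logit(x):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def observed_value_prob(draws, rows, x_name, x_value):
    """Per-draw average predicted probability with x_name fixed at x_value.

    draws: list of dicts with keys "beta" (name -> coefficient, including
           "(Intercept)") and "u" (group label -> varying intercept).
    rows:  observed data, one dict per case, with key "group" plus one key
           per predictor named in "beta".
    """
    result = []
    for draw in draws:
        by_group = {}
        for row in rows:
            # Linear predictor: overall intercept plus this group's offset.
            eta = draw["beta"]["(Intercept)"] + draw["u"][row["group"]]
            for name, coef in draw["beta"].items():
                if name == "(Intercept)":
                    continue
                # Fix the predictor of interest; keep the rest as observed.
                value = x_value if name == x_name else row[name]
                eta += coef * value
            by_group.setdefault(row["group"], []).append(inv_logit(eta))
        # Average within each group first, then across groups (assumption).
        group_means = [sum(p) / len(p) for p in by_group.values()]
        result.append(sum(group_means) / len(group_means))
    return result


# Toy posterior draws and data, made up for illustration.
draws = [
    {"beta": {"(Intercept)": -0.5, "x": 1.0, "z": 0.3}, "u": {"A": 0.8, "B": -0.8}},
    {"beta": {"(Intercept)": -0.4, "x": 1.2, "z": 0.2}, "u": {"A": 0.6, "B": -0.9}},
]
rows = [
    {"group": "A", "x": 0, "z": 1.0},
    {"group": "A", "x": 1, "z": 0.5},
    {"group": "B", "x": 1, "z": -1.0},
]

p0 = observed_value_prob(draws, rows, "x", 0)
p1 = observed_value_prob(draws, rows, "x", 1)
first_difference = [b - a for a, b in zip(p0, p1)]  # one value per draw
```

Each returned list has one entry per posterior draw, so it can be summarized with a median and credible interval in the usual way.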

There are a few descriptions on how to do this online. You might have already found them:

  1. UCLA R Data Analysis Examples: Mixed effects logistic regression -> section on "Predicted probabilities and graphing"
  2. Skrondal & Rabe-Hesketh 2009 (link)
  3. Pavlou et al. 2015 (link)

We can also take a look at the code for first difference that I had in the workshop (Day 16), as it gets at this to a degree. In that code, I generated average cases in each country.

ShanaScogin commented 3 years ago

Right now the required objects for mcmcObsProb and mcmcPredProb are pretty clunky. If I remember correctly, though, changing this was a problem because we don't know how the variables will be named in the model object. Just a thought as I'm looking at this again, but I wonder if we can make the required arguments more intuitive as we update these for MLMs.

jkarreth commented 3 years ago

Thinking out loud for a second: if I compare this to more convenient functions for similar quantities (e.g. the effects, margins, or easystats packages), those packages all process model objects. It's easy to refer to specific parameters if the model object is known in advance.

But because we work with simply a matrix of posterior draws, we don't have this advantage. And I think we want to keep the package as general as possible, i.e. not depend on the user using rstanarm or brms or MCMCpack.

I wonder if we can somehow work in a step (under the hood, in the predicted probability function itself) that names the columns of the posterior draws matrix based on the model equation.

For rstanarm objects and brms model objects, that's easy.

For JAGS and Stan less so.
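A rough illustration of the column-naming step, in plain Python rather than package code. It assumes a purely additive formula (no interactions, transformations, or factor expansion) and, more importantly, that the columns of the draws matrix are in the same order as the terms in the formula. That ordering assumption is exactly what is hard to guarantee for hand-written JAGS or Stan models. The function names are hypothetical.

```python
def terms_from_formula(formula):
    """Coefficient names for an additive formula such as 'y ~ x1 + x2'."""
    _, rhs = formula.split("~", 1)
    terms = [t.strip() for t in rhs.split("+") if t.strip()]
    # '0' suppresses the intercept; a literal '1' is the intercept itself.
    has_intercept = "0" not in terms
    predictors = [t for t in terms if t not in ("0", "1")]
    return (["(Intercept)"] if has_intercept else []) + predictors


def name_draw_columns(formula, draws):
    """Attach formula-based names to each row of a posterior draws matrix.

    draws: list of rows, each row a list of coefficient values for one draw.
    Assumes column order matches the order of terms in the formula.
    """
    names = terms_from_formula(formula)
    for row in draws:
        if len(row) != len(names):
            raise ValueError(
                f"expected {len(names)} columns for {names}, got {len(row)}"
            )
    return [dict(zip(names, row)) for row in draws]


# Toy draws matrix, made up for illustration: intercept, age, female.
names = terms_from_formula("vote ~ age + female")
named = name_draw_columns("vote ~ age + female", [[-1.2, 0.03, 0.4],
                                                  [-1.1, 0.02, 0.5]])
```

With named columns, a prediction function could take variable names instead of column positions, which is what would make the required arguments less clunky.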

I'll keep thinking about this. You're definitely right that this is a major challenge to implementing this for MLMs, because we want to distinguish between "fixed effects" (estimates for regression predictors across all groups) and varying intercepts/slopes.

ShanaScogin commented 3 years ago

That makes perfect sense - I love the idea of including the model equation. I wonder - should I start on a function for rstanarm/brms/MCMCpack and if something comes to you for jags, etc we can fold it in?

jkarreth commented 3 years ago

> should I start on a function for rstanarm/brms

I think that would be cool (if you have bandwidth to do it). Since rstanarm & brms are the most frequently used tools today, I think it might make most sense to focus on them. Both have the model formula as part of the model object, so working along the effects, margins, or easystats examples might offer some way forward?

I'm tied up with other things, would otherwise be happy to help now - but will probably have to wait with contributing.

jkarreth commented 3 years ago

Somewhat relatedly, once we get to the MLM part, I think it'll be good to revisit your initial question at the beginning of this thread (perhaps via Pavlou et al. 2015 (link)). I just had that question come up in the discussion of a paper, so thinking through it should be beneficial.

ShanaScogin commented 3 years ago

Great! I'll take a look at effects, margins, and easystats for sure - that sounds perfect. I am not 100% sure about bandwidth, but I have been wanting to use mcmcObsProb for one of my mlm side projects, so I am hoping to work on both at the same time. (The project got sidelined while I reworked my dissertation bc of covid, but I think I'm getting back into the game.) I'm not going to be moving quickly at any rate, so I'll keep you updated! Thanks for all the materials!

jkarreth commented 3 years ago

Sounds good! If needed, and as a quick fix perhaps, what I've done so far is the approach that Pavlou et al. 2015 (link) critique: set all "random effects" (varying intercepts) to 0 and build predictions just based on the "fixed effects", ignoring the between-group differences.
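To make the difference concrete, here is a small Python sketch (illustrative only, not package code) that contrasts the quick fix with one alternative: averaging the predicted probability over the distribution of varying intercepts. It assumes a logit model with intercepts distributed Normal(0, sigma_u) and uses simple Monte Carlo integration; the function names and the example values are hypothetical.

```python
import math
import random


def inv_logit(x):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def conditional_prob(eta_fixed):
    """Quick fix: varying intercept set to 0, "fixed effects" only."""
    return inv_logit(eta_fixed)


def marginal_prob(eta_fixed, sigma_u, n_sim=20000, seed=1):
    """Monte Carlo average over varying intercepts u ~ Normal(0, sigma_u)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        total += inv_logit(eta_fixed + rng.gauss(0.0, sigma_u))
    return total / n_sim


# Made-up values for a single posterior draw.
eta_fixed = 1.5   # linear predictor from the "fixed effects"
sigma_u = 2.0     # standard deviation of the varying intercepts

p_conditional = conditional_prob(eta_fixed)
p_marginal = marginal_prob(eta_fixed, sigma_u)
```

Because the inverse logit is nonlinear, the two quantities differ whenever sigma_u is above 0: the averaged probability is pulled toward 0.5 relative to the u = 0 prediction. In a full analysis this would be repeated for every posterior draw, using that draw's own sigma_u.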

ShanaScogin commented 3 years ago

This reading is super great - thanks!

edit: Ok, so I still think it might be good to offer options for the two approaches in Pavlou et al. 2015. (Also found this, which seems great: https://www.statalist.org/forums/forum/general-stata-discussion/general/1304704-cannot-estimate-marginal-effect-after-xtlogit)

edit2: saving this as something to refer to: https://www.rdocumentation.org/packages/glmmTMB/versions/1.0.2.1/topics/predict.glmmTMB

ShanaScogin commented 3 years ago

Saving here for later reference: https://www.cambridge.org/core/journals/political-analysis/article/understanding-choosing-and-unifying-multilevel-and-fixed-effect-approaches/8101D49CFD3B129F5753FC878F416980

jkarreth commented 2 years ago

This great blog post by @andrewheiss does provide some shortcuts to some (but not all) of what we had in mind here.