marcdotson opened this issue 2 years ago
@marcdotson Do you want us to put links to relevant resources in this issue?
Putting these here for now:
This paper provides a lot of the mathematics behind hierarchical Bayesian conjoint, along with a bit of commentary on sample sizes and interactions. It doesn't directly answer any of our questions, but it may be useful to refer to later. https://webuser.bus.umich.edu/plenk/HB%20Conjoint%20Lenk%20DeSarbo%20Green%20Young%20MS%201996.pdf
While I was initially a bit skeptical about this paper due to the journal it's in, it actually contains a very nice bit on interaction terms in logistic regressions. https://www.jstor.org/stable/353415?seq=10
This paper looks sufficient to answer question 1. It was referenced in the paper above and has a good balance of math and commentary. I haven't given it a full read, and I want to keep looking for resources for the other questions, but be sure to check this out. The only caveat is that it isn't explicitly Bayesian. https://psycnet.apa.org/record/1992-03973-001
Almost-helpful slideshow for Q2. http://www.stat.columbia.edu/~gelman/presentations/interactions.pdf
A brief mention of how care should be taken setting priors for interactions. https://gspp.berkeley.edu/assets/uploads/research/pdf/Hierarchical_Models_for_Causal_Effects.pdf
Some context from Andrew on Q3:
Yes! The general idea is difference-in-differences. This full chapter here explains it https://theeffectbook.net/ch-DifferenceinDifference.html and I have a video lecture about it https://www.youtube.com/watch?v=0v1aE70FhsQ&list=PLS6tnpTr39sHydbEoTK9DkyKV92-uE3r-&index=4 (see this for a bunch of other resources too: https://evalsp23.classes.andrewheiss.com/content/08-content.html and https://evalsp23.classes.andrewheiss.com/example/diff-in-diff.html)
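To make the difference-in-differences logic in those resources concrete, here's a minimal numeric sketch (Python for illustration; the group means are made up):

```python
# Toy difference-in-differences: compare the change in the treated group
# to the change in the control group over the same period.
means = {
    ("treated", "before"): 10.0, ("treated", "after"): 14.0,
    ("control", "before"): 9.0,  ("control", "after"): 11.0,
}

did = (means[("treated", "after")] - means[("treated", "before")]) \
    - (means[("control", "after")] - means[("control", "before")])
# did == 2.0: the treated group's change net of the shared time trend
```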
Regarding specifications of the mnl model per the GLM article mentioned above, there is this one:
P(Yi=1|X) = (1 + exp[−β0 − β1Xi1 − β2Xi2])^−1, or equivalently P(Yi=1|X) = exp[β0 + β1Xi1 + β2Xi2] / (1 + exp[β0 + β1Xi1 + β2Xi2])
and this one:
log[P(Yi=1|X) / P(Yi=0|X)] = β0 + β1Xi1 + β2Xi2 + β3Xi1Xi2
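As a quick sanity check (Python, with arbitrary coefficient values), the two expressions in the first specification are algebraically identical, and taking the log-odds of either recovers the linear predictor:

```python
import math

def p_inverse(b0, b1, b2, x1, x2):
    # First form: P(Y=1|X) = (1 + exp[-b0 - b1*x1 - b2*x2])^-1
    return 1.0 / (1.0 + math.exp(-b0 - b1 * x1 - b2 * x2))

def p_ratio(b0, b1, b2, x1, x2):
    # Second form of the same expression: exp(eta) / (1 + exp(eta))
    eta = b0 + b1 * x1 + b2 * x2
    return math.exp(eta) / (1.0 + math.exp(eta))

b0, b1, b2, x1, x2 = 0.5, -1.2, 0.8, 1.0, 2.0
p = p_inverse(b0, b1, b2, x1, x2)

assert abs(p - p_ratio(b0, b1, b2, x1, x2)) < 1e-12
# The log-odds form recovers the linear predictor: b0 + b1*x1 + b2*x2 = 0.9
assert abs(math.log(p / (1 - p)) - 0.9) < 1e-9
```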
Just to be sure--we are currently specifying the model in the second way, right?
Also, I found this blog post on including intercepts in MNL Stan models; it might be useful.
https://eleafeit.com/posts/2021-05-23-parameterization-of-multinomial-logit-models-in-stan/
@marcdotson did the index-coded flat model ever finish? if so, are the draws available?
Yes, the second one where there is no logit link function on the linear model.
It's a good blog post. It's tangentially relevant, though don't get caught up with effects coding. I reference it a bit in my blog post on contrast equivalence.
The draws are saved in the shared drive as mnl_index.rds, but it never ran without divergent transitions. It's wrong. Maybe the question is how wrong. You can compare it to the mnl draws. I was going to look at specifying the outside good using a separate indicator like we discussed. I just haven't done it yet.
Got it, I'll check it out. While I was poking around for GLM material, I came across QR reparameterization as a possible way to address the sampling problems. It took me a couple of days to figure out, but I think I've got it working. I had to do some work in R to set up the data, as well as create a new Stan file and reconfigure it. All of the changes in R are confined to a chunk at the end of 03_analysis.Rmd, and nothing else has changed. I'm currently rerunning it for the dummy-coded model, since I forgot to save the draws when I ran it yesterday. I'm not sure if you've worked with QR reparameterization before, but it's fairly straightforward. Links to the Stan User's Guide and Michael Betancourt's blog post on the subject are below:
https://mc-stan.org/docs/stan-users-guide/QR-reparameterization.html https://betanalpha.github.io/assets/case_studies/qr_regression.html#1_setting_up_the_rstan_environment
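A small numeric sketch of the idea behind those links (NumPy rather than Stan, with made-up data): decompose the design matrix as X = Q* R*, let the sampler work on theta = R* beta, and recover beta = (R*)^-1 theta afterward:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 3))      # made-up design matrix
beta = np.array([0.5, -1.0, 2.0])  # "true" coefficients for the check

# Thin QR decomposition, scaled by sqrt(n - 1) as in the Stan User's Guide
n = X.shape[0]
Q, R = np.linalg.qr(X)
Q_ast = Q * np.sqrt(n - 1)
R_ast = R / np.sqrt(n - 1)
assert np.allclose(Q_ast @ R_ast, X)  # X is exactly reconstructed

# The sampler estimates theta = R_ast @ beta on the better-conditioned
# Q_ast basis; beta is then recovered by solving R_ast @ beta = theta.
theta = R_ast @ beta
beta_back = np.linalg.solve(R_ast, theta)
assert np.allclose(beta_back, beta)
```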
I'll push the update when the model finishes so you can check it out.
I'm not sure how the QR parameterization helps here. It would help with efficiency, but not with unidentifiability.
@wtrumanrose don't use those index-coded draws. That model is clearly unidentified. I'm trying something else.
But the HMNL and MNL draws are confirmed to be equivalent. More than the QR parameterization (note above), we need to investigate the implementation of the paper for link-less interactions.
@wtrumanrose index-coding is a wash. I got it to run without divergent transitions; it needed a more informative prior. I tried it with the outside good coded as its own dummy as well as what we've been doing, with it all zeroes and thus an implied indicator. Neither works. Everything is shrunk toward the prior such that there won't be any differences in the contrasts.
As a reference, here are the marginal posteriors for the index-coded models: dummy-coded outside good and zero-coded outside good, respectively (the labels probably aren't right, but it doesn't matter).
@marcdotson Okay, I did indeed run an aggregated contrast model for all the hypotheses. However, a couple of the hypotheses involving government relations had really weird posteriors, as seen below.
This is what the other posteriors look like when I remove the problematic ones.
I'm going back through the code to see if/where I made an error; I'll push it later today regardless of whether I find anything. If I can figure out what's going on, I'll try rerunning the model overnight and hopefully have it ready for your meeting tomorrow.
@wtrumanrose how does this compare to the contrasts we get from the dummy-coded model where contrasts are computed post-hoc?
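For reference, a post-hoc contrast from a dummy-coded model is just a per-draw difference of coefficients. A sketch with hypothetical draws (Python for illustration; the shapes and values are made up):

```python
import numpy as np

# Hypothetical posterior draws for one attribute's level coefficients
# (4000 draws x 3 levels); values are illustrative only.
rng = np.random.default_rng(0)
draws = rng.normal(loc=[0.0, 0.6, 1.1], scale=0.2, size=(4000, 3))

# Contrast between level 3 and level 2, computed draw by draw, then
# summarized with a posterior mean and 95% interval.
contrast = draws[:, 2] - draws[:, 1]
est = contrast.mean()
lo, hi = np.percentile(contrast, [2.5, 97.5])
```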
@marcdotson I'm currently running an aggregate-coded model for just the main effects and no interactions; these are the main effects for the aggregate:
and these are the dummy-coded:
I did a silly thing and forgot to include the hypothesis about accountability, which is a main effect as well. H4a is also flipped: I switched the reference codes by accident, but it's an easy fix.
I'll update the issue once the aggregate-coded main effects finish running. I don't think I'll have the code ready to push tonight, since I'm still working on making it fit with the current code, but I can push the finished graphs and/or post them here. Also, we never ran a flat dummy-coded model with all the interactions, did we?
Difference in marginal means provides a more interpretable (and possibly inferentially equivalent?) way to demonstrate hypothesis contrasts. See Andrew's work here.
Main effects and filter the data by interactions: have we done this? Filter prior to estimation or to prediction? Dummy coding vs. index coding? Fix the irrelevant attributes?
Especially with financial transparency and accountability in crackdown vs. not during crackdown.
Start with marginals on their own. Look at running with single interactions and main effects.