stan-dev / rstanarm

rstanarm R package for Bayesian applied regression modeling
https://mc-stan.org/rstanarm
GNU General Public License v3.0

add support for multinomial logit models #20

Open jgabry opened 9 years ago

dahtah commented 8 years ago

You can turn multinomial logit models into Poisson count models; you just have to be careful with the intercept. That's how INLA does it (http://www.r-inla.org/faq). The classical way of justifying it is to show that the ML estimates are equivalent, see:

Baker, S. G. (1994). The Multinomial-Poisson transformation. Journal of the Royal Statistical Society, Series D (The Statistician), 43(4):495-504.

Barthelme, S. and Chopin, N. (2015). The Poisson Transform for Unnormalised Statistical Models. Statistics and Computing. http://arxiv.org/abs/1406.2839

In a Bayesian context you should ideally have an improper flat prior on the intercept (in which case the models are strictly equivalent), but a vague enough prior should do, provided the covariates are scaled appropriately.
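For anyone who wants the argument spelled out, here is a sketch of why the transformation works, written for a single multinomial observation with K categories. The notation (α for the nuisance intercept, η_k for the category linear predictors) is chosen here for illustration and is not taken from the cited papers.

```latex
% Independent Poisson counts sharing one intercept \alpha
Y_k \sim \mathrm{Poisson}(\lambda_k), \qquad
\lambda_k = \exp(\alpha + \eta_k), \qquad k = 1, \dots, K

% Define N = \sum_k y_k, \quad \Lambda = \sum_k \lambda_k, \quad
% \pi_k = \lambda_k / \Lambda = e^{\eta_k} / \sum_j e^{\eta_j}
\prod_{k=1}^{K} \frac{e^{-\lambda_k} \lambda_k^{y_k}}{y_k!}
  = \underbrace{\frac{e^{-\Lambda} \Lambda^{N}}{N!}}_{\text{Poisson for the total}}
    \times
    \underbrace{\frac{N!}{\prod_k y_k!} \prod_{k=1}^{K} \pi_k^{y_k}}_{\text{multinomial given } N}

% Flat prior on \alpha: substitute u = \Lambda = e^{\alpha} \sum_j e^{\eta_j}, so du = u \, d\alpha
\int_{-\infty}^{\infty} \frac{e^{-\Lambda} \Lambda^{N}}{N!} \, d\alpha
  = \frac{1}{N!} \int_{0}^{\infty} u^{N-1} e^{-u} \, du
  = \frac{\Gamma(N)}{N!} = \frac{1}{N}
```

The multinomial factor depends on η only through the softmax probabilities π_k, and integrating α out of the first factor leaves 1/N, which does not involve η. So, assuming N ≥ 1 and a separate intercept for each multinomial observation, the posterior for η is the multinomial logit posterior.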

jgabry commented 8 years ago

These are great references. Thank you. I've done this sort of thing with multinomial logit models before, but it's been a while and I hadn't thought about it for rstanarm. Definitely worth looking into.

bgoodri commented 8 years ago

I think count.stan already almost has what we would need for the multinomial-Poisson trick. We would just need a stan_multinom() R function to collapse the data and call it. Also, we could go beyond nnet::multinom() by allowing lme4-style formulas.
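To make the "collapse the data" step concrete, here is a rough sketch of the reshaping, written in Python only to illustrate the idea. The function `to_poisson_long` and its field names are hypothetical and are not part of count.stan or any rstanarm code. Each multinomial observation becomes K Poisson rows that share an `obs_id` (the nuisance intercept) and carry category-by-covariate interaction columns, with category 0 as the reference.

```python
def to_poisson_long(observations, n_categories):
    """Expand (covariates, counts) pairs into one Poisson row per category.

    Hypothetical helper: `observations` is a list of (x, counts) tuples,
    where x is a list of covariates (include 1.0 for a category-specific
    intercept) and counts has one entry per outcome category.
    """
    long_rows = []
    for obs_id, (x, counts) in enumerate(observations):
        for k in range(n_categories):
            # Reference category 0 keeps all-zero interaction columns,
            # which identifies the remaining category coefficients.
            design = [0.0] * ((n_categories - 1) * len(x))
            if k > 0:
                start = (k - 1) * len(x)
                design[start:start + len(x)] = x
            long_rows.append(
                {"obs_id": obs_id, "category": k,
                 "count": counts[k], "design": design}
            )
    return long_rows


# Two made-up observations with an intercept and one covariate, three categories.
observations = [([1.0, 0.5], [3, 1, 2]), ([1.0, -1.2], [0, 4, 1])]
long_rows = to_poisson_long(observations, n_categories=3)
```

The long table would then be fit as a Poisson regression of `count` on the `design` columns plus one intercept per `obs_id`. With lme4-style group terms, the same interaction with category would presumably be needed for the group-level effects as well.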

rtrangucci commented 8 years ago

@jgabry is there a branch anywhere that does this estimation yet?

bgoodri commented 8 years ago

No. We have a mnp branch for multinomial probit.

vnijs commented 7 years ago

@bgoodri I can see the branch but I don't see any examples of how to run the MNP. Did I miss it?

imadmali commented 7 years ago

@vnijs it looks like the stan_mnp() function in the feature/mnp branch is emulating the MNP::mnp() function, so maybe you can try one of the examples there (see ?MNP).

bgoodri commented 7 years ago

It doesn't really work yet, but it is a multinomial probit model as opposed to multinomial logit.

vnijs commented 7 years ago

Thanks for the comment @imadmali. I'd prefer MNL but will try out MNP. It doesn't seem to have an option to specify a hierarchical MNP, right?

@bgoodri Any idea if/when (hierarchical) MNL is likely to be added to rstanarm?

bgoodri commented 7 years ago

It isn't really planned, but it is probably not too difficult for someone to implement if you just mean something like stan_glmer for a categorical outcome.

jgabry commented 7 years ago

Yeah, given that multinomial logit is much simpler to do than probit (i.e., unlike the binomial versions, they're not just the same model with a different link), maybe we should just go ahead and get the multinomial logit implemented.

@bgoodri if we were to add this, do you think it makes sense to do it directly or via the relationship with Poisson?

bachlaw commented 6 years ago

Unclear if people are still planning to use the multinomial-Poisson trick to fit the MNL, but is that practically doable with modeled group coefficients? It is straightforward to make the output of a multinomial a count and recast the predictors as a set of interactions between the unmodeled group coefficients and each multinomial category. But it strikes me as a much nastier business to set up similar interactions with modeled group variables, assuming that would need to be part of the process. Unclear if we would end up with a series of varying slopes or some other parameterization entirely. Does anyone see this as a barrier?

flaxter commented 5 years ago

Hi, just wondering what the current status is of this. (I don't have a strong feeling about logit vs probit, multinomial-Poisson, etc, just need to try something out for the moment.) Thanks!

TanyaMurphy commented 5 years ago

@paul-buerkner's brms (https://github.com/paul-buerkner/brms) can be used for multinomial logistic regression. What is the difference between the Stan model that it generates and what you envision for MNL in rstanarm?

bgoodri commented 5 years ago

Don't have anything written. But brms will do multilogit I'm pretty sure.

bgoodri commented 5 years ago

For MNL, probably not much difference.

bachlaw commented 5 years ago

The major difference I usually see is that brms seems to use a non-centered parameterization by default in most implementations, at least when it is a multilevel model, but I don’t know how much of a benefit that is with the multinomial.

flaxter commented 5 years ago

It seems that brms supports categorical, but not multinomial. For my setting (a half-dozen categorical covariates), there's a significant speedup from being able to aggregate to counts, i.e. the logistic model I ran with just two categories in rstanarm was way faster than the equivalent model without aggregation. But maybe I'm missing something about brms's capabilities?
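A small sketch of why aggregating to counts loses nothing: for individuals who share the same covariate pattern, the sum of their categorical log-likelihoods differs from the multinomial log-likelihood of the tallied counts only by the log multinomial coefficient, which does not depend on the parameters. The data and linear predictors below are made up for illustration, and the code is plain Python rather than anything from brms or rstanarm.

```python
import math
from collections import Counter


def softmax(eta):
    """Convert linear predictors to category probabilities."""
    m = max(eta)
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_loglik(outcomes, probs):
    """One log-likelihood term per individual row."""
    return sum(math.log(probs[y]) for y in outcomes)


def multinomial_loglik(counts, probs):
    """Log-likelihood of the tallied counts, coefficient included."""
    n = sum(counts)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * math.log(p) for c, p in zip(counts, probs))


# Hypothetical data: 12 individuals with identical covariates, 3 categories.
outcomes = [0, 2, 1, 0, 0, 2, 1, 0, 2, 2, 0, 1]
probs = softmax([0.0, -0.4, 0.3])  # illustrative linear predictors

tally = Counter(outcomes)
counts = [tally[k] for k in range(3)]

# The gap between the two log-likelihoods is a parameter-free constant.
log_coef = math.lgamma(len(outcomes) + 1) - sum(math.lgamma(c + 1) for c in counts)
gap = multinomial_loglik(counts, probs) - categorical_loglik(outcomes, probs)
```

The aggregated version evaluates one likelihood term per covariate pattern instead of one per individual, which is presumably where the speedup comes from when the covariates are all categorical.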

bachlaw commented 5 years ago

Agreed. The categorical takes a LONG time to sample. But I also recall Paul B saying that something about the categorical implementation was fairly speedy all things considered, which I assumed was an implicit reference to a multinomial alternative.

Still your point is a good one: a pre-coded aggregated option from somebody would be great to have. Unfortunately I suspect there is already a long list of requested features for folks to work on.

TanyaMurphy commented 5 years ago

Yes, I meant categorical. I guess it is a special case of multinomial when the number of draws = 1? (BUGS had separate categorical and multinomial distributions.) I (naively) tried modelling repeated measures of body mass index (BMI) as 4 categories (no order imposed) and as continuous (linear regression), with the same predictors. The multinomial logistic model took many times longer, but, as @bachlaw alludes, R's frequentist mlogit() function is very slow compared to lmer(), too. I'm too much of a newbie to know whether I could have optimized things better.
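On the special-case question, a quick check in plain Python (illustrative probabilities, not tied to BUGS, brms or rstanarm): a multinomial with a single draw and a one-hot count vector has a coefficient of 1, so its probability is just the categorical probability of the chosen category.

```python
import math


def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` given category probabilities."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)  # stays an exact integer at every step
    return coef * math.prod(p ** c for p, c in zip(probs, counts))


probs = [0.1, 0.2, 0.3, 0.4]  # four unordered categories (made-up values)
one_hot = [0, 0, 1, 0]        # a single draw that landed in category index 2
single_draw = multinomial_pmf(one_hot, probs)
```

Here `single_draw` equals `probs[2]`, which is the categorical likelihood for that observation.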