pymc-devs / pymc-examples

Examples of PyMC models, including a library of Jupyter notebooks.
https://www.pymc.io/projects/examples/en/latest/
MIT License

Discrete choice #544

Closed NathanielF closed 12 months ago

NathanielF commented 1 year ago

Discrete Choice Modelling

I am working out how to represent the various incarnations of discrete choice models in PyMC, building up to the hierarchical (random coefficients) logit described, for instance, in Jim Savage's work in Stan (https://khakieconomics.github.io/2019/03/17/Logit-models-of-discrete-choice.html). These models are quite distinct from reinforcement learning strategies. They are used to estimate aggregate demand and supply curves in differentiated goods markets and, at the individual level, to examine the substitution patterns customers show between goods.

This is a draft, primarily because I still need to find an efficient representation of the models in PyMC and write up more detail motivating them.

Helpful links

NathanielF commented 1 year ago

Marking this one as ready for review @twiecki and @ricardoV94.

I've found a representation of the discrete choice models that I'm pretty happy with. I demonstrate them on two different but canonical data sets.

(1) The choice of heating systems, where I focus just on specifying the utility matrix for the individual alternatives.
(2) The choice of crackers, with repeated decisions by the same decision maker.

In (1) I demonstrate how you can add alternative-specific parameters, e.g. intercepts and beta parameters for the effect of income on a specific alternative.
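As a rough illustration of what "specifying the utility matrix" with alternative-specific parameters means, here is a minimal NumPy sketch. It is not the notebook's code: the data, the parameter values, and the names `cost`, `income`, `alpha`, `beta_income` and `beta_cost` are all made up for the example, and fixing the last alternative as the reference is an assumed identification choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_alts = 5, 3

# Hypothetical data: one cost per (observation, alternative), one income per observation.
cost = rng.uniform(1.0, 5.0, size=(n_obs, n_alts))
income = rng.uniform(2.0, 7.0, size=n_obs)

# Hypothetical parameter values; the last alternative is the reference,
# so its intercept and income coefficient are pinned to zero.
alpha = np.array([0.5, -0.2, 0.0])        # alternative-specific intercepts
beta_income = np.array([0.1, 0.3, 0.0])   # alternative-specific income effects
beta_cost = -0.4                          # cost effect shared by all alternatives

# Utility matrix: rows are observations, columns are alternatives.
u = alpha + beta_cost * cost + income[:, None] * beta_income

# Row-wise softmax turns utilities into choice probabilities.
u_shifted = u - u.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
p = np.exp(u_shifted) / np.exp(u_shifted).sum(axis=1, keepdims=True)
```

In a PyMC model the same expression would be built from random variables instead of fixed arrays, with the softmax output passed to a categorical likelihood.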

In (2) I focus on how you can add person-specific modifications of utility, and how you can use prior constraints in the Bayesian context to ensure that the parameter estimates "make sense", i.e. negative parameter estimates for the effect of price on utility.
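To make the sign-constraint idea concrete, here is a small prior simulation in plain NumPy. It is only a sketch: it assumes the constraint is imposed by negating a half-normal draw (in PyMC that could be written by negating a `pm.HalfNormal` variable), which may differ from the prior actually used in the notebook, and the prices are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw a price coefficient that is negative by construction:
# the negated absolute value of a standard normal (a negated half-normal).
beta_price = -np.abs(rng.normal(0.0, 1.0, size=1000))

def choice_probs(prices, beta):
    """Softmax over alternatives for each draw of beta; result is (draws, alternatives)."""
    u = beta[:, None] * prices[None, :]
    u = u - u.max(axis=1, keepdims=True)
    e = np.exp(u)
    return e / e.sum(axis=1, keepdims=True)

base_prices = np.array([1.0, 1.0, 1.0])     # hypothetical prices
raised_prices = np.array([1.5, 1.0, 1.0])   # raise the first brand's price only

p_base = choice_probs(base_prices, beta_price)
p_raised = choice_probs(raised_prices, beta_price)

# Under every prior draw, the first brand's share falls when its price rises.
share_never_rises = bool(np.all(p_raised[:, 0] < p_base[:, 0]))
```

The point of the check is that the constraint rules out upward-sloping demand before any data are seen, rather than hoping the posterior lands on the right sign.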

All models fit well and in reasonable time. However, in (2) I've truncated the data set a little because I ran into a bracket nesting error on the full data set. I would be keen to know how to replace my for-loop here with scan, but I wasn't sure how to do that.
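One possible alternative to both the for-loop and scan is to index the person-level parameters by an integer person id, so that the whole utility matrix is a single expression. The sketch below shows the pattern in NumPy. It assumes the loop builds one utility block per person and stitches them together, which may not match the notebook exactly, and all names (`person_idx`, `beta_person`, `price`) are hypothetical. The same integer-indexing pattern generally carries over to PyTensor variables, where it yields one graph node instead of one per person.

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, n_alts, n_obs = 4, 3, 10

# Hypothetical long-format data: each row is one choice occasion for one person.
person_idx = rng.integers(0, n_people, size=n_obs)
price = rng.uniform(0.5, 2.0, size=(n_obs, n_alts))

# Hypothetical parameters: a brand intercept per person and a shared price effect.
beta_person = rng.normal(0.0, 1.0, size=(n_people, n_alts))
beta_price = -1.0

# Loop version: one small expression per person, written into the result row by row.
u_loop = np.zeros((n_obs, n_alts))
for i in range(n_people):
    rows = person_idx == i
    u_loop[rows] = beta_person[i] + beta_price * price[rows]

# Vectorized version: index the person-level parameters by observation.
u_vec = beta_person[person_idx] + beta_price * price
```

Both versions give the same utilities; the vectorized one does not grow with the number of people, which is the property that matters if the nesting error comes from the size of the generated graph.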

I think I'll likely add some more to the text write-up, but I would like some interim feedback on the modelling design if you have any.

@drbenvincent in case of interest.

review-notebook-app[bot] commented 1 year ago

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2023-06-19T17:19:18Z ----------------------------------------------------------------

Line #21.        p_ = pm.Deterministic("p", pm.math.softmax(s, axis=1), dims=("obs", "alts_probs"))

We shouldn't wrap anything in a Deterministic that we are not going to use. For large models or datasets this can slow down sampling quite a lot and make the model seem slower than it actually is.


_NathanielF commented on 2023-06-19T17:42:53Z_ ----------------------------------------------------------------

But I do use the probabilities 'p' in nearly every plot; they are one of the main quantities of interest. I can remove the Deterministic wrapper around the utilities 'u' for the reason you mention.

_NathanielF commented on 2023-06-19T19:51:34Z_ ----------------------------------------------------------------

Removed any redundant Deterministics.