pymc-devs / pymc-examples

Examples of PyMC models, including a library of Jupyter notebooks.
https://www.pymc.io/projects/examples/en/latest/

Discrete Choice Models and Utility Estimation #543

Closed: NathanielF closed this issue 11 months ago

NathanielF commented 1 year ago

Discrete Choice Models and Utility Estimation:

Why should this notebook be added to pymc-examples?

In economics there is a tradition of using discrete choice models over many agents and products to infer properties of the agents' utility functions and, abstracting over all the agents, to make predictions about product shares within the market. Where closed-form solutions allowed, these have traditionally been modelled with logistic-type and ordinal choice models. I cover ordinal regression in #533, but there is a separate stream of research dealing with nested logit and GEV-style discrete choice models that I think could be modelled more easily in a Bayesian framework; see the sketch below.
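To make the random-utility idea concrete, here is a minimal, hypothetical sketch of the standard multinomial (conditional) logit setup in PyMC; all names, shapes, and priors are assumptions for illustration, not from any existing notebook. The key point is that i.i.d. Gumbel noise on the latent utilities is exactly what turns "choose the highest-utility alternative" into softmax (logit) choice probabilities.

```python
import numpy as np
import pymc as pm

# --- fake data (names and shapes are illustrative assumptions) ---
rng = np.random.default_rng(42)
N, J, K = 1000, 3, 2            # choosers, alternatives, attributes
X = rng.normal(size=(N, J, K))  # attributes of each alternative per chooser
beta_true = np.array([1.0, -0.5])
# i.i.d. Gumbel errors on utility imply logit (softmax) choice probabilities
U = X @ beta_true + rng.gumbel(size=(N, J))
y = U.argmax(axis=1)            # each chooser picks the highest-utility alternative

with pm.Model() as conditional_logit:
    beta = pm.Normal("beta", 0.0, 1.0, shape=K)  # marginal utilities of attributes
    V = pm.math.dot(X, beta)                     # systematic utility, shape (N, J)
    p = pm.math.softmax(V, axis=1)               # choice probabilities
    pm.Categorical("choice", p=p, observed=y)
    idata = pm.sample()
```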

Not sure of the details at the minute, but I plan on looking into Kenneth Train's *Discrete Choice Methods with Simulation*: https://www.amazon.co.uk/Discrete-Choice-Methods-Simulation-Kenneth/dp/0521816963/ref=sr_1_1?crid=3DBCCE669W43Z&keywords=kenneth+train&qid=1681378209&sprefix=kenneth+train%2Caps%2C70&sr=8-1&asin=0521816963&revisionId=&format=4&depth=1

I think it could be a useful example, and in particular one where the Bayesian approach is (I'm hoping) far easier to understand than the traditional frequentist account. For comparison: https://pyblp.readthedocs.io/en/stable/background.html

Suggested categories:

ricardoV94 commented 1 year ago

Is this related to RL? We have https://www.pymc.io/projects/examples/en/latest/case_studies/reinforcement_learning.html

If so, would it be important that any new example discuss the similarities/differences?

NathanielF commented 1 year ago

No I don't think it is related to reinforcement learning. It's not about optimising choice, but about estimating the latent utility function which determines choices.

ricardoV94 commented 1 year ago

Hmm, I mean RL is a procedure to figure out "the utility function" of a problem

I guess the difference when modelling people's choices is that in an RL setting you have snapshots of people's learning/choice trajectories, instead of the "final choices" once learning is concluded (or the utility function is fixed)?

NathanielF commented 1 year ago

I mean, I think at the very least they are very different traditions. Discrete choice models are typically used to motivate policy choices, e.g., famously, in the introduction of the BART transport system.

Yes, this approach is definitely after the fact... but I'll be able to say more once I've read up on the techniques. Will circle back with a clearer motivation.

ricardoV94 commented 1 year ago

I believe the two topics may be very distinct (certainly in background). I was just fishing for connections in case there were obvious ones :)

Mostly being curious

NathanielF commented 1 year ago

@ricardoV94 I've spent a bit more time looking into this and experimenting. I've added a draft pull request above (still experimenting) and found Stan implementations, which are referenced in the draft PR. I think there is something interesting here. I have a very simple model working on fake data with the following structure:

[image: graph of the simple model's structure]

Need to write up the motivation for these models more, but I think they are quite distinct from anything in the examples gallery at the moment.
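For the nested logit variants mentioned above, the main extra ingredient is the nest-level "inclusive value". Here is a hedged sketch of how the choice probabilities could be assembled in PyMC; the two-nest split, priors, and placeholder data are illustrative assumptions, not the model in the draft PR.

```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

# Assume 4 alternatives in two nests, {0, 1} and {2, 3} (purely illustrative).
nests = [np.array([0, 1]), np.array([2, 3])]
N = 500
y_obs = np.random.default_rng(1).integers(0, 4, size=N)  # placeholder fake choices

def nested_logit_probs(V, lam):
    """V: (N, 4) systematic utilities; lam: (2,) nest dissimilarity parameters."""
    # Inclusive value of each nest: I_k = log(sum_{j in nest k} exp(V_j / lam_k))
    I = pt.stack(
        [pm.math.logsumexp(V[:, idx] / lam[k], axis=1, keepdims=False)
         for k, idx in enumerate(nests)],
        axis=1,
    )
    top = pm.math.softmax(lam * I, axis=1)  # P(nest k) proportional to exp(lam_k * I_k)
    parts = []
    for k, idx in enumerate(nests):
        within = pm.math.softmax(V[:, idx] / lam[k], axis=1)  # P(j | nest k)
        parts.append(within * top[:, k][:, None])
    return pt.concatenate(parts, axis=1)

with pm.Model() as nested_logit:
    # One alternative-specific constant is pinned to 0 for identifiability.
    alpha = pm.Normal("alpha", 0.0, 1.0, shape=3)
    lam = pm.Beta("lam", 2.0, 2.0, shape=2)  # dissimilarity parameters in (0, 1)
    V = pt.broadcast_to(pt.concatenate([pt.zeros(1), alpha]), (N, 4))
    p = pm.Deterministic("p", nested_logit_probs(V, lam))
    pm.Categorical("choice", p=p, observed=y_obs)
```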

NathanielF commented 1 year ago

Interesting to see a PyTorch implementation here too: https://twitter.com/GSBsiLab/status/1671534654019469312?t=3N8mg2yK4CK7Vh3slodJDw&s=19