Discrete choice

Marking this one as ready for review @twiecki and @ricardoV94.

I've found a representation of the discrete choice models that I'm pretty happy with. I'm demonstrating them on two different but canonical data sets.

(1) On the choice of heating systems, where I focus just on specifying the utility matrix for the individual alternatives; (2) on the choice of crackers, with repeated decisions by the same decision maker.

In (1) I demonstrate how you can add alternative-specific parameters, e.g. intercepts and beta parameters for income on the specific alternative.

In (2) I focus on how you can add person-specific modifications of utility and how you can use prior constraints in the Bayesian context to ensure that the parameter estimates "make sense", i.e. negative parameter estimates for the effect of price on utility.

All models fit well, and in reasonable time. However, in (2) I've truncated the data set a little because I ran into a bracket nesting error on the full data set. I would be keen to know how to replace my for-loop here with scan, but I wasn't sure how to do that.

I think I'll likely add some more to the text write-up, but I would like some interim feedback, if you have any, on the modelling design.

@drbenvincent in case of interest.

ricardoV94 commented on 2023-06-19T17:19:18Z ----------------------------------------------------------------

Line #21.        p_ = pm.Deterministic("p", pm.math.softmax(s, axis=1), dims=("obs", "alts_probs"))

We shouldn't wrap anything in a Deterministic that we are not going to use. For large models/datasets this can slow down sampling quite a lot and make the model seem slower than it actually is.

_NathanielF commented on 2023-06-19T17:42:53Z_ ----------------------------------------------------------------

But I do use the probabilities 'p' in nearly every plot. It's one of the main quantities of interest. I can remove the Deterministic wrapper around the utilities 'u' for the reason you mention.

_NathanielF commented on 2023-06-19T19:51:34Z_ ----------------------------------------------------------------

Removed any redundant Deterministics.
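For readers following the thread, here is a minimal sketch of what the stored quantity `p` represents, written in plain NumPy rather than PyMC. It assumes, purely for illustration, that posterior draws of the utilities are available as an array of shape (draw, obs, alt); the array `u_draws` and the helper `softmax` are hypothetical names, not taken from the notebook. It shows that the choice probabilities can also be recomputed after sampling from the utilities, which is one alternative to storing both quantities as Deterministics.

```python
import numpy as np


def softmax(u, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = u - u.max(axis=axis, keepdims=True)  # guard against overflow
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


# Hypothetical stand-in for posterior draws of utilities:
# 100 draws, 5 observations, 3 alternatives.
rng = np.random.default_rng(1)
u_draws = rng.normal(size=(100, 5, 3))

# Choice probabilities per draw; each row over the alternatives sums to 1.
p_draws = softmax(u_draws, axis=-1)

# Posterior mean probability of each alternative for each observation.
p_mean = p_draws.mean(axis=0)
```

Whether to keep `p` as a Deterministic or recompute it afterwards is a judgment call; the thread settles on keeping `p` and dropping the wrappers that were not used.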

review-notebook-app[bot] commented 1 year ago

ricardoV94 commented on 2023-06-19T17:19:18Z ----------------------------------------------------------------

Can you define what a "marginal rate of substitution" is?

NathanielF commented on 2023-06-19T17:43:40Z ----------------------------------------------------------------

Yep. Will adjust next commit.

NathanielF commented on 2023-06-19T19:50:32Z ----------------------------------------------------------------

Resolved
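For context, a hedged sketch of the usual textbook definition, assuming a utility that is linear in two attributes $x_1$ and $x_2$ (the symbols here are illustrative and not taken from the notebook). Holding utility fixed, the marginal rate of substitution is the amount of one attribute a decision maker would give up for one more unit of the other, which for a linear utility reduces to a ratio of coefficients:

$$
U = \beta_1 x_1 + \beta_2 x_2, \qquad
dU = \beta_1\,dx_1 + \beta_2\,dx_2 = 0
\;\Rightarrow\;
\mathrm{MRS}_{12} = -\frac{dx_2}{dx_1}\bigg|_{U\ \text{fixed}} = \frac{\partial U/\partial x_1}{\partial U/\partial x_2} = \frac{\beta_1}{\beta_2}.
$$

Sign conventions vary between texts, so the definition used in the notebook may differ by a sign.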

review-notebook-app[bot] commented 1 year ago

ricardoV94 commented on 2023-06-19T17:19:19Z ----------------------------------------------------------------

Line #20.        for id, indx in zip(uniques, range(len(uniques))):

Why is the loop needed? I couldn't immediately grasp why fancy indexing can't do the job.

Do you have a non-square matrix of some sort? If so, can it be padded with zeros or whatever to make it work as if it were square?

Reducing the number of stacks would probably speed up this kind of model a lot.

_NathanielF commented on 2023-06-19T17:53:30Z_ ----------------------------------------------------------------

It's possibly not needed. I wrote it in the way that was most intuitive to me. I'd be keen to understand how a fancy-index approach could work.

I don't think we have any non-square matrices that we're looking to construct. But we do have sets of rows of varying length per person. So we might have something like:

| Person ID | Choice ID | Choice  |
|-----------|-----------|---------|
| Person 1  | 1         | Nabisco |
| Person 1  | 2         | Keebler |
| Person 2  | 1         | Nabisco |
| Person 2  | 2         | Nabisco |
| Person 2  | 3         | Nabisco |

So I'm building three equations per person, and I want to add an alternative-specific beta coefficient to the alternative-specific intercept alpha. In this case I'd need one person-specific beta coefficient to be applied to the first two rows and then another coefficient to the next three. It was just a lot of indexes to keep in my head, but if you have ideas about how to make it cleaner, that'd be great! I'll experiment a little more, but the loop was primarily because it was easier for me to think through and follow.

_NathanielF commented on 2023-06-19T19:00:36Z_ ----------------------------------------------------------------

Yeah, ok I think i have it now.

_NathanielF commented on 2023-06-19T19:51:11Z_ ----------------------------------------------------------------

Updated in latest commit. Thanks for the push. This is much neater.
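The kind of change discussed in this thread can be sketched in plain NumPy. This is an illustration under stated assumptions, not the code from the commit: the toy data mirrors the table above, and the names `ids`, `person_idx`, `alpha` and `beta_person` are hypothetical. The idea is to map each row to an integer person index once, then index the person-level coefficients with that array so that every row picks up its own person's values without a loop.

```python
import numpy as np

# Toy data mirroring the table above: five choice occasions, two people.
ids = np.array(["Person 1", "Person 1", "Person 2", "Person 2", "Person 2"])

# Map each row to a 0-based person index: [0, 0, 1, 1, 1].
uniques, person_idx = np.unique(ids, return_inverse=True)
n_people, n_alts = len(uniques), 3

rng = np.random.default_rng(0)
alpha = rng.normal(size=n_alts)                    # alternative-specific intercepts
beta_person = rng.normal(size=(n_people, n_alts))  # person-by-alternative offsets

# Loop version: fill each person's block of rows separately.
u_loop = np.zeros((len(ids), n_alts))
for p in range(n_people):
    rows = person_idx == p
    u_loop[rows] = alpha + beta_person[p]

# Fancy-index version: one indexing operation, no loop and no stacking.
# beta_person[person_idx] has shape (n_rows, n_alts).
u_vec = alpha + beta_person[person_idx]

assert np.allclose(u_loop, u_vec)
```

Because the rows are addressed by index, people with different numbers of choice occasions need no padding. Integer-array indexing of this kind should carry over to model variables in PyMC, though that is an assumption here rather than something shown in the thread.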

NathanielF commented 1 year ago

Oh, actually I think I see it now!

NathanielF commented 1 year ago

Incorporated all those changes now, @ricardoV94. Thanks for the nudge on the indexing; it's much neater now than using the for loop. Plus, I was able to use the full crackers dataset and it fits fast!

NathanielF commented 1 year ago

Giving this one another gentle nudge: @twiecki and @ricardoV94

Let me know if there are any concerns or anything outstanding.

NathanielF commented 1 year ago

Made those changes, @OriolAbril. Thanks for the nudge. Much neater.

NathanielF commented 12 months ago

Thanks @OriolAbril for the review. I've made those changes now. I've slightly updated, and I think improved, the final plot, and I think it should now be ready for merging.

OriolAbril commented 12 months ago

Also, not sure if you are already aware, but while the author metadata isn't very prominently displayed, it is used for aggregating posts by author. For example, here is the link to your page: https://www.pymc.io/projects/examples/en/latest/blog/author/nathaniel-forde.html

NathanielF commented 12 months ago

> Also, not sure if you are already aware, but while the author metadata isn't very prominently displayed, it is used for aggregating posts by author. For example, here is the link to your page: https://www.pymc.io/projects/examples/en/latest/blog/author/nathaniel-forde.html

I did not know that! That's lovely to see!

pymc-devs / pymc-examples

Discrete choice #544

Discrete Choice Modelling
