ActivitySim / activitysim

An Open Platform for Activity-Based Travel Modeling
https://activitysim.github.io
BSD 3-Clause "New" or "Revised" License

Feature suggestion: Econometrically consistent simulation of multinomial and nested logit choices #568

Open janzill opened 2 years ago

janzill commented 2 years ago

I'd like to get feedback on a proposed pull request that would add econometrically consistent simulation of multinomial and nested logit choices.

Background

ActivitySim simulates the results of its model steps, i.e. it makes a concrete choice for each decision maker in each sub-model when running a model. This is currently done with a common Monte Carlo method, see e.g. chapter 3.1.8.3 here. Fixing random seeds per decision maker and model then leads to reproducible runs, where no change in the observed utilities of the alternatives means no change in the choice selected.

However, viewed at the individual level, this can lead to seemingly illogical choices between scenarios: for example, a localised change in the observed utility of a single zone in destination choice (such as better access or more attractions) can lead some people to change destinations in zones that are far away and should not be influenced by it. Additionally, making an alternative more attractive (i.e. increasing its observed utility relative to the other alternatives) should never cause anyone to switch away from that alternative, but this can happen in a Monte Carlo world.
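The cross-alternative switching described above can be illustrated with a small sketch of a cumulative-probability Monte Carlo draw (hypothetical utilities and function names; not ActivitySim's actual implementation):

```python
import numpy as np

def monte_carlo_choice(V, u):
    """Standard Monte Carlo simulation: accumulate the logit choice
    probabilities and invert the cumulative distribution with a single
    frozen uniform draw u for this chooser."""
    p = np.exp(V) / np.exp(V).sum()
    return int(np.searchsorted(np.cumsum(p), u))

V = np.array([1.0, 1.0, 1.0, 1.0])
u = 0.15                                    # frozen uniform for this chooser
before = monte_carlo_choice(V, u)           # -> 0

V_improved = V.copy()
V_improved[3] = 3.0                         # only alternative 3 gets better
after = monte_carlo_choice(V_improved, u)   # -> 1
```

Here only alternative 3 improved, yet the chooser moves from alternative 0 to alternative 1, because all the cumulative thresholds shift.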

We suggest avoiding this by drawing directly from the probability density function of the unobserved part of the utility. This is straightforward for multinomial logit models, where the unobserved components are independent and the cumulative distribution function is invertible, but not for nested logit models, where the unobserved components are correlated. It is nevertheless possible in the latter case: we have mathematically derived a way to do this, implemented a prototype in a fork of ActivitySim, and numerically verified our methodology. We are planning to submit a paper to ATRF in the coming weeks. Note that our micro-simulated trip-based model works this way and has been used on numerous projects in Australia; it is not open source, though.
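For the multinomial logit case, the direct draw amounts to inverting the Gumbel CDF, $F(x) = e^{-e^{-x}}$, once per alternative with frozen uniforms. A minimal sketch (function names are mine, not the proposed PR's):

```python
import numpy as np

def frozen_gumbel_choice(V, u):
    """Draw the unobserved utility directly by inverting the Gumbel CDF
    with one frozen uniform per alternative, then pick the alternative
    with the highest total utility."""
    eps = -np.log(-np.log(u))     # inverse Gumbel CDF
    return int(np.argmax(V + eps))

rng = np.random.default_rng(42)
u = rng.random(4)                 # frozen per chooser and sub-model
V = np.array([1.0, 1.0, 1.0, 1.0])
choice = frozen_gumbel_choice(V, u)

# Unlike the cumulative-probability draw, making an alternative strictly
# better can never switch a chooser away from it: argmax is monotone in
# each component of V.
V2 = V.copy()
V2[choice] += 1.0
assert frozen_gumbel_choice(V2, u) == choice
```

Averaged over choosers this reproduces the logit probabilities, while each individual's response to utility changes stays monotone.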

What we have done so far

We have implemented this methodology for the nested logit model and run some numerical verification for the small MTC test model, specifically the trip mode choice model.

What we want to do

If the methodology is of interest to the consortium, we would create a pull request for further discussion. We would round out the implementation to add econometrically consistent drawing for multinomial models, add functionality to expose a parameter for choosing how draws are performed (we suggest adding our implementation as an additional feature, not replacing the existing methodology), and add tests and some documentation.

Additional notes

The amount of additional code is small: the implementation re-uses existing infrastructure, such as random number seeding, and adds only a handful of small functions. As for the relevance of the change, it can help when communicating results to a non-specialist audience, where seemingly illogical changes can erode trust and distract from the main message of the results. It might also help with model stability and convergence, but we have not yet investigated this in detail.

I'd appreciate any thoughts or comments!

jfdman commented 2 years ago

Sounds like a great idea. I am curious whether the randomly drawn unobserved utility component for each alternative should be included when calculating the logsum of the model, in the case that the logsum is needed by upstream model components. For example, if a decision-maker has a relatively high unobserved utility for transit, should their mode choice logsum reflect this, such that destinations with relatively better transit service have a higher utility for selection in destination choice? If so, this might have implications for model calibration/validation. And we would want to make sure that the functionality could be turned off when building estimation data bundles. Appreciate hearing any thoughts you might have. Thanks! -Joel

janzill commented 2 years ago

Hi @jfdman, thanks for the comment! At the moment, the implementation is such that logsums are independent of the random utility terms (i.e. they are expectation values of the maximum utility for a given choice model), and as such no information on the individual random components is passed upstream. It is really just a replacement for how draws for MNL and NL models are currently performed, and logsums are identical for both implementations.
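The identity being relied on here (the logsum equals the expected maximum utility, up to the Euler–Mascheroni constant) is a standard result and can be checked numerically; a quick sketch:

```python
import numpy as np

EULER_GAMMA = 0.5772156649

V = np.array([0.5, 1.0, 2.0])                 # hypothetical observed utilities
rng = np.random.default_rng(0)
u = rng.random((200_000, V.size))
eps = -np.log(-np.log(u))                     # i.i.d. standard Gumbel draws
simulated = (V + eps).max(axis=1).mean()      # E[max_j (V_j + eps_j)]
logsum = np.log(np.exp(V).sum()) + EULER_GAMMA
assert abs(simulated - logsum) < 0.02
```

So whether choices come from cumulative-probability inversion or from direct error draws, averaging the realised maximum utility recovers the same logsum.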

At a practical level, our implementation does not draw full error terms for the individual utilities of the alternatives directly and then choose the alternative with the maximum full utility; instead, we generate choices as a product of conditional choices with conditional random terms (I have a draft manuscript if you are interested in the details). Therefore, writing down the full utility explicitly is currently not possible.
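One simple way to picture that product-of-conditional-choices structure for a two-level nested logit is to draw the nest first (from the inclusive-value utilities) and then the alternative within it, each stage with its own frozen draws. This is only an illustrative sketch under my own assumptions about the structure; the conditional error terms actually derived in the paper are not reproduced here:

```python
import numpy as np

def gumbel(u):
    """Invert the standard Gumbel CDF, F(x) = exp(-exp(-x))."""
    return -np.log(-np.log(u))

def nested_choice(V, nests, mu, u_nests, u_alts):
    # Nest-level inclusive values: logsums of the scaled within-nest utilities
    iv = np.array([np.log(sum(np.exp(V[a] / mu) for a in nest))
                   for nest in nests])
    # Conditional choice 1: pick a nest, one frozen Gumbel draw per nest
    k = int(np.argmax(mu * iv + gumbel(u_nests)))
    # Conditional choice 2: pick an alternative within the chosen nest,
    # one frozen Gumbel draw per alternative
    within = [V[a] / mu + gumbel(u_alts[a]) for a in nests[k]]
    return nests[k][int(np.argmax(within))]

rng = np.random.default_rng(7)
V = np.array([0.2, 1.0, 0.8, 0.5])
nests = [[0, 1], [2, 3]]          # hypothetical two-nest structure
mu = 0.7                          # within-nest scale parameter
u_nests, u_alts = rng.random(2), rng.random(4)

choice = nested_choice(V, nests, mu, u_nests, u_alts)
# Improving the chosen alternative never moves the chooser off it: both
# the nest-level and within-nest argmax are monotone in V[choice].
V2 = V.copy()
V2[choice] += 1.0
assert nested_choice(V2, nests, mu, u_nests, u_alts) == choice
```

Because each stage is an argmax over utilities plus independent Gumbel draws, the marginal choice frequencies match the nested logit probabilities while monotonicity is preserved at each level.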

That being said, I think the idea of including that information at an individual level in upstream models is interesting, but I had not thought about it before. There is a way to write the unobserved utility as a sum of independent terms, draw these directly, and write down a realisation of the full utility. However, the probability density function of some of those terms is a series expansion, and drawing from them is possible but, I believe, computationally more expensive.

Then there is the question of whether a formulation that uses the full utility to calculate logsums at a lower level and includes these at an upper level is consistent with writing these models as two conditional MNL models. I don't think it is, given the logsums enter as distributions of terms of the form $\max_j(V_j + \epsilon_j)$ where the $\epsilon_j$ are independent Gumbel distributed; using the unobserved parts directly (i.e. taking the max of $V_j + \epsilon_j$) would lead to a different formulation without logsums but with error-term components shared between models. I think it would be interesting to work through this idea in a bit more detail; a whiteboard would be really handy for this :) May I suggest having this discussion in the new Discussions forum? It is not yet clear to me how it would work, but I think it would be a much bigger change than what is suggested in this issue, and this issue has merit by itself.

jfdman commented 2 years ago

Hi @janzill, thanks for the clarification. I'd be curious to see a draft of the paper if possible. I promise to keep it confidential. Thanks!

janzill commented 2 years ago

I've sent you an email @jfdman

janzill commented 1 year ago

The conference paper has now been published; see https://australasiantransportresearchforum.org.au/frozen-randomness-at-the-individual-utility-level for the methodology and a study of its influence on work-location choice models. We also include some early results on model convergence in https://aetransport.org/past-etc-papers/conference-papers-2022?abstractId=7583&state=b (not peer reviewed).

janzill commented 1 year ago

I think the title of this issue might be misleading: we do not trace individual preferences through different models, but only suggest drawing realisations for each model and chooser in a different way. The difference is that the suggested method has a micro-economic interpretation.