ActivitySim / activitysim

An Open Platform for Activity-Based Travel Modeling
https://activitysim.github.io
BSD 3-Clause "New" or "Revised" License

correct CDAP #116

Closed bstabler closed 7 years ago

bstabler commented 7 years ago

CDAP is supposed to consider the interactions among only up to 5 household (HH) members and then apply some additional utility terms after that. We will review the MTC TM1 code and UECs and correct this. CDAP also currently loops over households, which is inefficient.

We will likely re-implement it as a series of batch vectorized calculations:

The key is to organize the problem into a series of batch table operations
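
As one illustration of a batch table operation, the sketch below splits a persons table into the members that would enter the interaction model (up to 5 per household) and the extra members, without looping over households. The column names (`household_id`, `ptype`, `cdap_rank`) and the use of table order as the ranking rule are assumptions for illustration only; this is not the actual ActivitySim implementation.

```python
import pandas as pd

MAX_INTERACTION_HHSIZE = 5  # up to 5 members enter the interaction model

# Hypothetical persons table; the column names are assumptions.
persons = pd.DataFrame({
    "household_id": [1, 1, 2, 2, 2, 2, 2, 2, 2],
    "ptype":        [1, 4, 1, 2, 7, 7, 8, 8, 6],
})

# Rank each person within their household (0-based), in table order.
# A real model would likely rank by some priority rule instead.
persons["cdap_rank"] = persons.groupby("household_id").cumcount()

# One boolean mask over the whole table replaces the per-household loop.
is_extra = persons["cdap_rank"] >= MAX_INTERACTION_HHSIZE
interaction_members = persons[~is_extra]
extra_members = persons[is_extra]

print(len(interaction_members), len(extra_members))
```

Every subsequent step can then operate on `interaction_members` and `extra_members` as whole tables.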

toliwaga commented 7 years ago

According to the documentation for applyModelForExtraHhMembers in mtctm2.abm.ctramp HouseholdCoordinatedDailyActivityPatternModel.java, choices for extra household members should be made according to a table of fixed proportionate probabilities based on person type:

* Applies a simple choice from fixed proportions by person type for members of
* households with more than 5 people who are not included in the CDAP model. The
* choices of the additional household members are independent of each other.

Why does it do this rather than choosing activities for the extra members by applying the individual utilities, ignoring the interaction utilities?

It is easy for me to implement this either way...

bstabler commented 7 years ago

The short answer is that this is how the model was estimated. Here are the hard-wired CDAP 6+ person proportions from MTC TM1. The rows are person types (the 2nd row is person type 1) and the columns are M, N, H. We should move this into an input CSV table.

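public final double[][] CDAP_6_PLUS_PROPORTIONS = {
    { 0.0, 0.0, 0.0 },              // index 0: unused, person types start at 1
    { 0.79647, 0.09368, 0.10985 },  // person type 1 (columns: M, N, H)
    { 0.61678, 0.25757, 0.12565 },  // person type 2
    { 0.69229, 0.15641, 0.15130 },  // person type 3
    { 0.00000, 0.67169, 0.32831 },  // person type 4
    { 0.00000, 0.54295, 0.45705 },  // person type 5
    { 0.77609, 0.06004, 0.16387 },  // person type 6
    { 0.68514, 0.09144, 0.22342 },  // person type 7
    { 0.14056, 0.06512, 0.79432 }   // person type 8
};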
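
A minimal sketch of what the input CSV table might look like and how it could be loaded with pandas. The column layout (`ptype,M,N,H`) is an assumption, not an agreed file format; the values are copied from the array above, with the unused index-0 row dropped.

```python
import io

import pandas as pd

# Hypothetical CSV layout: one row per person type, columns M, N, H.
# Inlined here as a string so the example is self-contained; in practice
# this would be a file read from the model's configuration directory.
CSV_TEXT = """ptype,M,N,H
1,0.79647,0.09368,0.10985
2,0.61678,0.25757,0.12565
3,0.69229,0.15641,0.15130
4,0.00000,0.67169,0.32831
5,0.00000,0.54295,0.45705
6,0.77609,0.06004,0.16387
7,0.68514,0.09144,0.22342
8,0.14056,0.06512,0.79432
"""

proportions = pd.read_csv(io.StringIO(CSV_TEXT), index_col="ptype")

# Each row should be a probability distribution over M, N, H.
row_sums = proportions.sum(axis=1)
assert ((row_sums - 1.0).abs() < 1e-6).all()

# Look up the proportions for one person type by label.
print(proportions.loc[4])
```

Indexing the table by `ptype` removes the need for the placeholder first row that the zero-based Java array requires.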

bstabler commented 7 years ago

We implemented CDAP in a slightly different way than described in the first comment. See https://github.com/UDST/activitysim/wiki/Project-Meeting-2016.09.23

wusun2 commented 7 years ago

The SANDAG model has the same proportions. It seems that in each sub-array, the numbers are the probabilities of choosing M, N, and H for a member of a 6+ person household? Does this mean the code handles households of up to 14 persons (5 + 9 brackets in the array)? What does {0.0, 0.0, 0.0} represent?

toliwaga commented 7 years ago

I believe that all extra household members are treated the same, and the choice depends only on ptype. So this is a three-column (M, N, H) array indexed by ptype.

The first row is blank because the array is zero-indexed and the first ptype is 1.

And so it will handle households of any size.
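
To illustrate why household size does not matter, here is a hedged sketch of a vectorized fixed-proportion choice that depends only on ptype. The function name `choose_extra_member_activities` and the cumulative-probability sampling approach are my own choices for illustration, not the MTC TM1 or ActivitySim code.

```python
import numpy as np

ACTIVITIES = np.array(["M", "N", "H"])

# Row 0 is a placeholder so that the row index equals ptype (1..8),
# mirroring the zero-based Java array quoted earlier in the thread.
PROPORTIONS = np.array([
    [0.0, 0.0, 0.0],
    [0.79647, 0.09368, 0.10985],
    [0.61678, 0.25757, 0.12565],
    [0.69229, 0.15641, 0.15130],
    [0.00000, 0.67169, 0.32831],
    [0.00000, 0.54295, 0.45705],
    [0.77609, 0.06004, 0.16387],
    [0.68514, 0.09144, 0.22342],
    [0.14056, 0.06512, 0.79432],
])


def choose_extra_member_activities(ptypes, rng):
    """Draw M/N/H independently for each extra member, by ptype only."""
    ptypes = np.asarray(ptypes)
    # One row of cumulative probabilities per person.
    cum_probs = PROPORTIONS[ptypes].cumsum(axis=1)
    draws = rng.random(len(ptypes))
    # Count how many cumulative thresholds each draw has reached;
    # that count is the index of the chosen alternative.
    choice_idx = (draws[:, None] >= cum_probs).sum(axis=1)
    # Guard against floating-point round-off in the last cumulative value.
    choice_idx = np.minimum(choice_idx, len(ACTIVITIES) - 1)
    return ACTIVITIES[choice_idx]


rng = np.random.default_rng(0)
# Extra members pooled from any number of households of any size.
choices = choose_extra_member_activities([4, 4, 5, 1, 8, 2], rng)
print(choices)
```

Because each draw uses only the person's own ptype, the extra members of all households can be processed in a single call.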

wusun2 commented 7 years ago

@toliwaga this makes sense to me now. Thanks.

toliwaga commented 7 years ago

I am running the full dataset, and CDAP completed successfully in just under an hour.

cdap_activity       H        M        N      All
ptype
1              181421  2012280   226252  2419953
2               47603   320675   178281   546559
3               31654   303936    87907   423497
4              145474        0  1125932  1271406
5              146983        0   636116   783099
6               16462   155338    20269   192069
7               65168   711504   113030   889702
8               68160   324413   134476   527049
All            702925  3828146  2522263  7053334
Time to execute step 'cdap_simulate': 3521.04 s
Total time to execute iteration 1 with iteration value None: 3521.04 s
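
The summary above has the shape of a pandas cross-tabulation with margins. A minimal sketch of how such a table could be produced, assuming a persons table with `ptype` and `cdap_activity` columns; the tiny data set here is made up, and this is not necessarily the code used for the run:

```python
import pandas as pd

# Made-up persons table; only the column names follow the output above.
persons = pd.DataFrame({
    "ptype":         [1, 1, 1, 4, 4, 5],
    "cdap_activity": ["M", "M", "H", "N", "H", "N"],
})

# Counts of each activity by person type, with "All" row and column totals.
summary = pd.crosstab(persons["ptype"], persons["cdap_activity"], margins=True)
print(summary)
```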

bstabler commented 7 years ago

This is great. We will be able to run this quite quickly once we thread and/or distribute it.

toliwaga commented 7 years ago

I made a tweak that reduced the runtime from 60 minutes to 52 minutes. That is probably good enough for now.