Closed: bstabler closed this issue 8 years ago
According to the documentation for applyModelForExtraHhMembers in mtctm2.abm.ctramp HouseholdCoordinatedDailyActivityPatternModel.java, choices for extra household members should be made according to a table of fixed proportions by person type:
* Applies a simple choice from fixed proportions by person type for members of
* households with more than 5 people who are not included in the CDAP model. The
* choices of the additional household members are independent of each other.
Why does it do this rather than choosing activities for the extra members by applying the individual utilities, ignoring the interaction utilities?
It is easy for me to implement this either way...
The short answer is that this is how the model was estimated. Here are the hard-wired CDAP 6+ person proportions from MTC TM1. The rows are person types (the 2nd row is person type 1) and the columns are M, N, H. We should move this into an input CSV table.
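// Rows are indexed by ptype (row 0 is unused); columns are M, N, H.
public final double[][] CDAP_6_PLUS_PROPORTIONS = {
    { 0.0, 0.0, 0.0 },             // index 0: placeholder, no ptype 0
    { 0.79647, 0.09368, 0.10985 }, // ptype 1
    { 0.61678, 0.25757, 0.12565 }, // ptype 2
    { 0.69229, 0.15641, 0.15130 }, // ptype 3
    { 0.00000, 0.67169, 0.32831 }, // ptype 4
    { 0.00000, 0.54295, 0.45705 }, // ptype 5
    { 0.77609, 0.06004, 0.16387 }, // ptype 6
    { 0.68514, 0.09144, 0.22342 }, // ptype 7
    { 0.14056, 0.06512, 0.79432 }  // ptype 8
};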
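One way the move to an input CSV table might look on the Python side is sketched below. This is only an illustration: the column layout (`ptype,M,N,H`), the `load_proportions` helper, and the inline CSV text are assumptions made for this example, not an existing ActivitySim file or function. The values are copied from the CDAP_6_PLUS_PROPORTIONS array.

```python
import csv
import io

# Hypothetical CSV layout: one row per person type, one column per activity.
# Values are copied from the hard-wired CDAP_6_PLUS_PROPORTIONS array (ptypes 1-8).
CDAP_6_PLUS_CSV = """ptype,M,N,H
1,0.79647,0.09368,0.10985
2,0.61678,0.25757,0.12565
3,0.69229,0.15641,0.15130
4,0.00000,0.67169,0.32831
5,0.00000,0.54295,0.45705
6,0.77609,0.06004,0.16387
7,0.68514,0.09144,0.22342
8,0.14056,0.06512,0.79432
"""

ACTIVITIES = ("M", "N", "H")


def load_proportions(handle):
    """Return {ptype: (pM, pN, pH)} and check that each row sums to 1."""
    table = {}
    for row in csv.DictReader(handle):
        probs = tuple(float(row[a]) for a in ACTIVITIES)
        if abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError(
                f"ptype {row['ptype']}: proportions sum to {sum(probs)}"
            )
        table[int(row["ptype"])] = probs
    return table


# In practice the handle would be an open file; StringIO keeps the sketch
# self-contained.
proportions = load_proportions(io.StringIO(CDAP_6_PLUS_CSV))
```

Keying the table by ptype removes the need for the placeholder first row that the zero-based Java array requires.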
We implemented CDAP in a slightly different way than described in the first comment. See https://github.com/UDST/activitysim/wiki/Project-Meeting-2016.09.23
The SANDAG model has the same proportions. It seems that in each sub-array the numbers are the probabilities of choosing M, N, and H for a member of a household with 6 or more people? Does this mean the code handles households of up to 14 people (5 + the 9 brackets in the array)? What does {0.0, 0.0, 0.0} represent?
I believe that all extra household members are treated the same, and the choice depends only on ptype. So this is a three-column (M, N, H) array indexed by ptype.
The first row is blank because the array is zero-indexed and the first ptype is 1.
And so it will handle households of any size.
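To make the lookup concrete, here is a minimal Python sketch of the fixed-proportion choice as described in this thread: each extra member's activity is drawn independently, using only that member's ptype as the row index. The function name `choose_extra_member_activity` and the example household are hypothetical, and this is not the ActivitySim or CT-RAMP implementation.

```python
import random

# Same layout as the Java array: row 0 is an unused placeholder so that
# ptype (1-8) can be used directly as the row index. Columns are M, N, H.
CDAP_6_PLUS_PROPORTIONS = [
    (0.0, 0.0, 0.0),
    (0.79647, 0.09368, 0.10985),
    (0.61678, 0.25757, 0.12565),
    (0.69229, 0.15641, 0.15130),
    (0.00000, 0.67169, 0.32831),
    (0.00000, 0.54295, 0.45705),
    (0.77609, 0.06004, 0.16387),
    (0.68514, 0.09144, 0.22342),
    (0.14056, 0.06512, 0.79432),
]
ACTIVITIES = ("M", "N", "H")


def choose_extra_member_activity(ptype, rng):
    """Draw M, N, or H for one extra household member from the fixed shares."""
    if not 1 <= ptype < len(CDAP_6_PLUS_PROPORTIONS):
        raise ValueError(f"unknown ptype: {ptype}")
    weights = CDAP_6_PLUS_PROPORTIONS[ptype]
    return rng.choices(ACTIVITIES, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only so the sketch is repeatable
# Hypothetical ptypes of the members beyond the fifth in one large household.
extra_ptypes = [4, 5, 8]
choices = [choose_extra_member_activity(p, rng) for p in extra_ptypes]
```

Because the draw depends only on ptype, the same function works for the sixth member or the twentieth, which is why household size is not limited by the number of rows in the array.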
@toliwaga this makes sense to me now. Thanks.
I am running the full dataset and cdap completed successfully in just under an hour.
cdap_activity        H        M        N      All
ptype
1               181421  2012280   226252  2419953
2                47603   320675   178281   546559
3                31654   303936    87907   423497
4               145474        0  1125932  1271406
5               146983        0   636116   783099
6                16462   155338    20269   192069
7                65168   711504   113030   889702
8                68160   324413   134476   527049
All             702925  3828146  2522263  7053334
Time to execute step 'cdap_simulate': 3521.04 s
Total time to execute iteration 1 with iteration value None: 3521.04 s
This is great. We will be able to run this quite quickly once we thread and/or distribute.
I made a tweak that reduced the runtime from 60 minutes to 52 minutes. That is probably good enough for now.
CDAP is supposed to only consider the interactions between up to 5 HH members and then apply some additional utility terms after that. We will review the MTC TM1 code and UECs and correct this. CDAP also currently loops by HHs but this is inefficient.
We will likely re-implement it as a series of batch vectorized calculations:
The key is to organize the problem into a series of batch table operations
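As one illustration of what a batch table operation could look like here, the sketch below splits a person table in a single pass into the members that enter the interaction model (up to 5 per household) and the extra members that get the fixed-proportion choice. Plain Python tuples stand in for a real table library, the person rows are made up, and ranking members by their order in the table is an assumption; the thread does not say how MTC TM1 selects the five modelled members.

```python
from collections import defaultdict

MAX_CDAP_HH_SIZE = 5  # members modelled with interaction terms, per the thread

# Hypothetical person table: (household_id, person_id, ptype), one row each.
persons = [
    (1, 101, 1), (1, 102, 2),
    (2, 201, 1), (2, 202, 1), (2, 203, 4), (2, 204, 6),
    (2, 205, 7), (2, 206, 8), (2, 207, 8),
]


def split_by_household_rank(rows, max_size=MAX_CDAP_HH_SIZE):
    """Split the table into (modelled, extra) rows by rank within household.

    Rank is assumed to be the order of appearance in the table.
    """
    seen = defaultdict(int)  # running member count per household
    modelled, extra = [], []
    for hh_id, person_id, ptype in rows:
        seen[hh_id] += 1
        target = modelled if seen[hh_id] <= max_size else extra
        target.append((hh_id, person_id, ptype))
    return modelled, extra


modelled, extra = split_by_household_rank(persons)
# modelled: all of household 1 plus the first five members of household 2
# extra:    persons 206 and 207, handled by the fixed proportions
```

Once the table is split this way, each subset can be processed as a whole rather than household by household.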