Some initial steps:
Config updates: https://github.com/ActivitySim/activitysim-prototype-mtc/pull/3

Code updates: https://github.com/ActivitySim/activitysim/pull/806
I looked into the memory usage of the vehicle type model in non-Sharrow mode. When running the MTC extended model with 25% of the population, the `interaction_df` (the joined data frame of choosers and alternatives) of the first vehicle choice uses 212 GB of RAM, which explains why we got a memory error when running the 100% population.
Below is a table of the memory taken by each column in `interaction_df`. The string columns are already converted to pandas categoricals. No single column stands out as memory-intensive; there are simply too many columns in this table, and they add up. Removing columns that are not used in the utility calculation will help reduce memory.

Column | Dtype | Memory (GB)
-- | -- | --
Total |  | 212.1
index | int64 | 2.4
body_type_Car | uint8 | 0.3
body_type_Motorcycle | uint8 | 0.3
body_type_Pickup | uint8 | 0.3
body_type_SUV | uint8 | 0.3
body_type_Van | uint8 | 0.3
age_1 | uint8 | 0.3
age_10 | uint8 | 0.3
age_11 | uint8 | 0.3
age_12 | uint8 | 0.3
age_13 | uint8 | 0.3
age_14 | uint8 | 0.3
age_15 | uint8 | 0.3
age_16 | uint8 | 0.3
age_17 | uint8 | 0.3
age_18 | uint8 | 0.3
age_19 | uint8 | 0.3
age_2 | uint8 | 0.3
age_20 | uint8 | 0.3
age_3 | uint8 | 0.3
age_4 | uint8 | 0.3
age_5 | uint8 | 0.3
age_6 | uint8 | 0.3
age_7 | uint8 | 0.3
age_8 | uint8 | 0.3
age_9 | uint8 | 0.3
fuel_type_BEV | uint8 | 0.3
fuel_type_Diesel | uint8 | 0.3
fuel_type_Gas | uint8 | 0.3
fuel_type_Hybrid | uint8 | 0.3
fuel_type_PEV | uint8 | 0.3
body_type | category | 0.3
age | int32 | 1.2
fuel_type | category | 0.3
vehicle_year | int64 | 2.4
NumMakes | int64 | 2.4
NumModels | int64 | 2.4
MPG | float64 | 2.4
Range | int64 | 2.4
NewPrice | float64 | 2.4
auto_operating_cost | float64 | 2.4
co2gpm | float64 | 2.4
vehicle_type | category | 0.6
household_id | int64 | 2.4
vehicle_num | int64 | 2.4
home_zone_id | int64 | 2.4
income | int64 | 2.4
hhsize | int64 | 2.4
HHT | int64 | 2.4
auto_ownership | int32 | 1.2
num_workers | int64 | 2.4
sample_rate | float64 | 2.4
income_in_thousands | float64 | 2.4
income_segment | int32 | 1.2
median_value_of_time | float64 | 2.4
hh_value_of_time | float64 | 2.4
num_non_workers | int64 | 2.4
num_drivers | int8 | 0.3
num_adults | int8 | 0.3
num_children | int8 | 0.3
num_young_children | int8 | 0.3
num_children_5_to_15 | int8 | 0.3
num_children_16_to_17 | int8 | 0.3
num_college_age | int8 | 0.3
num_young_adults | int8 | 0.3
non_family | bool | 0.3
family | bool | 0.3
home_is_urban | bool | 0.3
home_is_rural | bool | 0.3
hh_work_auto_savings_ratio | float32 | 1.2
DISTRICT | int64 | 2.4
SD | int64 | 2.4
county_id | int64 | 2.4
TOTHH | int64 | 2.4
TOTPOP | int64 | 2.4
TOTACRE | float64 | 2.4
RESACRE | float64 | 2.4
CIACRE | float64 | 2.4
TOTEMP | int64 | 2.4
AGE0519 | int64 | 2.4
RETEMPN | int64 | 2.4
FPSEMPN | int64 | 2.4
HEREMPN | int64 | 2.4
OTHEMPN | int64 | 2.4
AGREMPN | int64 | 2.4
MWTEMPN | int64 | 2.4
PRKCST | float64 | 2.4
OPRKCST | float64 | 2.4
area_type | int64 | 2.4
HSENROLL | float64 | 2.4
COLLFTE | float64 | 2.4
COLLPTE | float64 | 2.4
TOPOLOGY | int64 | 2.4
TERMINAL | float64 | 2.4
household_density | float64 | 2.4
employment_density | float64 | 2.4
density_index | float64 | 2.4
is_cbd | bool | 0.3
TOTENR_univ | float64 | 2.4
ext_work_share | float64 | 2.4
RETEMPN_scaled | float64 | 2.4
FPSEMPN_scaled | float64 | 2.4
HEREMPN_scaled | float64 | 2.4
OTHEMPN_scaled | float64 | 2.4
AGREMPN_scaled | float64 | 2.4
MWTEMPN_scaled | float64 | 2.4
TOTEMP_scaled | float64 | 2.4
auPkRetail | float64 | 2.4
auPkTotal | float64 | 2.4
auOpRetail | float64 | 2.4
auOpTotal | float64 | 2.4
trPkRetail | float64 | 2.4
trPkTotal | float64 | 2.4
trOpRetail | float64 | 2.4
trOpTotal | float64 | 2.4
nmRetail | float64 | 2.4
nmTotal | float64 | 2.4
already_owned_veh | category | 0.6
total_hh_dist_to_work | float32 | 1.2
total_hh_dist_to_work_cap | float64 | 2.4
avg_hh_dist_to_work | float32 | 1.2
hh_per_mi | float64 | 2.4
hh_veh_gt_drivers | int32 | 1.2
num_hh_veh_owned | float64 | 2.4
num_hh_Van | float64 | 2.4
num_hh_SUV | float64 | 2.4
num_hh_Pickup | float64 | 2.4
num_hh_Motorcycle | float64 | 2.4
num_hh_Hybrid | float64 | 2.4
num_hh_BEV | float64 | 2.4
num_hh_PEV | float64 | 2.4
num_hh_EV | float64 | 2.4

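
For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of how a per-column memory table can be produced with pandas, and how dropping unused columns shrinks the frame. It is not the ActivitySim code path: the helper names `column_memory_bytes` and `keep_used_columns`, the toy data frame, and the `used_columns` set are all hypothetical, and in practice the list of used columns would have to be derived from the utility spec.

```python
import numpy as np
import pandas as pd


def column_memory_bytes(df: pd.DataFrame) -> pd.Series:
    """Return bytes used by each column (and the index), largest first."""
    # deep=True also counts the payload of object and categorical columns.
    return df.memory_usage(index=True, deep=True).sort_values(ascending=False)


def keep_used_columns(df: pd.DataFrame, used_columns: set) -> pd.DataFrame:
    """Keep only the columns named in used_columns, preserving column order."""
    return df.loc[:, [c for c in df.columns if c in used_columns]]


# Toy stand-in for interaction_df; the real table is far larger.
n_rows = 1_000
toy = pd.DataFrame(
    {
        "income": np.arange(n_rows, dtype="int64"),
        "num_drivers": np.ones(n_rows, dtype="int8"),
        "TOTACRE": np.zeros(n_rows, dtype="float64"),
        "body_type": pd.Categorical(["Car", "SUV"] * (n_rows // 2)),
    }
)

before = column_memory_bytes(toy)

# Hypothetical: pretend the utility expressions reference only these columns.
used_columns = {"income", "body_type"}
slim = keep_used_columns(toy, used_columns)
after = column_memory_bytes(slim)

print(before)
print(f"total before: {before.sum()} bytes, after: {after.sum()} bytes")
```

The 8:1 ratio between the `int64`/`float64` columns (2.4 GB) and the `uint8`/`int8`/`bool` columns (0.3 GB) in the table is the same ratio this sketch shows between `income` and `num_drivers`.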
When running the MTC extended model with full-size zones and the full-size population in non-Sharrow mode, the model failed in the vehicle type choice model with a memory error on a Windows machine with 512 GB of RAM; in Sharrow mode, the model completed the vehicle type choice model in 17 hours.
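
As a rough sanity check on the numbers in this thread, the sketch below scales the 25% measurement up to the full population. It assumes that `interaction_df` memory grows linearly with the number of choosers (the alternatives stay the same), which is an assumption rather than something measured here; the variable names are made up for illustration.

```python
# Figures reported in this thread.
sample_fraction = 0.25          # share of the population in the measured run
interaction_df_gb = 212.1       # table total at the 25% sample
machine_ram_gb = 512            # RAM on the Windows machine

# Assumption: rows (and therefore memory) scale linearly with the population.
estimated_full_gb = interaction_df_gb / sample_fraction
shortfall_gb = estimated_full_gb - machine_ram_gb

print(f"estimated interaction_df at 100%: {estimated_full_gb:.1f} GB")
print(f"exceeds available RAM by about {shortfall_gb:.1f} GB")
```

Under that assumption the full-population `interaction_df` alone would need roughly 848 GB, well above the 512 GB available, before counting anything else held in memory.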
Need to improve: