SANDAG / ABM

Sandag ABM
https://github.com/SANDAG/ABM/wiki

Parking costs applied to all Mobility Hub MGRAs runs out of memory #132

Closed by pmadhav-usfca 2 months ago

pmadhav-usfca commented 2 months ago
WARNING - MemoryError exception running parking_location model: Unable to allocate 65.0 GiB for an array with shape (51, 171019212) and data type int64
ERROR - mp_tasks - mp_households_38 - MemoryError exception caught in mp_run_simulation: Unable to allocate 65.0 GiB for an array with shape (51, 171019212) and data type int64
ERROR - 
---
Traceback (most recent call last):
  File "\\jupiter\abm\dev\activitysim\activitysim\core\mp_tasks.py", line 930, in mp_run_simulation
    run_simulation(queue, step_info, resume_after, shared_data_buffer)
  File "\\jupiter\abm\dev\activitysim\activitysim\core\mp_tasks.py", line 882, in run_simulation
    raise e
  File "\\jupiter\abm\dev\activitysim\activitysim\core\mp_tasks.py", line 879, in run_simulation
    pipeline.run_model(model)
  File "\\jupiter\abm\dev\activitysim\activitysim\core\pipeline.py", line 529, in run_model
    orca.run([step_name])
  File "C:\Anaconda3\envs\asim_baydag\lib\site-packages\orca\orca.py", line 2177, in run
    step()
  File "C:\Anaconda3\envs\asim_baydag\lib\site-packages\orca\orca.py", line 973, in __call__
    return self._func(**kwargs)
  File "\\jupiter\abm\dev\activitysim\activitysim\abm\models\parking_location_choice.py", line 330, in parking_location
    parking_locations, save_sample_df = run_parking_destination(
  File "\\jupiter\abm\dev\activitysim\activitysim\abm\models\parking_location_choice.py", line 246, in run_parking_destination
    choices, destination_sample = choose_parking_location(
  File "\\jupiter\abm\dev\activitysim\activitysim\abm\models\parking_location_choice.py", line 174, in choose_parking_location
    destination_sample = logit.interaction_dataset(
  File "\\jupiter\abm\dev\activitysim\activitysim\core\logit.py", line 318, in interaction_dataset
    alts_sample = alternatives.take(sample).copy()
  File "C:\Anaconda3\envs\asim_baydag\lib\site-packages\pandas\core\generic.py", line 6368, in copy
    data = self._mgr.copy(deep=deep)
  File "C:\Anaconda3\envs\asim_baydag\lib\site-packages\pandas\core\internals\managers.py", line 649, in copy
    res = self.apply("copy", deep=deep)
  File "C:\Anaconda3\envs\asim_baydag\lib\site-packages\pandas\core\internals\managers.py", line 352, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\Anaconda3\envs\asim_baydag\lib\site-packages\pandas\core\internals\blocks.py", line 549, in copy
    values = values.copy()
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 65.0 GiB for an array with shape (51, 171019212) and data type int64
aletzdy commented 2 months ago

It looks like the number of alternatives almost quadruples compared to the original scenario. I suggest segmenting the choosers table in the parking location choice model. We currently assign a single value to the parking_segment column for all rows, so choose_parking_location runs for all choosers at once.

To start with, we can split the choosers into 2 arbitrary groups (the first and second half of the table) by defining parking_segment in the model's preprocessor as:

['segment_1' if i < len(df)/2 else 'segment_2' for i in range(len(df))]

and see whether the model runs. We can then test with 3-5 segments.
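
The two-way split above generalizes to any number of segments. Here is a minimal sketch, assuming contiguous blocks of near-equal size by row position are acceptable. The helper name `segment_labels` is hypothetical and not part of ActivitySim:

```python
def segment_labels(n_rows, n_segments):
    """Return one label per row, 'segment_1' .. 'segment_<n_segments>',
    assigned in contiguous blocks of near-equal size by row position.

    Hypothetical helper for illustration; not an ActivitySim function.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    # i * n_segments // n_rows maps row position i to a block index
    # in the range 0 .. n_segments - 1.
    return [f"segment_{i * n_segments // n_rows + 1}" for i in range(n_rows)]


# With two segments this reproduces the list comprehension above.
n = 7
two_way = ['segment_1' if i < n / 2 else 'segment_2' for i in range(n)]
assert segment_labels(n, 2) == two_way

# Three segments over ten rows.
labels = segment_labels(10, 3)
print(labels)
```

The split is positional rather than random. That should not matter here, since the goal is only to cap how many choosers are processed at once.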

Tagging @dhensle to take a look at this issue.

aletzdy commented 2 months ago

Adding further to the suggested solution above:

If we want to segment the parking location model, we also need to update parking_location_choice.csv to reflect the new segment names. The spec currently has only a no_segmentation column, which matches the parking_segment value set in the preprocessor. With my suggestion, we would replace no_segmentation with segment_1 and segment_2 columns and use the same coefficient values in every segment, so that the results do not differ from the unsegmented approach.
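
The spec change could be scripted roughly as follows. This is a sketch only: the `Label,Description,Expression` layout and the two example rows are assumptions for illustration, not the contents of the real parking_location_choice.csv, and `split_segment_column` is a hypothetical helper.

```python
import csv
import io


def split_segment_column(spec_text, old="no_segmentation",
                         new=("segment_1", "segment_2")):
    """Replace the `old` coefficient column with one copy per name in `new`."""
    reader = csv.DictReader(io.StringIO(spec_text))
    if old not in reader.fieldnames:
        raise KeyError(f"column {old!r} not found in spec")

    # Put the new segment columns where the old column used to be.
    fieldnames = []
    for name in reader.fieldnames:
        fieldnames.extend(new if name == old else [name])

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        coefficient = row.pop(old)
        for name in new:
            row[name] = coefficient  # same coefficient in every segment
        writer.writerow(row)
    return out.getvalue()


# Illustrative spec rows (assumed, not taken from the real file).
example_spec = (
    "Label,Description,Expression,no_segmentation\n"
    "util_walk_distance,Walk distance,walk_distance,coef_walk_distance\n"
    "util_parking_cost,Parking cost,parking_cost,coef_parking_cost\n"
)
updated_spec = split_segment_column(example_spec)
print(updated_spec)
```

Because every segment column carries the same coefficient, the utilities are the same as in the unsegmented spec. Only the number of choosers handled per call changes.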

dhensle commented 2 months ago

We already substantially reduced the memory needed for the parking location choice model as part of the Phase 9 work with the consortium. You can see the change here: https://github.com/ActivitySim/activitysim/pull/849/commits/1808704798c5dc5c51098ef80d65cde4b6a531b1

The memory ballooned because this model merged the subset of the landuse table that contains the parking zones with the trips table, and the merge included every single column from both tables. However, we only need a small subset of those columns to calculate the utilities. The commit above searches the utility spec and removes the unused columns.
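
To illustrate the idea (this is not the code from the commit), the sketch below keeps only the columns whose names appear in the spec expressions and estimates the size of the int64 block from the log before and after pruning. The column names, expressions, and both helper functions are hypothetical.

```python
import re


def columns_used(expressions, available_columns):
    """Return the available columns whose names appear as identifiers
    in at least one spec expression (order of available_columns is kept)."""
    tokens = set()
    for expression in expressions:
        tokens.update(re.findall(r"[A-Za-z_]\w*", expression))
    return [column for column in available_columns if column in tokens]


def array_gib(n_columns, n_rows, itemsize=8):
    """Size in GiB of a dense array; itemsize=8 corresponds to int64."""
    return n_columns * n_rows * itemsize / 2**30


# Hypothetical spec expressions and merged-table columns.
spec_expressions = ["df.parking_cost * 2", "df.walk_distance > 0.5"]
merged_columns = ["parking_cost", "walk_distance", "emp_total", "hh_count"]
keep = columns_used(spec_expressions, merged_columns)
print(keep)

# Shape (51, 171019212) is taken from the MemoryError in the log.
n_rows = 171_019_212
full_gib = array_gib(51, n_rows)  # rounds to 65.0 GiB, matching the log
pruned_gib = array_gib(len(keep), n_rows)
print(f"{full_gib:.1f} GiB with 51 columns, {pruned_gib:.1f} GiB with {len(keep)}")
```

The 65.0 GiB in the error is just 51 columns times about 171 million rows times 8 bytes, so memory for that block scales linearly with the number of columns carried through the merge.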

I made an analogous commit to the BayDAG_estimation branch here: https://github.com/SANDAG/activitysim/commit/e4dc5ac747f09121e617e314a08be3cf70a102bf

bhargavasana commented 2 months ago

Fixed by BayDAG updates, updates to land use prep tool (https://github.com/SANDAG/landuse_prep_tool/commit/212ba61bca3eef77c99f6b06e99e2ead1e9c8fff) and commit 61aa153