ARC Memory Issue

danielsclint commented 3 years ago

When testing a migration for ARC to ActivitySim v1.0, the model is hitting a memory error. Any thoughts on a resolution?

The model runs fine with a smaller set of TAZs.

danielsclint commented 3 years ago

@bstabler: Running the ARC model in 'training' mode worked, and the model ran to completion. It also created a chunk_cache.csv in the output folder.

I started a second run with the chunk_cache.csv and chunk_training_mode: production. When the model starts, it immediately deletes the contents of the output folder, including the chunk_cache.csv. Is this the expected behavior?

In addition, the documentation says that chunk_training_mode should be set to production AND the desired num_processors and chunk_size should be set as well. Isn't the training supposed to come up with those, so I don't need to set them in the settings.yaml?

danielsclint commented 3 years ago

Okay, so apparently, the chunk_cache.csv file needs to be moved to the cache folder in the outputs directory. At a minimum, the documentation should be updated.
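
For readers who want to script that step, here is a minimal sketch of copying a chunk_cache.csv from a training run into the cache folder of the output directory. The function name and the example paths are hypothetical; only the `cache` folder and the `chunk_cache.csv` file name come from this thread, and this is not part of ActivitySim itself.

```python
import shutil
from pathlib import Path


def stage_chunk_cache(trained_cache: Path, output_dir: Path) -> Path:
    """Copy a chunk_cache.csv from a training run into <output_dir>/cache."""
    # The cache folder inside the output directory, as described in this thread.
    cache_dir = output_dir / "cache"
    cache_dir.mkdir(parents=True, exist_ok=True)

    target = cache_dir / "chunk_cache.csv"
    # copy2 keeps the file's timestamps along with its contents.
    shutil.copy2(trained_cache, target)
    return target
```

A call might look like `stage_chunk_cache(Path("training_run/chunk_cache.csv"), Path("output"))`, where both paths are placeholders for your own directories.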

stefancoe commented 3 years ago

@danielsclint & @bstabler - I ran the PSRC model in 'training' mode and it created the 'chunk_cache.csv' file in the 'cache' folder in the outputs directory. I then re-ran the model in 'production' mode using the same output directory and it did not delete the 'cache' directory nor the 'chunk_cache.csv' file and ran as expected.

danielsclint commented 3 years ago

I can confirm the behavior that Stefan mentioned. I'm able to generate the chunk_cache.csv file and maintain it through time. I'm still running into a memory error in the ARC ActivitySim implementation in work location choice. In the images below, I ran the model through work location in 'training' mode, and it worked (albeit a little slow). I then changed the chunk_training_mode to 'production' and commented out the num_processors and chunk_size. The log files indicate it found the chunk_cache.csv.

However, when I get to the workplace location model, ActivitySim blows through the top of the RAM (see images).

bstabler commented 3 years ago

You need to set num_processors and chunk_size. Try setting num_processors and chunk_size to something like 80% of available processors and machine RAM and also set MKL_NUM_THREADS=1.
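
As a rough illustration of the 80% rule of thumb above, the sketch below derives candidate num_processors and chunk_size values from a machine's core count and RAM. The helper name is hypothetical, the example machine (48 cores, 528 GB) is the WSP server mentioned later in this thread, and it assumes chunk_size is expressed in bytes and that "GB" means decimal gigabytes; check both against the ActivitySim documentation for your version.

```python
import os


def suggest_resources(total_cores: int, total_ram_bytes: int, percent: int = 80) -> dict:
    """Return num_processors and chunk_size at `percent` of the machine's resources."""
    return {
        # Integer arithmetic avoids floating-point rounding surprises.
        "num_processors": max(1, total_cores * percent // 100),
        # Assumption: chunk_size is given in bytes.
        "chunk_size": total_ram_bytes * percent // 100,
    }


# Limit MKL to one thread per process. This only takes effect if it is set
# before NumPy (or another MKL-backed library) is first imported.
os.environ["MKL_NUM_THREADS"] = "1"

# Example machine: 48 cores and 528 GB of RAM (decimal gigabytes assumed).
settings = suggest_resources(48, 528 * 10**9)
for key, value in settings.items():
    print(f"{key}: {value}")
```

The printed lines can be pasted into settings.yaml next to chunk_training_mode. Treat the numbers as a starting point to tune rather than a guaranteed fit.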

guyrousseau commented 3 years ago

@danielsclint was this test performed on an ARC server or a WSP server? I don't recognize the APPS directory name, so I assume this test was on a WSP server? If so, are those specs similar to the specs from our ARC servers?

danielsclint commented 3 years ago

@guyrousseau - I missed this earlier. The tests are being run on a WSP server. Raghu and Jonathan have been running similar tests on the ARC servers. I'm not sure of the specs of the ARC servers. The WSP server has 48 cores and 528 GB of memory.

danielsclint commented 3 years ago

This issue has been resolved with my latest tests and chunking. Setting the num_processors and chunk_size to about 80% of available resources helped.

guyrousseau commented 3 years ago

Thanks @danielsclint, to answer your question, ARC server specs are listed here: https://github.com/ActivitySim/activitysim/wiki/Server-Specs

wusun2 commented 3 years ago

@guyrousseau and @danielsclint, have you tested ActivitySim on both ARC's server (48 physical cores, 512 GB RAM) and RSG's server (32 cores, 512 GB RAM)? If so, were there any performance differences?

danielsclint commented 3 years ago

Hi @wusun2 - We just finished the proof-of-concept deployment of ActivitySim version 1.0.2 on the WSP server for ARC. The latest chunking enhancements are an improvement. In this Google Spreadsheet, I’ve summarized the runtime results from three runs conducted this summer. All runs are multi-processor on a 100 percent sample for the ARC region on our 48-core / 528 GB RAM machine at WSP. The run on July 9 used the ActivitySim v1.0.1 release (before chunking improvements), and this run took nearly 12 hours to complete. Approximately 40 percent of the time was spent in tour scheduling and nearly another 40 percent was spent in trip_destination_choice.

On the same machine, I ran two additional runs this week with v1.0.2 (or its pre-release on develop). The first run used the exact same model setup as the July 9 run, with only the ActivitySim library updated to include the chunking enhancements. This run took just over 5 hours to complete, a 55 percent reduction in run time. The largest runtime savings came in the three models mentioned above. The second run implemented tour_departure_and_duration_segments.csv, which reduces the number of logsum calculations in mandatory tour scheduling. This dropped the runtime a further 8 percent to under 5 hours, with all of the time savings accruing in the mandatory_tour_scheduling component of the model.
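
To make the percentages concrete, here is a small sketch of the arithmetic. The exact runtimes are in the spreadsheet and are not quoted in this thread, so the hour values below are hypothetical placeholders, chosen only to be consistent with "nearly 12 hours" and "just over 5 hours".

```python
def percent_reduction(before_hours: float, after_hours: float) -> float:
    """Percentage drop in runtime going from `before_hours` to `after_hours`."""
    return 100.0 * (before_hours - after_hours) / before_hours


# Hypothetical placeholder runtimes, NOT the measured values.
baseline_hours = 11.5   # v1.0.1 run, "nearly 12 hours"
chunked_hours = 5.2     # v1.0.2 run, "just over 5 hours"

# Applying the reported further 8 percent drop to the second run.
segmented_hours = chunked_hours * (1 - 0.08)

print(f"chunking: {percent_reduction(baseline_hours, chunked_hours):.1f}% faster")
print(f"with segments: {segmented_hours:.2f} hours")
```

With these placeholders, the first reduction comes out near the reported 55 percent and the second run lands under 5 hours, which matches the description above.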

wusun2 commented 3 years ago

@danielsclint, this is very helpful. It is encouraging to see that the recent improvements (which seem to go beyond just chunking) resulted in better performance. @esanchez01's tests proved that SANDAG's best server, with 320 GB RAM, doesn't perform as well as WSP's and RSG's ~500 GB RAM servers. We plan to purchase similar or even better servers. Lastly, is the tour departure and duration segments improvement included in v1.0.2 on the develop branch? And the master branch will include both the v1.0.2 improvements and the departure and duration segments, right?

bstabler commented 3 years ago

@wusun2 - yes, the tour departure and duration segments feature is in the develop (and now master) branch.

ActivitySim / activitysim

ARC Memory Issue #428