ActivitySim / activitysim-prototype-mtc

The canonical prototype MTC model.
BSD 3-Clause "New" or "Revised" License

Full Scale Performance: Sharrow On #12

Open dhensle opened 7 months ago

dhensle commented 7 months ago

This is the issue to report on memory usage and runtime performance when using sharrow...

dhensle commented 7 months ago

First ran sharrow compile with the following settings:

Run completed in 76 minutes. log_sh_compile.zip

Then ran in production mode

Run completed in 7.7 hours with a memory peak at about 163 GB in trip destination. image

logs_sh_full.zip

Followed by multiprocessing

Run completed in 110 minutes (1.8 hours). log_sh_full_mp.zip

dhensle commented 5 months ago

Ran with 100% households and sharrow on, single process.

Run completed in 1090.3 minutes (18.2 hours). This is much longer than the 7.7 hours reported above.

Current run was performed using PR #867 commit c9d4205.

image log.zip

Timing statements comparing the old run above to this current run show large differences mainly in the destination models: image

Will try again with the main branch of ActivitySim instead of PR 867 to see if that makes a difference.

dhensle commented 5 months ago

Ran using an older environment that has the current version of ActivitySim (main@bd48d3db) but sharrow v2.8.2 instead of the previous run's main@8d63a66 (> v2.9.1). Numba was also older: 0.56.4 compared to 0.59.1.

The run results were nearly identical: run time was 1080.3 minutes. log.zip

One difference between this current set of runs and the 7.7 hour run above is the server. The 7.7 hour run was done on SANDAG's 1 TB RAM, 40 core machine. These were done on RSG's 500 GB RAM, 24 core machine.
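To put the gap in perspective, the reported single-process run times can be compared directly. This is a minimal sketch using only the figures quoted in this thread; the `slowdown` helper and the dictionary labels are illustrative names, not part of ActivitySim or sharrow.

```python
# Single-process, sharrow-on run times as reported in this thread.
BASELINE_HOURS = 7.7  # earlier run on the SANDAG machine

# Later runs on the RSG machine, in minutes.
RSG_RUNS_MINUTES = {
    "PR #867 @ c9d4205, sharrow main@8d63a66": 1090.3,
    "main@bd48d3db, sharrow v2.8.2": 1080.3,
}


def slowdown(minutes, baseline_hours=BASELINE_HOURS):
    """Return the run time (given in minutes) as a multiple of the baseline (in hours)."""
    return (minutes / 60.0) / baseline_hours


for label, minutes in RSG_RUNS_MINUTES.items():
    print(f"{label}: {minutes / 60:.1f} h, {slowdown(minutes):.2f}x the baseline")
```

Both RSG runs come out at roughly 2.3 to 2.4 times the 7.7 hour run, which is consistent with the software versions mattering little and the machine being the main remaining difference.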

i-am-sijia commented 5 months ago

The MTC extended model with sharrow, single process, ran in 10.7 hours on WSP's 512 GB RAM AMD server, using the latest versions of everything as of June 26. Memory peaked at 145 GB in trip destination.

ActivitySim: pr/867@c9d4205
Sharrow: v2.10.0
MTC: extended@a3da8bd

mtc extended single process sharrow

activitysim.log timing_log.csv

dhensle commented 4 months ago

Per the discussion at https://github.com/ActivitySim/sandag-abm3-example/issues/6#issuecomment-2195081910, ran a series of runs with different Numba thread counts (i.e. changing only the NUMBA_NUM_THREADS setting):

image

All runs were performed on the same RSG machine with 24 threads.
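For anyone wanting to repeat this kind of sweep, one way to vary only `NUMBA_NUM_THREADS` is to set it in the environment of each child process, since Numba reads the variable when it is imported. The sketch below is an assumption about how such a sweep could be scripted, not the script used for these runs: the helper names, the thread counts in the loop, and the stand-in command (which only echoes the variable back) are all hypothetical. Replace the stand-in with the actual model run command.

```python
import os
import subprocess
import sys


def env_with_numba_threads(n_threads, base_env=None):
    """Return a copy of the environment with NUMBA_NUM_THREADS set to n_threads."""
    env = dict(os.environ if base_env is None else base_env)
    env["NUMBA_NUM_THREADS"] = str(n_threads)
    return env


def run_with_threads(command, n_threads):
    """Run `command` in a child process that sees the given thread count."""
    return subprocess.run(
        command,
        env=env_with_numba_threads(n_threads),
        capture_output=True,
        text=True,
        check=True,
    )


# Hypothetical stand-in for the real model command: it just prints the
# variable so the sketch can be run without ActivitySim installed.
ECHO_COMMAND = [
    sys.executable,
    "-c",
    "import os; print(os.environ['NUMBA_NUM_THREADS'])",
]

if __name__ == "__main__":
    # Illustrative thread counts only; the values actually tested are not
    # listed in the text of this thread.
    for n in (1, 4, 12, 24):
        result = run_with_threads(ECHO_COMMAND, n)
        print(f"requested {n} threads, child saw {result.stdout.strip()}")
```

Launching each run in a fresh process keeps the setting isolated, so one run's thread count cannot leak into the next.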

Some observations:

dhensle commented 4 months ago

Running the same tests as above and on the same machine, but using multiprocessing instead of multi-threading:

image

Comments: