Open dhensle opened 7 months ago
First ran sharrow compile with the following settings:
households_sample_size: 100
sharrow: test
Run completed in 76 minutes. log_sh_compile.zip
Then ran in production mode
households_sample_size: 0  # 100 percent sample
sharrow: require
Run completed in 7.7 hours with a memory peak at about 163 GB in trip destination.
Followed by multiprocessing
households_sample_size: 0  # 100 percent sample
sharrow: require
multiprocessing: True
num_processors: 24
Run completed in 110 minutes (1.8 hours). log_sh_full_mp.zip
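To put the two full-sample timings side by side, here is a quick back-of-the-envelope calculation. It assumes the single-process and multiprocess runs above were made on the same machine with otherwise identical settings, which the run descriptions suggest but do not state outright. The variable names are purely illustrative.

```python
# Figures quoted in the runs above (100 percent sample, sharrow: require).
single_process_minutes = 7.7 * 60   # 7.7 hours, single process
multiprocess_minutes = 110.0        # multiprocessing run
num_processors = 24                 # num_processors setting for that run

# Speedup: how many times faster the multiprocess run finished.
speedup = single_process_minutes / multiprocess_minutes

# Parallel efficiency: speedup as a fraction of the ideal (one per process).
efficiency = speedup / num_processors

print(f"speedup: {speedup:.1f}x")
print(f"parallel efficiency: {efficiency:.1%}")
```

The speedup works out to roughly 4x on 24 processes, so a good share of the runtime does not scale with the number of processes.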
Ran with 100% households and sharrow on, single process.
Run completed in 1090.3 minutes (18.2 hours). This is much longer than the previous time posted above of 7.7 hours.
Current run was performed using PR #867 commit c9d4205.
Timing statements comparing the old run above to this current run show large differences mainly in the destination models:
Will try again with the main branch of ActivitySim instead of PR 867 to see if that makes a difference.
Ran using an older environment that has the current version of ActivitySim (main@bd48d3db) but sharrow v2.8.2, instead of the previous run's sharrow main@8d63a66 (> v2.9.1). Numba was also older: 0.56.4 compared to 0.59.1.
The run results were pretty much exactly the same -- run time was 1080.3 minutes. log.zip
One difference between this current set of runs and the 7.7 hour run above is the server. The 7.7 hour run was done on SANDAG's 1 TB RAM, 40-core machine. These were done on RSG's 500 GB RAM, 24-core machine.
The MTC extended model, with sharrow and a single process, ran in 10.7 hours on WSP's 512 GB RAM AMD server, using the latest versions of everything as of June 26. Memory peaked at 145 GB in trip destination.
ActivitySim: pr/867@c9d4205 Sharrow: v2.10.0 MTC: extended@a3da8bd
Per the discussion at https://github.com/ActivitySim/sandag-abm3-example/issues/6#issuecomment-2195081910, ran many runs with different Numba multithreading settings (i.e. changing only the NUMBA_NUM_THREADS setting):
All runs were performed on the same RSG machine with 24 threads.
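For anyone wanting to repeat this kind of sweep, here is a minimal sketch of one way to vary only NUMBA_NUM_THREADS between runs. It is not the script used for the runs above. The helper names, the thread counts, and the placeholder command are all hypothetical. Each run gets its own process so the variable is already set when Numba is imported.

```python
import os
import subprocess
import sys


def env_with_numba_threads(num_threads):
    """Return a copy of the current environment with NUMBA_NUM_THREADS set."""
    env = os.environ.copy()
    env["NUMBA_NUM_THREADS"] = str(num_threads)
    return env


def run_with_threads(command, num_threads):
    """Run `command` in a fresh process with the given thread setting."""
    return subprocess.run(
        command,
        env=env_with_numba_threads(num_threads),
        capture_output=True,
        text=True,
        check=True,
    )


# Placeholder command (hypothetical): it only echoes the variable back.
# Substitute the actual model invocation used for the timing runs.
PLACEHOLDER_COMMAND = [
    sys.executable,
    "-c",
    "import os; print(os.environ['NUMBA_NUM_THREADS'])",
]

# Illustrative thread counts only; not the values tested in this thread.
for n in (1, 4, 24):
    result = run_with_threads(PLACEHOLDER_COMMAND, n)
    print(n, "->", result.stdout.strip())
```

Launching a separate process per setting keeps everything else constant, so any timing difference can be attributed to the thread count alone.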
Some observations:
Running the same tests as above and on the same machine, but using multiprocessing instead of multi-threading:
Comments:
This is the issue to report on memory usage and runtime performance when using sharrow...