IN-CORE / pyincore

pyIncore is a component of IN-CORE. It is a Python package consisting of two primary components: 1) a set of service classes to interact with the IN-CORE web services, and 2) IN-CORE analyses. pyIncore allows users to apply various hazards to infrastructure in selected areas, propagating the effect of physical infrastructure damage and loss of functionality to social and economic impacts.
Mozilla Public License 2.0

493 investigate in building damage performance issue #494

Closed longshuicy closed 8 months ago

longshuicy commented 9 months ago

I realized we were looping through all the buildings redundantly. This change takes out one layer of looping when matching inventory with fragilities. It is still not very fast, though.
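To illustrate the kind of change described above, here is a minimal sketch, not the actual pyincore code. The building records, the `fragility_ids` mapping, and both function names are hypothetical. It contrasts a nested loop that rescans every building for each mapping entry with a single pass that uses a dictionary lookup.

```python
# Hypothetical inventory records; not the real pyincore data structures.
buildings = [
    {"guid": "b1", "struct_typ": "W1"},
    {"guid": "b2", "struct_typ": "C1"},
    {"guid": "b3", "struct_typ": "W1"},
]

# Hypothetical mapping result: building guid -> fragility id.
fragility_ids = {"b1": "frag-wood", "b2": "frag-concrete", "b3": "frag-wood"}


def match_nested(buildings, fragility_ids):
    """Redundant version: scans all buildings for every mapping entry."""
    matched = []
    for guid, frag_id in fragility_ids.items():
        for bldg in buildings:
            if bldg["guid"] == guid:
                matched.append((bldg["guid"], frag_id))
    return matched


def match_single_pass(buildings, fragility_ids):
    """One pass over the buildings, with a dictionary lookup per building."""
    return [
        (bldg["guid"], fragility_ids[bldg["guid"]])
        for bldg in buildings
        if bldg["guid"] in fragility_ids
    ]
```

Both functions return the same pairs for this toy input. The single-pass version does work proportional to the number of buildings rather than to the product of buildings and mapping entries.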

Some benchmarks (before/after):

Memphis Earthquake damage run time: 4.497807025909424 (before), 3.479637861251831 (after)
Seaside Tsunami damage run time: 12.431647062301636 (before), 11.918556928634644 (after)
Galveston Hurricane damage run time: 85.12275385856628 (before), 83.93139791488647 (after)
Lumberton Flood damage run time: 38.145187854766846 (before), 38.67339634895325 (after)
Joplin Tornado damage run time: 21.523848295211792 (before), 21.288371086120605 (after)
SLC Earthquake damage run time: 561.6372690200806 (before), 498.26733016967773 (after)

longshuicy commented 9 months ago

Other ways to improve performance include a MapReduce approach; GeoPandas also offers a similar concept (apply with a lambda).
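As a rough sketch of the map/reduce idea using only the Python standard library: the `damage_state` function, the `capacity` and `hazard_value` fields, and the sample records are all hypothetical stand-ins, not pyincore's fragility evaluation. With pandas or GeoPandas, the map step would correspond to something like `DataFrame.apply(func, axis=1)`.

```python
from functools import reduce

# Hypothetical per-building inputs; a stand-in for real hazard and fragility data.
buildings = [
    {"guid": "b1", "hazard_value": 0.4, "capacity": 0.3},
    {"guid": "b2", "hazard_value": 0.1, "capacity": 0.3},
    {"guid": "b3", "hazard_value": 0.5, "capacity": 0.3},
]


def damage_state(building):
    """Map step: classify one building independently of all the others."""
    if building["hazard_value"] > building["capacity"]:
        return "damaged"
    return "intact"


def count_states(counts, state):
    """Reduce step: fold one result into the running tally."""
    counts[state] = counts.get(state, 0) + 1
    return counts


states = list(map(damage_state, buildings))
summary = reduce(count_states, states, {})
```

Because each building is evaluated independently in the map step, that step could in principle be distributed across workers; whether that helps in practice would need to be measured.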

navarroc commented 9 months ago

@longshuicy I don't see the speedup that you reported with the changes in this branch. When I ran SLC with the main branch and 4 CPUs on my PC, I got a runtime of about 6 minutes. Using the same input parameters with your branch, I got a runtime of about 9 minutes. I ran both setups (main vs. your branch) multiple times to make sure a one-off wasn't skewing the results. I consistently get faster runs with the code that is in main. I'm not sure how to explain the difference.
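For reference, a comparison like this can be repeated with a small timing harness. This is only a sketch: `time_runs` and `fake_analysis` are hypothetical names, and `fake_analysis` is a placeholder workload rather than a real building damage run.

```python
import statistics
import time


def time_runs(func, repeats=3):
    """Call func several times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def fake_analysis():
    """Placeholder workload standing in for a damage analysis run."""
    sum(i * i for i in range(100_000))


durations = time_runs(fake_analysis, repeats=5)
print(f"median: {statistics.median(durations):.4f} s, min: {min(durations):.4f} s")
```

Reporting the median and minimum over several runs makes it easier to tell a real difference between branches from run-to-run noise.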

longshuicy commented 9 months ago

That was before I further converted this branch to use geopandas, which actually slows everything down. If you revert one commit, you will see performance similar to what I wrote down.

longshuicy commented 8 months ago

Closing this PR since we shall optimize in another way.