gem / oq-engine

OpenQuake Engine: a software for Seismic Hazard and Risk Analysis
https://github.com/gem/oq-engine/#openquake-engine
GNU Affero General Public License v3.0

sometimes disaggregation is much slower if epsilon_star = True #8346

Closed kejohnso closed 1 year ago

kejohnso commented 1 year ago

For disaggregation of single sites by MDE, the calculation is sometimes significantly slower when using epsilon_star=True. The table below shows the time required to run the same job on three different computers, together with the oq version used, with and without epsilon_star. When not all sources were used, the percentage of sources was controlled using OQ_SOURCES_SAMPLE=X oq engine --run .... The reported times are those printed in the console, but we confirmed from the performance reports that the difference arises in the disaggregation step.

[image: table of run times per computer and oq version, with and without epsilon_star]

When the job is reduced, the overall time difference becomes smaller but is still clearly visible in the total compute_disagg row. For example, in the smallest test case:

**epsilon_star = True**
+------------------------------+-----------+-----------+--------+
| calc_5553, maxmem=0.7 GB     | time_sec  | memory_mb | counts |
+------------------------------+-----------+-----------+--------+
| DisaggregationCalculator.run | 37.7      | 336.9     | 1      |
+------------------------------+-----------+-----------+--------+
| ClassicalCalculator.run      | 29.9      | 356.9     | 1      |
+------------------------------+-----------+-----------+--------+
...
+------------------------------+-----------+-----------+--------+
| total compute_disagg         | 9.47083   | 1.38086   | 15     |

**without epsilon_star**
+------------------------------+-----------+-----------+--------+
| calc_5552, maxmem=0.6 GB     | time_sec  | memory_mb | counts |
+------------------------------+-----------+-----------+--------+
| DisaggregationCalculator.run | 32.5      | 285.8     | 1      |
+------------------------------+-----------+-----------+--------+
| ClassicalCalculator.run      | 31.2      | 305.5     | 1      |
+------------------------------+-----------+-----------+--------+
... 
+------------------------------+-----------+-----------+--------+
| total compute_disagg         | 1.37318   | 1.20703   | 15     |

I will share job files separately.

micheles commented 1 year ago

Here is how to profile a disaggregation calculation.

  1. Run oq run job.ini -p calculation_mode=classical and write down the calculation ID (say it is 1234).
  2. Run oq run job.ini -p epsilon_star=true --hc 1234 -c0 -s100 > true.txt
  3. Run oq run job.ini -p epsilon_star=false --hc 1234 -c0 -s100 > false.txt
  4. Compare the profiler information between true.txt and false.txt

By doing that I see that all the time is spent in the scipy function _truncnorm_sf_scalar. It could be that _truncnorm_sf_scalar has special optimizations for macOS. Perhaps we could rewrite _truncnorm_sf_scalar using numba (we have a similar function called truncnorm_sf in https://github.com/gem/oq-engine/blob/master/openquake/hazardlib/stats.py#L38), ~but frankly it is a lot of work~. I would do nothing for the moment, but I am pretty sure I can do better than scipy (I looked at the code: that part is pure Python and does not use any vectorization at all).
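
To illustrate the kind of vectorization meant here, the following is a minimal sketch of a survival function for a standard normal truncated to [a, b], built on scipy.special.ndtr and checked against scipy.stats.truncnorm.sf. The name truncnorm_sf_vectorized is hypothetical; this is neither the engine's truncnorm_sf nor scipy's internal code, and it ignores the extra care a production version would need for precision in the far tails.

```python
import numpy as np
from scipy.special import ndtr          # vectorized standard normal CDF
from scipy.stats import truncnorm       # reference implementation


def truncnorm_sf_vectorized(x, a, b):
    """Survival function of a standard normal truncated to [a, b].

    Hypothetical helper for illustration only. Works on scalars or arrays.
    """
    # Clipping maps x < a to sf = 1 and x > b to sf = 0.
    x = np.clip(np.asarray(x, dtype=float), a, b)
    return (ndtr(b) - ndtr(x)) / (ndtr(b) - ndtr(a))


# Compare with scipy on a small grid, truncation level 3 (assumed value).
xs = np.linspace(-4.0, 4.0, 9)
fast = truncnorm_sf_vectorized(xs, -3.0, 3.0)
ref = truncnorm.sf(xs, -3.0, 3.0)
assert np.allclose(fast, ref)
```

The whole array is evaluated in a single call, instead of one Python-level call per value. Whether this recovers the time lost with epsilon_star=true would still have to be confirmed with the profiling steps above.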