Here is how to profile a disaggregation calculation. First run the classical calculation:

```
oq run job.ini -p calculation_mode=classical
```

and write down the calculation ID (say it is 1234). Then run the disaggregation twice, starting from that hazard calculation, with and without `epsilon_star`:

```
oq run job.ini -p epsilon_star=true --hc 1234 -c0 -s100 > true.txt
oq run job.ini -p epsilon_star=false --hc 1234 -c0 -s100 > false.txt
```

and compare the profiling output in true.txt and false.txt.
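As a side note, the same kind of hotspot hunting can be done on any Python call with the standard library profiler; this is a generic cProfile/pstats sketch, not the engine's own profiling mechanism:

```python
# Generic hotspot profiling with the standard library (not oq-specific).
import cProfile
import pstats

def work():
    # placeholder for whatever call you want to profile
    return sum(i * i for i in range(10**6))

cProfile.run("work()", "prof.out")               # write raw profile data to a file
stats = pstats.Stats("prof.out")
stats.sort_stats("cumulative").print_stats(10)   # print the 10 biggest contributors
```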
By profiling the two runs above, I see that all the time is spent in the scipy function `_truncnorm_sf_scalar`. It could be that `_truncnorm_sf_scalar` has special optimizations for macOS. Perhaps we could rewrite it using numba (we already have a similar function called `truncnorm_sf` in https://github.com/gem/oq-engine/blob/master/openquake/hazardlib/stats.py#L38), ~~but frankly it is a lot of work~~. I would do nothing for the moment, but I am pretty sure I can do better than scipy (I looked at the code: that part is pure Python and does not use any vectorization at all).
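For reference, here is a minimal numba sketch of what such a rewrite could look like. It assumes a standard normal truncated symmetrically at ±`truncation_level` and a 1-D input array; the names and signature are hypothetical and need not match the actual `truncnorm_sf` in hazardlib:

```python
# Minimal numba sketch of a truncated-normal survival function.
# Assumes a standard normal truncated symmetrically at +/- truncation_level
# and a 1-D float array of values; names and signature are hypothetical.
import math
import numpy as np
from numba import njit

@njit
def _phi(z):
    # CDF of the standard normal, via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

@njit
def truncnorm_sf_numba(truncation_level, values):
    # survival function P(X > x) for X ~ N(0, 1) truncated to
    # [-truncation_level, +truncation_level], evaluated element-wise
    out = np.empty_like(values)
    phi_hi = _phi(truncation_level)
    phi_lo = _phi(-truncation_level)
    norm = phi_hi - phi_lo
    for i in range(values.shape[0]):
        x = values[i]
        if x <= -truncation_level:
            out[i] = 1.0
        elif x >= truncation_level:
            out[i] = 0.0
        else:
            out[i] = (phi_hi - _phi(x)) / norm
    return out

# example usage: vals = np.linspace(-4.0, 4.0, 9); truncnorm_sf_numba(3.0, vals)
```

Since the inner loop compiles to machine code, the per-element Python overhead of the pure-Python scipy path would go away.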
For disaggregation of single sites by MDE, the calculation is sometimes significantly slower when using `epsilon_star=True`. The table below shows the time required to run the same job on three different computers and oq versions, with and without `epsilon_star`. When not all sources were used, the percentage of sources was controlled with `OQ_SOURCES_SAMPLE=X oq engine --run ...`. The reported times come from what is printed in the console, but from the reports we confirmed that the difference occurs in the disaggregation phase. When the job is reduced, the overall time difference becomes smaller but is still distinct in the `total compute_disagg` time. For example, in the smallest test case:

I will share the job files separately.
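For concreteness, this is roughly what one reduced timing run looks like as a shell command, based only on the invocation quoted above; the `job.ini` name is a placeholder and `epsilon_star` is assumed to be toggled inside it:

```sh
# time one reduced run; X is the percentage of sources to sample,
# epsilon_star = true/false is assumed to be set in job.ini
OQ_SOURCES_SAMPLE=X oq engine --run job.ini
```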