eqcorrscan / EQcorrscan

Earthquake detection and analysis in Python.
https://eqcorrscan.readthedocs.io/en/latest/
Other
166 stars 86 forks source link

Speedup 01 Quicker grouping of templates #524

Closed flixha closed 1 year ago

flixha commented 1 year ago

What does this PR do?

Gives 50x speed up for grouping templates.

Why was it initiated? Any relevant Issues?

Grouping many templates was slow because the function was using a double loop for template-parameter comparison (O^2). So grouping took ~20 s for ~2000 templates. Sped up single loop to <0.5 s.

The new, quick function now circumvents the repeated calls to template.same_processing that contribute to the slowdown. So if we change what template.same_processing does (i.e., which dict-entries it compares), then we'd also need to change it here. Left in the old function for testing for now.

This PR contributes to the summary issue in https://github.com/eqcorrscan/EQcorrscan/issues/522

PR Checklist

flixha commented 1 year ago

NCEDC - FDSN not answering properly in tests..

calum-chamberlain commented 1 year ago

Looks good to me @flixha - point noted on the template.same_processing. How about we make a template._processing_parameters property and use that instead of the:

if key in ["name", "st", "prepick", "event", "template_info"]:
    continue
else:
    actually compare

if logic in template.same_processing? Then your processing_tuples could just access those defined params. Any changes to the params compared in template.same_processing would then have to be made to the property and hence to your func?

flixha commented 1 year ago

template._processing_parameters property and use that instead of the:

Very good suggestion. Now I defined the _processing_parameters as function / property for a Template, which helps to simplify the code at three places so that it's simpler to update any changes to what we consider processing parameters.

let me know if this is not as you were suggesting..