gwastro / pycbc

Core package to analyze gravitational-wave data, find signals, and study their parameters. This package was used in the first direct detection of gravitational waves (GW150914), and is used in the ongoing analysis of LIGO/Virgo data.
http://pycbc.org
GNU General Public License v3.0

add multiprocessing to pycbc_brute_bank #4803

Closed yi-fan-wang closed 1 week ago

yi-fan-wang commented 2 weeks ago

Standard information about the request

This is a: efficiency update

This change affects: pycbc_brute_bank

This change changes: this only adds a multiprocessing feature to the existing pycbc_brute_bank and wouldn't change any other existing results

This change would simply follow the unit tests designed for pycbc_brute_bank (if there are any)

This change will not break current functionality, require additional dependencies, or require a new release

Motivation

When waveform generation is slow, it becomes the bottleneck for pycbc_brute_bank, so I parallelized the waveform generation. This PR doesn't affect the match computation.

Contents

I added parallelization of the waveform generation in pycbc_brute_bank
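As a rough illustration of the idea (not the code in this PR), the sketch below fans a list of proposed parameter sets out to a pool of worker processes and collects the waveforms in input order. `generate_waveform`, `generate_all`, and the `(mass1, mass2)` tuples are hypothetical stand-ins, and the `num_cores` argument only mirrors the option named in the testing note; none of this is PyCBC API.

```python
import math
from multiprocessing import Pool


def generate_waveform(params):
    """Hypothetical stand-in for a slow waveform generator.

    Takes a (mass1, mass2) tuple and returns a list of 1000 samples.
    """
    mass1, mass2 = params
    total = mass1 + mass2
    return [math.sin(total * k * 0.01) for k in range(1000)]


def generate_all(param_list, num_cores=1):
    """Generate one waveform per parameter set, preserving input order."""
    if num_cores <= 1:
        # Sequential path: one waveform after another.
        return [generate_waveform(p) for p in param_list]
    with Pool(processes=num_cores) as pool:
        # Pool.map distributes the calls and returns results in input order.
        return pool.map(generate_waveform, param_list)


if __name__ == "__main__":
    proposals = [(10.0 + i, 5.0 + i) for i in range(8)]
    serial = generate_all(proposals, num_cores=1)
    parallel = generate_all(proposals, num_cores=4)
    # Both paths should produce identical waveforms.
    print(serial == parallel)
```

Only the generation step is distributed here; anything done with the waveforms afterwards (such as match computation) would still run in the parent process.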

Links to any issues or associated PRs

None

Testing performed

Tested with "--num_cores = 96" and it works fine.

Additional notes

ACKNOWLEDGMENT: with significant technical support from @raffienficiaud

WuShichao commented 2 weeks ago

Thanks for the update. Is it the same as generating subbanks at different parameter ranges simultaneously?

ahnitz commented 2 weeks ago

No, this just parallelizes the waveform generation. Nothing else.

yi-fan-wang commented 2 weeks ago

@WuShichao No, I parallelize the gravitational waveform generation within a single subbank region (in contrast to the sequential waveform generation before this PR).

WuShichao commented 2 weeks ago

I see. I just checked the code. Will the speed grow linearly with the number of cores?

yi-fan-wang commented 2 weeks ago

@WuShichao Yes, if each waveform takes about the same time to generate.
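To illustrate why even generation times matter, here is an idealized scheduling model, not a measurement of pycbc_brute_bank. It assumes each task goes to whichever worker frees up first and ignores process startup and data transfer; the helper names `makespan` and `speedup` are made up for this sketch.

```python
import heapq


def makespan(durations, num_workers):
    """Wall time when each task goes to the worker that frees up first."""
    finish_times = [0.0] * num_workers
    heapq.heapify(finish_times)
    for duration in durations:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)


def speedup(durations, num_workers):
    """Ratio of sequential time to parallel wall time."""
    return sum(durations) / makespan(durations, num_workers)


even = [1.0] * 8             # eight waveforms of equal cost
uneven = [1.0] * 7 + [9.0]   # one slow waveform arriving last

print(speedup(even, 4))    # 4.0
print(speedup(uneven, 4))  # 1.6
```

With equal costs the four workers give the full factor of four; a single slow waveform caps the gain, because the wall time can never be shorter than the longest task.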

spxiwh commented 2 weeks ago

@yi-fan-wang @ahnitz Out of curiosity, does this start to become limited by communication between the processes? (I think that you're passing around frequency series.) While perhaps not worth the extra effort, would multi-threading be more efficient here? (I could see a Cython/C module that bypasses the GIL making multi-threading work here.)
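One way to get a feel for the transfer cost in question is to time the serialization round trip that multiprocessing performs when a result is sent back from a worker. The sketch below is an assumption-laden stand-in: it uses a plain `array` of interleaved real/imaginary float64 values in place of an actual frequency series object, and `pickled_size_and_time` is a hypothetical helper.

```python
import pickle
import time
from array import array


def pickled_size_and_time(num_samples):
    """Return (bytes, seconds) to pickle and unpickle a complex series.

    The series is stored as interleaved real/imag float64 values, so it
    occupies 16 bytes per complex sample before any pickle overhead.
    """
    series = array("d", [0.0] * (2 * num_samples))
    start = time.perf_counter()
    payload = pickle.dumps(series, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(payload)
    elapsed = time.perf_counter() - start
    assert restored == series  # the round trip must be lossless
    return len(payload), elapsed


if __name__ == "__main__":
    size, seconds = pickled_size_and_time(1_000_000)
    print(f"{size / 1e6:.1f} MB serialized in {seconds * 1e3:.2f} ms")
```

Comparing this round-trip time with the time to generate one waveform indicates whether communication is likely to matter for a given approximant.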

ahnitz commented 2 weeks ago

@spxiwh I wouldn't think so. In any case, at the moment this only helps when the waveform generation is very slow, since the matches aren't parallelized here.

ahnitz commented 2 weeks ago

@yi-fan-wang Review is contingent on you kicking the CI and everything passing there.