gwastro / pycbc

Core package to analyze gravitational-wave data, find signals, and study their parameters. This package was used in the first direct detection of gravitational waves (GW150914), and is used in the ongoing analysis of LIGO/Virgo data.
http://pycbc.org
GNU General Public License v3.0

pycbc.events.veto.indices_within_times can blow up RAM usage #125

Closed spxiwh closed 9 years ago

spxiwh commented 9 years ago

The function pycbc.events.veto.indices_within_times can explode in RAM usage in cases where large numbers of triggers are present.

Specifically, the issue is that (in pycbc_coinc_statmap) it is testing for trigger indices around a set of foreground triggers. It does not ensure that the indices returned are unique, so when you have a large number of foreground triggers you get a lot of indices, each appearing many times, leading to an N**2 blow-up in RAM. Somehow this function should include a uniqueness test when generating the list of integer indices. Locally I am using something like:

chunk_size = 100000
stacked = numpy.array([], dtype=int)
# round up so the final partial chunk is not skipped
n_chunks = (len(left) + chunk_size - 1) // chunk_size
for i in range(n_chunks):
    start_idx = i * chunk_size
    end_idx = min((i + 1) * chunk_size, len(left))
    curr_left = left[start_idx:end_idx]
    curr_right = right[start_idx:end_idx]
    curr_stacked = numpy.hstack([numpy.r_[s:e]
                                 for s, e in zip(curr_left, curr_right)])
    # deduplicate as we go, so the working set never holds all repeats
    stacked = numpy.unique(numpy.hstack([stacked, curr_stacked]))
return tsort[stacked]
# The original code
# return tsort[numpy.hstack(numpy.r_[s:e] for s, e in zip(left, right))]

But this is hardcoding a value of 100000 that is good for my use-case, but maybe not good always. I don't know a magic numpy way of including uniqueness in the list comprehension.
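One way to get uniqueness without chunking or a hardcoded block size (a sketch only, not what the thread adopted; the helper name and the boolean-mask trick are my own) is to mark covered indices in a mask whose length is the number of triggers, so memory scales with the trigger count rather than with how much the windows overlap:

```python
import numpy as np

def unique_indices_within(left, right, n):
    """Return the sorted unique indices covered by the half-open
    windows [left[i], right[i]), for indices in [0, n).

    Memory is O(n) no matter how heavily the windows overlap."""
    mask = np.zeros(n, dtype=bool)
    for s, e in zip(left, right):
        mask[s:e] = True          # repeats just re-set the same bits
    return np.flatnonzero(mask)   # sorted unique covered indices
```

With the names used above this would be applied as `tsort[unique_indices_within(left, right, len(tsort))]`.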

And a gripe. We are doing a lot of optimizing of the inspiral code to make it run faster. But I have never really been hit by run-time issues since the hard work to make the chisq really fast and to optimize the main FFT computation; we are all managing to complete ER7 runs quickly on different clusters. However, I have hit memory issues on a number of occasions. I think our optimization tests need to also consider and balance RAM usage. For example, a clustering algorithm that is really fast but keeps more triggers than the algorithm used in coh_PTF is probably going to be an issue for RAM usage before the speedup is useful.

ahnitz commented 9 years ago

Ian,

I think you are right on what the issue is here, although I think there is a better solution. In the case the function was originally designed for, i.e. where the start/end vectors define distinct segment boundaries, you really can't reduce the memory usage much, and the indices are guaranteed to be unique. As such, I think the solution is to coalesce the boundaries given in the start/end vectors before the rest of the function runs.
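A minimal sketch of that coalescing step (the helper name is hypothetical; it assumes non-empty start/end vectors describing half-open windows [start, end)): merge overlapping or touching windows into a disjoint set, after which expanding them yields each index exactly once.

```python
import numpy as np

def coalesce(left, right):
    """Merge overlapping or adjacent half-open windows [left[i], right[i])
    into a disjoint, sorted set of windows."""
    order = np.argsort(left)
    left, right = left[order], right[order]
    merged_l, merged_r = [left[0]], [right[0]]
    for s, e in zip(left[1:], right[1:]):
        if s <= merged_r[-1]:
            # overlaps or touches the previous window: extend it
            merged_r[-1] = max(merged_r[-1], e)
        else:
            merged_l.append(s)
            merged_r.append(e)
    return np.array(merged_l), np.array(merged_r)
```

Expanding the coalesced windows with the original `numpy.r_`/`hstack` pattern then produces unique indices by construction, with no deduplication pass needed.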

ahnitz commented 9 years ago

Ian,

Out of curiosity, any idea off hand how much memory you were seeing used?

spxiwh commented 9 years ago

Hi Alex,

The job failed at 250GB when it hit the ulimit of the node it was running on.

This is an MDC BBH analysis with the SNR threshold set to 5.

Cheers Ian


spxiwh commented 9 years ago

I've just found a second issue in here. The hardware injections have not been removed because the veto level being supplied by pycbc_make_coinc_workflow is not correct and so no vetoes are applied.

This would exacerbate this problem as there will be a lot of coincident triggers around hardware injections and all of them loud.

Cheers Ian


ahnitz commented 9 years ago

Wait, is vetoes not being applied a bug somewhere, or just what you happened to do that exhibited this behavior? In any case, you shouldn't be required to veto hardware injections just to have sane memory usage. I'm testing a patch at the moment, and will get back to you.

spxiwh commented 9 years ago

Hey Alex,

Yes, there is a bug in pycbc_make_coinc_workflow causing vetoes not to be applied at the moment. This seems to be a very recently introduced bug. As most people are using the HDF workflow this probably only affects me, but I have a patch; I'll test that and then fix this, although I need to make pycbc_make_coinc_workflow work with the updated segment stuff soon, which should avoid this kind of issue going forward. As this is the "reviewed analysis" we do need to be careful to ensure that it is working.

But as looking at results without hardware injections is a genuine use-case, and your suggestion about coalescing the time boundaries seems to be the sensible solution, it would be good to fix the offending function in veto.py.

It would be good to have the optimization team run with large RAM usage workflows like this one so that we really are improving things for O2.

Cheers Ian


ahnitz commented 9 years ago

Ok, I've posted the quick fix I mentioned as #132. Let me know if this helps you and if so how much it reduces the memory. I don't have a really great test case on hand, but I could verify that the results ran correctly.

Let's keep this issue open though, as I suspect there may also be memory issues in statmap_inj, which does not use this function, and I'm not sure how to address them at the moment.

spxiwh commented 9 years ago

Closing this as #132 is merged.