PaulHancock / Aegean

The Aegean source finding program and associated tools
http://aegeantools.rtfd.io/

BANE never finishes #213

Open ywjlee opened 6 months ago

ywjlee commented 6 months ago

I am trying to run BANE on a FITS file (specifically, an ASKAP RACS field image) but it never finishes. I ran it on a MeerKAT image with the same number of cores and it worked perfectly fine. I've attached a screenshot for your reference. Do you have any suggestions for solving this?

[Attached image: Screenshot 2024-05-09 at 6 36 26 PM]

tikk3r commented 3 weeks ago

Same for me. I tried running it on a LOFAR image (20,000 x 20,000 pixels in size) and once the grid becomes sufficiently dense it just seems to hang forever unless I comment out these bits in AegeanTools/BANE.py:

    i = barrier.wait()
    if i == 0:
        barrier.reset()

I don't know if that affects anything though. For coarse grids there doesn't seem to be a difference between the generated rms images, but I can't compare denser grids because of the apparent hanging.

tjgalvin commented 1 week ago

It's probably related to what is described here: https://github.com/PaulHancock/Aegean/issues/198

In short, there is a small bug in the calculation of how many stripes are required. In some instances the result can be higher than the number of cores available, which causes the pool to deadlock.

So long as you set the number of cores to be higher than the number of stripes you should be ok. Both of these are options available on the command line.
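
To illustrate why more stripes than cores can hang at a barrier, here is a minimal sketch. It is not BANE's actual code: it uses threads rather than processes, and `run_stripes`, `n_workers` and `n_stripes` are hypothetical names chosen for the example. The barrier expects one arrival per stripe, but the pool only runs `n_workers` stripes at once, so the remaining stripes never start and the barrier never fills. A timeout is added here so the example reports the stall instead of hanging.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_stripes(n_workers, n_stripes, timeout=1.0):
    """Return True if every stripe passes the barrier, False if the wait stalls.

    Hypothetical helper for illustration only; not part of AegeanTools.
    """
    # The barrier releases only once n_stripes tasks are waiting on it.
    barrier = threading.Barrier(n_stripes)

    def stripe(_):
        try:
            # Without the timeout this call would block forever when
            # n_stripes > n_workers, because the queued stripes cannot start
            # until a running one finishes.
            barrier.wait(timeout=timeout)
            return True
        except threading.BrokenBarrierError:
            # Raised when the wait times out or the barrier is already broken.
            return False

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return all(pool.map(stripe, range(n_stripes)))


if __name__ == "__main__":
    print("4 workers, 4 stripes:", run_stripes(4, 4))
    print("2 workers, 4 stripes:", run_stripes(2, 4))
```

In this sketch the barrier fills as soon as there are at least as many workers as stripes, which matches the idea behind the workaround above: make sure the pool is large enough that every stripe can be running at the same time.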