RadioAstronomySoftwareGroup / pyuvsim

An ultra-high precision package for simulating radio interferometers in Python on compute clusters.
https://pyuvsim.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Fix MPI tests breaking on some builds #426

Closed · bhazelton closed this pull request 1 year ago

bhazelton commented 1 year ago

Description

The main problem we were seeing in some CI builds (and on my machine) seems to be related to shared memory resources not being properly freed. This PR adds code to keep track of the shared memory resources so they can be freed at the end of the simulation (or if an error is raised).
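As a rough illustration of the pattern described here, a minimal sketch with made-up names (`create_shared_array`, `free_shared_windows`), not the code added in this PR: the module keeps a list of the allocated shared-memory windows, and a `finally` block frees them whether the simulation finishes normally or raises.

```python
# Minimal sketch of the tracking/freeing pattern (hypothetical names,
# not the actual pyuvsim implementation).
import numpy as np
from mpi4py import MPI

_shared_windows = []  # module-level registry of open shared-memory windows


def create_shared_array(shape, dtype=np.float64):
    """Allocate a node-shared array and record its window so it can be freed."""
    node_comm = MPI.COMM_WORLD.Split_type(MPI.COMM_TYPE_SHARED)
    itemsize = np.dtype(dtype).itemsize
    nbytes = int(np.prod(shape)) * itemsize if node_comm.rank == 0 else 0
    win = MPI.Win.Allocate_shared(nbytes, itemsize, comm=node_comm)
    buf, _ = win.Shared_query(0)  # every rank maps rank 0's buffer
    _shared_windows.append(win)
    return np.ndarray(buffer=buf, dtype=dtype, shape=shape)


def free_shared_windows():
    """Free every tracked window (Win.Free is a collective call)."""
    while _shared_windows:
        _shared_windows.pop().Free()


try:
    arr = create_shared_array((512, 512))
    # ... run the simulation using ``arr`` ...
finally:
    free_shared_windows()  # runs even if the simulation raised
```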

Other minor fixes:

Motivation and Context

Some CI builds were breaking with MPI related errors.


codecov[bot] commented 1 year ago

Codecov Report

Patch coverage is 100.00% and project coverage changes by +0.04%.

Comparison: base (83fa24a) at 99.45% vs. head (584a7a8) at 99.49%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #426      +/-   ##
==========================================
+ Coverage   99.45%   99.49%   +0.04%
==========================================
  Files          12       12
  Lines        2201     2198       -3
==========================================
- Hits         2189     2187       -2
+ Misses         12       11       -1
```

| [Impacted Files](https://app.codecov.io/gh/RadioAstronomySoftwareGroup/pyuvsim/pull/426?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=RadioAstronomySoftwareGroup) | Coverage Δ |
|---|---|
| [pyuvsim/mpi.py](https://app.codecov.io/gh/RadioAstronomySoftwareGroup/pyuvsim/pull/426?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=RadioAstronomySoftwareGroup#diff-cHl1dnNpbS9tcGkucHk=) | `100.00% <100.00%> (ø)` |
| [pyuvsim/uvsim.py](https://app.codecov.io/gh/RadioAstronomySoftwareGroup/pyuvsim/pull/426?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=RadioAstronomySoftwareGroup#diff-cHl1dnNpbS91dnNpbS5weQ==) | `100.00% <100.00%> (ø)` |

... and [1 file with indirect coverage changes](https://app.codecov.io/gh/RadioAstronomySoftwareGroup/pyuvsim/pull/426/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=RadioAstronomySoftwareGroup)


steven-murray commented 1 year ago

@bhazelton overall the code looks good; however, I'd make a suggestion similar to @aelanman's: I don't think the code that does the MPI sharing should be attached to any particular class that is being shared. To me, separation of concerns (or the single responsibility principle) would dictate making a class/function in the mpi module that can share an arbitrary object and that also maintains an internal "global" window list that can be released. Having all the MPI code inside the SkyModel and BeamList classes makes them less usable by other code.

bhazelton commented 1 year ago

I'm realizing that I don't understand how MPI knows which shared array is which. It seems like it's tied to the objects, so I'm not sure the sharing can be moved out of the objects without breaking things. The freeing can be moved into a function as suggested in @aelanman's point 1. But I think you're asking for something more, @steven-murray: you want the sharing moved out of the objects as well?

steven-murray commented 1 year ago

@bhazelton yeah, I'd probably have to look at the details more closely to understand what's really possible. In my mind the ideal situation would be an API like the following:

```python
from pyuvsim import mpi, SkyModel, BeamList

sky = SkyModel()
beams = BeamList()

mpi.share(sky, beams)
do_stuff(sky, beams)  # do_stuff() stands in for the simulation work
mpi.free(sky, beams)
```

It might not be possible to do something like that -- an alternative might be to provide a key:

```python
from pyuvsim import mpi, SkyModel, BeamList

sky = SkyModel()
beams = BeamList()

mpi.share('sky_key', sky)
mpi.share('beam_key', beams)

do_stuff(sky, beams)
mpi.free('sky_key', 'beam_key')
```

One could imagine having an mpi.free_all() function as well.
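A hedged sketch of what such a keyed interface could look like (purely hypothetical, not the API this PR implements; for brevity it shares a single numpy array rather than a full SkyModel or BeamList, and it assumes the communicator passed in is node-local):

```python
# Hypothetical keyed share/free interface; not pyuvsim's actual mpi module.
from collections import defaultdict

import numpy as np
from mpi4py import MPI

_window_registry = defaultdict(list)  # key -> list of MPI windows backing it


def share(key, arr, comm):
    """Put ``arr`` into node-shared memory, recording its window under ``key``."""
    itemsize = arr.dtype.itemsize
    nbytes = arr.nbytes if comm.rank == 0 else 0
    win = MPI.Win.Allocate_shared(nbytes, itemsize, comm=comm)
    buf, _ = win.Shared_query(0)
    shared = np.ndarray(buffer=buf, dtype=arr.dtype, shape=arr.shape)
    if comm.rank == 0:
        shared[...] = arr  # fill the shared buffer once
    comm.Barrier()
    _window_registry[key].append(win)
    return shared


def free(*keys):
    """Free the windows registered under the given keys (collective call)."""
    for key in keys:
        for win in _window_registry.pop(key, []):
            win.Free()


def free_all():
    """Free every window the module is still tracking."""
    free(*list(_window_registry))
```

The dict only adds names on top of the same window bookkeeping, so a `free_all()` falls out of it almost for free.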

bhazelton commented 1 year ago

@aelanman @mkolopanis I think you two are best placed to tell us what is most likely to work here. I really don't understand how the different shared arrays are being kept track of in MPI...

aelanman commented 1 year ago

Maybe this is bad form, but could you have a module-level list that gets appended to every time a new window is created, then register a cleanup function at exit? e.g., in mpi.py:

```python
import atexit

import numpy as np
from mpi4py import MPI

_OPEN_WINDOWS = []


def shared_mem_bcast():
    ...  # setup elided in this sketch (nbytes, itemsize, dtype, shape, node_comm)
    win = MPI.Win.Allocate_shared(nbytes, itemsize, comm=node_comm)
    buf, itemsize = win.Shared_query(0)
    sh_arr = np.ndarray(buffer=buf, dtype=dtype, shape=shape)
    _OPEN_WINDOWS.append(win)  # record the window so it can be freed later
    ...


def _window_cleaning():
    # free every window that was recorded during the run
    for win in _OPEN_WINDOWS:
        win.Free()


atexit.register(_window_cleaning)
```

This leaves the responsibility of tracking open MPI windows to the mpi module, so the BeamList and SkyModelData classes don't need any modification.

steven-murray commented 1 year ago

@aelanman's suggestion is approximately what I was thinking of. The only catch would be if different lists of windows need to be freed at different times (e.g., if you had two simulations running, or something like that); then you'd need a way to keep track of which sub-list belongs to which simulation. If all you need to do is make sure everything is freed before exiting Python (and you don't care exactly when, so long as it's after the computation is done), then @aelanman's suggestion works perfectly.
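If per-simulation sub-lists ever became necessary, one hypothetical way to get them (again, just an illustration, not something this PR adds) would be a small tracker object owned by each simulation:

```python
# Hypothetical per-simulation window tracker; not real pyuvsim code.
class SharedWindowTracker:
    def __init__(self):
        self._windows = []

    def register(self, win):
        """Record a window created for this simulation."""
        self._windows.append(win)
        return win

    def free(self):
        """Free only the windows belonging to this simulation."""
        while self._windows:
            self._windows.pop().Free()


sim_a_windows = SharedWindowTracker()
sim_b_windows = SharedWindowTracker()
# Windows created for each simulation get registered on its own tracker, so
# sim_a_windows.free() and sim_b_windows.free() can run at different times.
```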

mkolopanis commented 1 year ago

Yeah, this would definitely be another way to do it. I don't suspect we have much of a user base that runs simulations over and over in, say, a Jupyter notebook. I'm personally more inclined to clean up resources as soon as you're done using them, but either way should work fine.
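For the "clean up as soon as you're done" style, a hedged sketch using a context manager (illustrative only; `shared_windows` is not a real pyuvsim helper):

```python
# Hypothetical helper that frees the tracked MPI windows as soon as the
# block using the shared data finishes, whether or not it raised.
import contextlib


@contextlib.contextmanager
def shared_windows(windows):
    """Yield the window list, then free every window when the block exits."""
    try:
        yield windows
    finally:
        while windows:
            windows.pop().Free()


# usage:
# with shared_windows(list_of_wins) as wins:
#     ...run the simulation using the shared arrays...
# the windows are freed here, even if the simulation errored out
```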

bhazelton commented 1 year ago

OK, we resolved this as a group on the telecon: the shared window tracking moves to the mpi module, and the windows are freed from there with an atexit cleanup call.