RadioAstronomySoftwareGroup / pyuvsim

An ultra-high precision package for simulating radio interferometers in python on compute clusters.
https://pyuvsim.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

No mpirun warning outputs when essential quantities are not shared. #433

Closed: nmahesh1412 closed this 7 months ago

nmahesh1412 commented 1 year ago

This was noted when passing multiple beams into the run while the beam_dict was not shared among all the mpirun processes. The simulation just stalled; it didn't crash and didn't give a warning or error.

jpober commented 8 months ago

This message from the Slack seems to suggest a resolution:

I was able to resolve this issue and have now successfully passed 2 beams into pyuvsim. The main issue was that I didn't share the beam list dict among all the processes in the mpirun. But I will open a GitHub issue to have better warning outputs from pyuvsim instead of just causing it to stall.

bhazelton commented 8 months ago

@nmahesh1412 We'd like to try to get in the right warnings, but we're a little fuzzy on the details. Can you either give us a small example that has this problem so we can add it to our tests or point to where in the code the warning needs to go?

nmahesh1412 commented 8 months ago

So I had passed dict_t to the pyuvsim run as follows, without sharing it among all mpirun processes:

beam_list = comm.bcast(beam_list, root=0)
catalog_formatted.share(root=0)

start_time = time.time()
output_uv = pyuvsim.uvsim.run_uvdata_uvsim(
    input_uv=uv,
    beam_list=beam_list,
    beam_dict=dict_t,  # dict_t had not been broadcast to the other ranks
    catalog=catalog_formatted,
    quiet=False,
)

Adding dict_t = comm.bcast(dict_t, root=0) before calling pyuvsim fixed the issue.
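
For reference, the fix described above can be wrapped in a small helper so that the beam list and the beam dict are always broadcast together. This is only a minimal sketch: `share_beam_inputs` is a hypothetical name and not part of pyuvsim, and `comm` is assumed to be an mpi4py communicator such as `MPI.COMM_WORLD`.

```python
def share_beam_inputs(comm, beam_list, beam_dict, root=0):
    """Broadcast both beam inputs from `root` so all ranks hold the same values.

    Hypothetical helper, not part of pyuvsim. `comm` is assumed to be an
    mpi4py communicator (anything with a compatible `bcast` method works).
    """
    # bcast is a collective call: every rank must make it, and on each rank
    # it returns the root's copy of the object.
    beam_list = comm.bcast(beam_list, root=root)
    beam_dict = comm.bcast(beam_dict, root=root)
    return beam_list, beam_dict


# Usage under mpirun (assumes mpi4py is installed):
#   from mpi4py import MPI
#   comm = MPI.COMM_WORLD
#   beam_list, dict_t = share_beam_inputs(comm, beam_list, dict_t)
```

Calling the helper on every rank just before the simulation run keeps the two inputs from drifting apart, which is the situation reported here.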

nmahesh1412 commented 8 months ago

And not sharing the beam_dict just caused the code to stall without any warnings.
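
One way the requested error could look is a collective check that runs before the simulation starts. The sketch below is not pyuvsim code: `check_shared_everywhere` is a hypothetical name, `comm` is assumed to be an mpi4py communicator, and it assumes that an unshared quantity shows up as `None` on the ranks that never received it, which may not match what happened in this report.

```python
def check_shared_everywhere(comm, name, value):
    """Raise on every rank if `value` is set on some ranks but None on others.

    Hypothetical helper, not part of pyuvsim. Assumes an unshared quantity
    is None on the ranks that did not receive it.
    """
    # allgather is collective: every rank receives the same list, so every
    # rank reaches the same decision and none is left waiting on the others.
    present = comm.allgather(value is not None)
    if any(present) and not all(present):
        missing = [rank for rank, ok in enumerate(present) if not ok]
        raise ValueError(
            f"{name} is missing on MPI ranks {missing}; "
            f"broadcast it first, e.g. {name} = comm.bcast({name}, root=0)"
        )


# Usage under mpirun (assumes mpi4py is installed):
#   from mpi4py import MPI
#   check_shared_everywhere(MPI.COMM_WORLD, "beam_dict", dict_t)
```

Because every rank sees the same gathered list, all ranks raise together rather than some ranks failing while the rest block in a later collective call.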

mkolopanis commented 7 months ago

Given that these lines are the quintessential sharing step for all sims, is there any harm in just moving them into run_uvdata_uvsim?

https://github.com/RadioAstronomySoftwareGroup/pyuvsim/blob/80b9f6ac81205e2e27bc4071c908141d3681069b/pyuvsim/uvsim.py#L958-L961