NOAA-GFDL / NDSL

NOAA NASA Domain Specific Language middleware layer

Halo Update Fails Quietly on Buffer Size Mismatch #15

Open oelbert opened 7 months ago


Is your feature request related to a problem? Please describe.

If there is a buffer size mismatch in a halo update, the update crashes, but the stack trace does not describe the cause. For example, running a scalar halo update on a Quantity with dimensions X_DIM, Y_INTERFACE_DIM crashes, and the stack trace ends inside mpi4py:

`File /usr/local/lib/python3.8/site-packages/ndsl/comm/communicator.py:362, in Communicator.halo_update(self, quantity, n_points)
    359 quantities = quantity
    361 halo_updater = self.start_halo_update(quantities, n_points)
--> 362 halo_updater.wait()

File /usr/local/lib/python3.8/site-packages/ndsl/halo/updater.py:284, in HaloUpdater.wait(self)
    282 send_req.wait()
    283 for recv_req in self._recv_requests:
--> 284 recv_req.wait()
    286 # Unpack buffers (updated by MPI with neighbouring halos)
    287 # to proper quantities
    288 with self._timer.clock("unpack"):

File mpi4py/MPI/Request.pyx:266, in mpi4py.MPI.Request.wait()

File mpi4py/MPI/msgpickle.pxi:450, in mpi4py.MPI.PyMPI_wait()`

Describe the solution you'd like

It would be nice to have a check in the halo update code that raises a descriptive error immediately, before any MPI communication starts, at least when the x and y dimensions are mismatched on a scalar halo update.
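
A rough sketch of what such a check could look like, to make the request concrete. Everything here is an assumption for illustration: the helper name `check_scalar_halo_dims` is hypothetical, the dimension strings are local stand-ins for the NDSL constants (whose actual values may differ), and the rule "x and y must have the same staggering" is only the case described in this issue, not a full buffer size validation.

```python
# Stand-in dimension names, defined locally so the sketch is self-contained.
# Assumption: the real constants are provided by NDSL and may have other values.
X_DIM = "x"
X_INTERFACE_DIM = "x_interface"
Y_DIM = "y"
Y_INTERFACE_DIM = "y_interface"
Z_DIM = "z"


def check_scalar_halo_dims(dims):
    """Raise ValueError if the x and y dimensions differ in staggering.

    Hypothetical helper, not part of NDSL. `dims` is a tuple of dimension
    names, such as the dims of the Quantity passed to a scalar halo update.
    """
    x_dims = [d for d in dims if d in (X_DIM, X_INTERFACE_DIM)]
    y_dims = [d for d in dims if d in (Y_DIM, Y_INTERFACE_DIM)]
    if len(x_dims) != 1 or len(y_dims) != 1:
        # Not exactly one x and one y dimension: this sketch does not
        # attempt to validate such quantities.
        return
    x_on_interface = x_dims[0] == X_INTERFACE_DIM
    y_on_interface = y_dims[0] == Y_INTERFACE_DIM
    if x_on_interface != y_on_interface:
        raise ValueError(
            "scalar halo update needs matching x and y staggering, "
            f"got dims {tuple(dims)}; send and receive buffers would "
            "differ in size"
        )


if __name__ == "__main__":
    check_scalar_halo_dims((X_DIM, Y_DIM, Z_DIM))  # passes silently
    try:
        check_scalar_halo_dims((X_DIM, Y_INTERFACE_DIM))  # case from this issue
    except ValueError as err:
        print(f"caught: {err}")
```

Calling something like this at the top of the scalar halo update path would turn the opaque failure inside `recv_req.wait()` into an error that names the offending dimensions. A more general version could compare the actual send and receive buffer sizes instead of only the dimension names.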