OpenMDAO / testflo

A simple python testing framework that can run unit tests under MPI (or not).

Error handling with mpi4py #27

Closed ewu63 closed 5 years ago

ewu63 commented 5 years ago

Does testflo support MPI testing where the assertion is only performed on one processor? Consider the following code:

from mpi4py import MPI
import unittest


class MyMPI_TestCase(unittest.TestCase):
    N_PROCS = 4  # number of MPI processes testflo uses for this test case

    def setUp(self):
        self.comm = MPI.COMM_WORLD
        self.rank = self.comm.rank

    def my_assert(self, statement):
        # the assertion is only checked (and only raises) on rank 0
        if self.rank == 0:
            if not statement:
                raise AssertionError()

    def test_foo(self):
        self.my_assert(1 == 2)
        # some parallel analysis that requires all procs
        self.comm.barrier()  # this will hang


if __name__ == '__main__':
    unittest.main()

This hangs because the other processes are unaware of the error raised on rank 0 and wait at the barrier indefinitely.

I understand this is related to the broader question of error handling under mpi4py which is discussed here, but the proposed solution cannot be applied in this case. Some further discussion can be found here, but I couldn't get that approach to work, and I'm not really sure how testflo works with MPI in general. Is there a way to handle this within testflo in a consistent manner?

naylor-b commented 5 years ago

We created a context manager called multi_proc_exception_check, which you can find in openmdao.utils.mpi, that we use to deal with situations where an exception is raised in only a subset of the MPI procs. However, this solution only works when the code wrapped by the context manager doesn't contain any collective MPI calls that could happen after an exception is raised. In your example, it should work if you wrap the context manager around your call to self.my_assert.
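To make the idea concrete, here is a minimal sketch of how a context manager of this kind can work. It is not the OpenMDAO implementation: the names all_ranks_fail_together and FakeComm are hypothetical, and FakeComm is only a stand-in so the sketch runs without MPI. The one assumption about the communicator is that it has an allreduce method that sums a value over all ranks, as mpi4py communicators do by default.

```python
from contextlib import contextmanager


@contextmanager
def all_ranks_fail_together(comm):
    """Hypothetical helper: if the guarded block raises on any rank, raise on every rank."""
    error = None
    try:
        yield
    except Exception as exc:
        # Remember the local failure instead of propagating it right away,
        # so this rank still takes part in the collective call below.
        error = exc

    # Collective call: every rank reaches it whether or not it failed locally.
    n_failed = comm.allreduce(1 if error is not None else 0)

    if error is not None:
        # This rank failed: re-raise its own exception unchanged.
        raise error
    if n_failed:
        # This rank was fine, but some other rank was not.
        raise RuntimeError(
            f"exception raised on {n_failed} other rank(s) inside the guarded block"
        )


class FakeComm:
    """Stand-in for an MPI communicator (illustration only, not part of mpi4py)."""

    def __init__(self, failures_elsewhere=0):
        # Pretend that this many other ranks raised inside the guarded block.
        self.failures_elsewhere = failures_elsewhere

    def allreduce(self, value):
        # A real communicator would sum `value` over all ranks.
        return value + self.failures_elsewhere
```

With mpi4py, comm would be MPI.COMM_WORLD, and in the test above the call to self.my_assert(1 == 2) would sit inside `with all_ranks_fail_together(self.comm):`, ahead of the barrier. The same restriction applies as for the OpenMDAO helper: the guarded block must not contain collective calls that could come after the exception, because the ranks that did not fail would wait in them.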

ewu63 commented 5 years ago

Thanks for that; it does seem to work for the simpler cases. I'm still getting some MPI hangs in the main tests, but that may be an unrelated issue, so I'm closing this.