ulfm-devel / ompi

Open MPI main development repository
https://www.open-mpi.org

Failure propagation and detection work only on comm_world #7

Open abouteiller opened 8 years ago

abouteiller commented 8 years ago

Original report by Aurelien Bouteiller (Bitbucket: abouteiller, GitHub: abouteiller).


Ideally, it should operate on the connected universe, but this is hard because not all processes have the same view of what the total universe is: the processes in the parent communicator of a spawn know about the "bulge" formed by the intercommunicator, but nobody else does.

An alternative solution is to work at the PMIx level.
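
To illustrate the mismatched views described above, here is a minimal toy model in plain Python. It is not Open MPI or PMIx code, and it makes no MPI calls: `initial_views` and `spawn` are hypothetical helpers invented for this sketch, and process identities are simplified to global integers. It assumes that a spawn over a subset of the world teaches only the participating parents about the children, and that the children know each other plus those parents.

```python
# Toy model only (hypothetical helpers, not Open MPI API): each process
# tracks the set of peer ids it knows about.

def initial_views(world_size):
    """Every rank in the initial world knows every rank in that world."""
    world = set(range(world_size))
    return {rank: set(world) for rank in world}

def spawn(views, parent_ranks, n_children):
    """Model a spawn performed over `parent_ranks` only.

    Assumption: only the parents taking part in the spawn learn about the
    children (the "bulge"); children know each other and those parents.
    """
    first_child = max(views) + 1
    children = set(range(first_child, first_child + n_children))
    for rank in parent_ranks:
        views[rank] |= children
    for child in children:
        views[child] = children | set(parent_ranks)
    return children

views = initial_views(4)
children = spawn(views, parent_ranks={0, 1}, n_children=2)

# The connected universe is every process that exists in the model.
universe = set(views)
for rank in sorted(views):
    unknown = sorted(universe - views[rank])
    print(f"process {rank}: knows {sorted(views[rank])}, unaware of {unknown}")
```

In this model, ranks 0 and 1 see the whole universe, while ranks 2 and 3 are unaware of the spawned processes and the children are unaware of ranks 2 and 3. A failure detector scoped to each process's own view would therefore not cover the same set of processes everywhere.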

abouteiller commented 4 years ago

Original comment by Aurelien Bouteiller (Bitbucket: abouteiller, GitHub: abouteiller).


PR #13 brings compatibility with an external failure detector (FD) in PMIx.