Closed by abouteiller 6 years ago
Original comment by Aurelien Bouteiller (Bitbucket: abouteiller, GitHub: abouteiller).
Demoted to minor. Since we now automatically revoke/cancel only PML requests, this bug is of lower importance.
Original comment by Aurelien Bouteiller (Bitbucket: abouteiller, GitHub: abouteiller).
Cancelling a coll/comm request is invalid per the MPI spec, and we have stopped doing it, so we are fine.
Original report by Aurelien Bouteiller (Bitbucket: abouteiller, GitHub: abouteiller).
That's an upstream defect that affects only us:
coll_comm requests are placeholders for non-blocking collectives performed during next-cid and friends.
Cancelling that request in turn cancels each component of the non-blocking collective request (i.e. a form of generalized request). The top-level cancel takes the comm_request_lock, and each component request then tries to take the same lock in its own cancel, which deadlocks or aborts.
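For illustration, here is a minimal sketch of the locking pattern described above. It is not the actual Open MPI code: only the name comm_request_lock comes from this report, and the functions toplevel_cancel, component_cancel, and demo_nested_cancel are hypothetical. The sketch assumes a non-recursive lock and uses a POSIX error-checking mutex so that the nested acquisition reports EDEADLK instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the lock named in the report. */
static pthread_mutex_t comm_request_lock;

/* Hypothetical cancel of one component request: it takes the lock itself. */
static int component_cancel(void)
{
    int rc = pthread_mutex_lock(&comm_request_lock);
    if (rc != 0) {
        return rc; /* EDEADLK when the calling thread already holds the lock */
    }
    rc = pthread_mutex_unlock(&comm_request_lock);
    assert(rc == 0);
    return 0;
}

/* Hypothetical top-level cancel: holds the lock while cancelling components. */
static int toplevel_cancel(int ncomponents)
{
    int first_err = 0;
    int rc = pthread_mutex_lock(&comm_request_lock);
    if (rc != 0) {
        return rc;
    }
    for (int i = 0; i < ncomponents; i++) {
        rc = component_cancel(); /* nested acquisition of the same lock */
        if (rc != 0 && first_err == 0) {
            first_err = rc;
        }
    }
    rc = pthread_mutex_unlock(&comm_request_lock);
    assert(rc == 0);
    return first_err;
}

/* Sets up an error-checking mutex, runs the nested cancel, and returns
 * the first error seen (0 if there was none). */
int demo_nested_cancel(int ncomponents)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) {
        rc = pthread_mutex_init(&comm_request_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = toplevel_cancel(ncomponents);
    pthread_mutex_destroy(&comm_request_lock);
    return rc;
}
```

Calling demo_nested_cancel with at least one component returns EDEADLK. With a default (non-error-checking) mutex, relocking from the same thread would typically hang instead, which matches the deadlock side of the symptom.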