An application may call MPI_Waitany after it calls MPI_Irecv. Waitany depends on the request ID to decide which message is complete. This change creates a virtual request when an Irecv is serviced from buffered packets.
The sequence that triggers the Waitany problem is:
[1] Nimrod posts several Irecv calls and keeps the requests they return.
[2] At checkpoint, MANA drains the Irecv messages and saves them in a buffer.
[3] On restart, MANA replays the Irecv calls. Each replayed Irecv sets
its request to MPI_REQUEST_NULL when it consumes the buffer saved in
step [2].
[4] Nimrod calls Waitany. At this point all requests are
MPI_REQUEST_NULL because of step [3], so Waitany returns MPI_UNDEFINED
as the index. This is wrong: it should return the index of a
completed Irecv.
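
The failure and the intended behavior can be sketched with a small stand-alone model. This is not MANA's code and does not call MPI. The names `replay_irecv_old`, `replay_irecv_new`, `model_waitany`, `REQ_NULL`, and `IDX_UNDEF` are hypothetical stand-ins, and the model assumes every virtual request is already complete because its data came from the buffer.

```c
#include <assert.h>
#include <stdbool.h>

#define REQ_NULL  (-1)  /* stand-in for MPI_REQUEST_NULL (assumption) */
#define IDX_UNDEF (-1)  /* stand-in for MPI_UNDEFINED (assumption) */
#define MAX_VREQ  16    /* arbitrary table size for this sketch */

typedef int request_t;  /* stand-in for MPI_Request */

/* Hypothetical table of live virtual requests. */
static bool vreq_in_use[MAX_VREQ];

/* Old behavior: an Irecv served from the buffer hands back a null request. */
request_t replay_irecv_old(void)
{
    return REQ_NULL;
}

/* New behavior: an Irecv served from the buffer hands back a virtual
 * request, so a later Waitany still has something to report on. */
request_t replay_irecv_new(void)
{
    for (int i = 0; i < MAX_VREQ; i++) {
        if (!vreq_in_use[i]) {
            vreq_in_use[i] = true;
            return i;
        }
    }
    return REQ_NULL;  /* table full; a real implementation would handle this */
}

/* Model of Waitany: report the first non-null request as complete,
 * release it, and set it to null. If every request is null, report
 * IDX_UNDEF, which mirrors what MPI_Waitany does in that case. */
int model_waitany(int count, request_t reqs[])
{
    for (int i = 0; i < count; i++) {
        if (reqs[i] != REQ_NULL) {
            assert(reqs[i] >= 0 && reqs[i] < MAX_VREQ);
            vreq_in_use[reqs[i]] = false;
            reqs[i] = REQ_NULL;
            return i;
        }
    }
    return IDX_UNDEF;
}
```

With `replay_irecv_old`, an array of replayed requests is all null and `model_waitany` reports `IDX_UNDEF`, as in step [4]. With `replay_irecv_new`, successive calls report each index in turn before the array runs out.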