microsoft / Microsoft-MPI

Microsoft MPI
MIT License

Calling MPI_Cancel does not guarantee that MPI_Wait returns in MPI_THREAD_MULTIPLE mode #18

Closed by sdebionne 4 years ago

sdebionne commented 5 years ago

The following snippet, previously discussed on Stack Overflow and validated with MPICH, Open MPI, and Intel MPI, does not work properly with MS-MPI 10.0.12498.5.

In one thread, a loop continuously listens for requests with consecutive calls to MPI_Irecv and MPI_Wait. To exit the loop cleanly, MPI_Cancel is called from another thread to cancel the pending request. According to the standard:

If a communication is marked for cancellation, then a MPI_WAIT call for that communication is guaranteed to return.

#include <mpi.h>

#include <chrono>
#include <future>
#include <iostream>
#include <thread>

using namespace std::literals::chrono_literals;

// Runs on a second thread: cancels the pending request after a one-second delay.
void async_cancel(MPI_Request *request)
{
    std::this_thread::sleep_for(1s);

    std::cout << "Before MPI_Cancel" << std::endl;

    int res = MPI_Cancel(request);
    if (res != MPI_SUCCESS)
        std::cerr << "MPI_Cancel failed" << std::endl;

    std::cout << "After MPI_Cancel" << std::endl;
}

int main(int argc, char* argv[])
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided != MPI_THREAD_MULTIPLE)
        std::cout << "MPI_Init_thread could not provide MPI_THREAD_MULTIPLE" << std::endl;

    int rank, numprocs;
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Request request;
    MPI_Status status;

    int buffer;

    if (rank == 0)
    {
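        // Post a receive that is never matched: rank 1 sends nothing.
        MPI_Irecv(&buffer, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &request);

        // Cancel the pending receive from a second thread after one second.
        auto res = std::async(std::launch::async, &async_cancel, &request);

        std::cout << "Before MPI_Wait" << std::endl;

        // Expected to return once the request has been marked for cancellation.
        MPI_Wait(&request, &status);

        std::cout << "After MPI_Wait" << std::endl;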
    }
    else
        std::this_thread::sleep_for(2s);

    MPI_Finalize();
    return 0;
}

The expected result is:

Before MPI_Wait
Before MPI_Cancel
After MPI_Cancel
After MPI_Wait

With MS-MPI, MPI_Wait() does not return.
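
One pattern that avoids cancelling a request while another thread is blocked on it is to poll for completion and check a stop flag between polls. The sketch below shows only the loop logic. `poll_until_done_or_stopped` is a hypothetical helper, not part of MPI or MS-MPI, and the MPI call is abstracted behind a callable so the control flow can be read on its own. Whether this sidesteps the behaviour reported above on MS-MPI is an assumption that has not been tested here.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper (not an MPI or MS-MPI API).
// Repeatedly calls `test_done` until it reports completion or until another
// thread sets `stop`. Returns true if the operation completed, false if the
// loop was asked to stop first.
template <typename TestFn>
bool poll_until_done_or_stopped(TestFn test_done,
                                const std::atomic<bool>& stop,
                                std::chrono::milliseconds interval)
{
    while (!stop.load(std::memory_order_acquire))
    {
        if (test_done())
            return true;   // the pending operation has completed

        // Back off briefly so the polling thread does not spin at full speed.
        std::this_thread::sleep_for(interval);
    }
    return false;          // stop requested before completion
}
```

In the listening loop, `test_done` would wrap MPI_Test on the pending request and return whether its flag is non-zero. When the helper returns false, the listening thread itself can call MPI_Cancel followed by MPI_Wait, so that only one thread ever touches the request. The cost is added latency of up to one polling interval per message.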