microsoft / Microsoft-MPI

Microsoft MPI
MIT License

Problem with MPI_File_set_view #17

Open MrSmile opened 5 years ago

MrSmile commented 5 years ago

I get unexpected results from a fairly complex use of MPI_File_set_view. After some experimenting, I managed to reduce the test case to this:

#include <mpi.h>
#include <cstdio>

inline void check_error(int err)
{
    if(!err)
        return;

    int len;
    char buf[MPI_MAX_ERROR_STRING];
    MPI_Error_string(err, buf, &len);
    std::printf("MPI error: %.*s\n", len, buf);
}

int main(int n, char **arg)
{
    check_error(MPI_Init(&n, &arg));

    int current_node, node_count;
    check_error(MPI_Comm_rank(MPI_COMM_WORLD, &current_node));
    check_error(MPI_Comm_size(MPI_COMM_WORLD, &node_count));

    MPI_Datatype type;
    int sizes[] = {1, 1, 1};
    MPI_Aint offsets[] = {0, node_count + current_node, current_node};
    MPI_Datatype types[] = {MPI_LB, MPI_BYTE, MPI_BYTE};
    check_error(MPI_Type_create_struct(3, sizes, offsets, types, &type));
    check_error(MPI_Type_commit(&type));

    MPI_File file;
    check_error(MPI_File_open(MPI_COMM_WORLD, "test.txt",
        MPI_MODE_WRONLY | MPI_MODE_CREATE, MPI_INFO_NULL, &file));
    check_error(MPI_File_set_size(file, 0));

    MPI_Status status;
    int pos = 4;
    // Rank 0 writes a 4-byte header before the file view is set.
    if(!current_node)
        check_error(MPI_File_write(file, "::::", pos, MPI_BYTE, &status));
    check_error(MPI_File_set_view(file, pos, MPI_BYTE, type, "native", MPI_INFO_NULL));

    char buf[2];
    buf[0] = 'a' + current_node;
    buf[1] = 'A' + current_node;
    check_error(MPI_File_write(file, buf, 2, MPI_BYTE, &status));

    check_error(MPI_File_close(&file));
    check_error(MPI_Type_free(&type));
    check_error(MPI_Finalize());
}

I expect that after running on 4 nodes, test.txt will contain ::::ABCDabcd, and with another MPI implementation that is indeed the case. But under MS-MPI I get:

MPI error: Other I/O error, error stack:
Other I/O error Invalid access to memory location.

And with 1 node it even hangs.

When I use the direct element order instead of the reverse one (swapping the last two elements of the offsets array), I get a slightly better result of abcdABCD, but the first characters get wiped out.

AnnaDaly commented 5 years ago

Just out of curiosity, what other MPI implementation did you try? I can confirm that msmpi throws an error, but Intel MPI on the same code hangs for me.

MrSmile commented 5 years ago

I've tried it under different OSes, and it works fine with Open MPI under Linux (Ubuntu 18.04). Looks like most MPI implementations can't handle this.