open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

osc/rdma: MPI_Get from self fails with dynamic windows #6394

Closed: ggouaillardet closed this issue 5 years ago

ggouaillardet commented 5 years ago

This was initially reported by Bart Janssens at https://www.mail-archive.com/users@lists.open-mpi.org//msg33035.html

The minimal program below crashes when run with more than one MPI task if the osc/rdma component is used

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank;
    int rbuf, lbuf;
    MPI_Win win;

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    rbuf = rank + 1;

    /* create a dynamic window and attach the local buffer to it */
    MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_attach(win, &rbuf, sizeof(int));

    /* get from self: for a dynamic window the target displacement
     * is the address of the attached buffer */
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank, 0, win);
    MPI_Get(&lbuf, 1, MPI_INT, rank, (MPI_Aint)&rbuf, 1, MPI_INT, win);
    MPI_Win_unlock(rank, win);

    if ((1+rank) != lbuf) {
        fprintf(stderr, "Got %d but expected %d\n", lbuf, 1+rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
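
For reference, the crash can be reproduced along the following lines (a sketch; the source file name is arbitrary, and forcing the component via the osc MCA parameter assumes a build in which osc/rdma is selectable):

mpicc get_self.c -o get_self
mpirun --mca osc rdma -np 2 ./get_self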
ggouaillardet commented 5 years ago

Here is a possible fix. The right fix might instead be to set the peer flags correctly, so that ompi_osc_rdma_peer_local_base(peer) always returns true when peer is the current MPI task.

diff --git a/ompi/mca/osc/rdma/osc_rdma_comm.c b/ompi/mca/osc/rdma/osc_rdma_comm.c
index 1c16676..011b7c2 100644
--- a/ompi/mca/osc/rdma/osc_rdma_comm.c
+++ b/ompi/mca/osc/rdma/osc_rdma_comm.c
@@ -3,8 +3,8 @@
  * Copyright (c) 2014-2018 Los Alamos National Security, LLC.  All rights
  *                         reserved.
  * Copyright (c) 2016-2018 Intel, Inc. All rights reserved.
- * Copyright (c) 2017      Research Organization for Information Science
- *                         and Technology (RIST). All rights reserved.
+ * Copyright (c) 2017-2019 Research Organization for Information Science
+ *                         and Technology (RIST).  All rights reserved.
  * Copyright (c) 2017      IBM Corporation. All rights reserved.
  * $COPYRIGHT$
  *
@@ -795,7 +795,7 @@ static inline int ompi_osc_rdma_get_w_req (ompi_osc_rdma_sync_t *sync, void *ori
     }

     /* optimize self/local communication */
-    if (ompi_osc_rdma_peer_local_base (peer)) {
+    if (ompi_osc_rdma_peer_local_base (peer) || (ompi_group_rank(module->win->w_group) == peer->rank)) {
         return ompi_osc_rdma_copy_local ((void *) (intptr_t) source_address, source_count, source_datatype,
                                          origin_addr, origin_count, origin_datatype, request);
     }
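
For context, ompi_osc_rdma_peer_local_base() boils down to a flag test on the peer, roughly as sketched below (a paraphrase of the inline helper in osc_rdma_peer.h, not a verbatim copy). That is why an alternative to the patch above is to set the flag itself on the self peer:

/* sketch (paraphrased): the local-base check only looks at peer flags */
static inline bool ompi_osc_rdma_peer_local_base (ompi_osc_rdma_peer_t *peer) {
    return !!(peer->flags & OMPI_OSC_RDMA_PEER_LOCAL_BASE);
}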
ggouaillardet commented 5 years ago

Here is another possible fix:

diff --git a/ompi/mca/osc/rdma/osc_rdma_component.c b/ompi/mca/osc/rdma/osc_rdma_component.c
index d379350..5afac94 100644
--- a/ompi/mca/osc/rdma/osc_rdma_component.c
+++ b/ompi/mca/osc/rdma/osc_rdma_component.c
@@ -19,6 +19,8 @@
  * Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018      Cisco Systems, Inc.  All rights reserved
  * Copyright (c) 2018      Amazon.com, Inc. or its affiliates.  All Rights reserved.
+ * Copyright (c) 2019      Research Organization for Information Science
+ *                         and Technology (RIST).  All rights reserved.
  * $COPYRIGHT$
  *
  * Additional copyrights may follow
@@ -722,7 +724,13 @@ static int allocate_state_shared (ompi_osc_rdma_module_t *module, void **base, s

             ompi_osc_module_add_peer (module, peer);

-            if (MPI_WIN_FLAVOR_DYNAMIC == module->flavor || 0 == temp[i].size) {
+            if (MPI_WIN_FLAVOR_DYNAMIC == module->flavor) {
+                if (module->use_cpu_atomics && peer_rank == my_rank) {
+                    peer->flags |= OMPI_OSC_RDMA_PEER_LOCAL_BASE;
+                }
+                /* nothing more to do */
+                continue;
+            } else if (0 == temp[i].size) {
                 /* nothing more to do */
                 continue;
             }
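
With this change, the dynamic flavor skips the shared-segment bookkeeping for every peer, but first marks the self peer with OMPI_OSC_RDMA_PEER_LOCAL_BASE when CPU atomics are usable. ompi_osc_rdma_peer_local_base() then returns true for the current task, so the existing local-copy path in get/put is taken without touching osc_rdma_comm.c.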
hjelmn commented 5 years ago

@ggouaillardet The second fix is the correct one. This is another case that wasn't well tested; we need an MTT test for get from self in the dynamic case.

ggouaillardet commented 5 years ago

Thanks, I will open a PR against master and then v4.0.x, and add the test to the ibm test suite.

hppritcha commented 5 years ago

The fix has been merged into master and v4.0.x, so closing.