cornelisnetworks / opa-psm2


Uninitialized AM message #24

Closed RaymondMichael closed 6 years ago

RaymondMichael commented 6 years ago

In ips_proto_am(), in the IPS_MSG_ORDER_FUTURE_RECV case, the code gets an ips_am_message from an mpool and then references msg->proto_am->proto->mq. The problem is that msg is uninitialized at that point, so msg->proto_am is NULL.

https://github.com/intel/opa-psm2/blob/5fabd0e699a920e74333f789923fd1c02bb7c629/ptl_ips/ips_proto_am.c#L566
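To illustrate the class of bug described above, here is a minimal sketch of a pool whose "get" hands back a slot without initializing it, plus a wrapper that fills every field before the caller can dereference anything. All names here (`toy_pool`, `toy_am_message`, `toy_msg_alloc`, and so on) are hypothetical stand-ins, not the PSM2 types or API, and this is not a reconstruction of the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real PSM2 structures. */
struct toy_proto_am { int id; };

struct toy_am_message {
    struct toy_proto_am *proto_am;   /* must be set before it is dereferenced */
    struct toy_am_message *next;
};

#define TOY_POOL_SIZE 4

struct toy_pool {
    struct toy_am_message slots[TOY_POOL_SIZE];
    size_t used;
};

/* Hands out the next slot as-is: its fields hold whatever was there before. */
static struct toy_am_message *toy_pool_get(struct toy_pool *pool)
{
    assert(pool != NULL);
    if (pool->used == TOY_POOL_SIZE)
        return NULL;
    return &pool->slots[pool->used++];
}

/* Gets a slot and initializes every field before returning it. */
static struct toy_am_message *toy_msg_alloc(struct toy_pool *pool,
                                            struct toy_proto_am *proto_am)
{
    struct toy_am_message *msg = toy_pool_get(pool);
    if (msg == NULL)
        return NULL;
    memset(msg, 0, sizeof(*msg));
    msg->proto_am = proto_am;
    return msg;
}
```

With a wrapper like this, a later `msg->proto_am->...` access sees the pointer the caller supplied rather than leftover pool contents.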

rwmcguir commented 6 years ago

We are looking into this, can you provide any other information about how this was found? i.e. is there a specific workload that was run (perhaps a standard benchmark?) with specific parameters?

stoffelj commented 6 years ago

It's not as straightforward as sharing a specific benchmark with you, since HPE MPT is being used as the MPI. We are running 2 OPA rails when the failure occurs. I'll investigate to see what more useful info I can provide.

stoffelj commented 6 years ago

Here is a simplified program. See the calls to do_gets for the size at which the failure starts. Note that it fails with 2 OPA rails, not with 1.

    /* Headers required by the code below (the header names in the
       post's six #include lines did not survive). */
    #include <assert.h>
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define BUFSZ (1024 * 1024)

    static int rank, size;
    static int *data;

    static void do_gets(size_t num, int loops)
    {
        MPI_Win win;
        int i, *loc;

        loc = (int*)malloc(num * loops * sizeof(int));
        assert(loc);

        MPI_Win_create(data, BUFSZ * sizeof(int), 0, MPI_INFO_NULL,
                         MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);

        for (i = 0; i < loops; i++) {
             if (0 == rank) {
                    MPI_Get(&loc[i * num], num, MPI_INT, (rank + i) % size, 1,
                         num, MPI_INT, win);
             }
        }

        MPI_Win_fence(0, win);

        free(loc);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        data = (int*)calloc(BUFSZ, sizeof(int));
        assert(data);
        /* We want the loop counts high enough to exhaust the send headers */
        if (0 == rank) {printf("Doing \n");}

        // do_gets(2031, 4); // passes
        // do_gets(2032, 4); // passes
        do_gets(2033, 4); // fails

        MPI_Finalize();
        return 0;
    }

jdinan commented 6 years ago

The disp_unit argument to MPI_Win_create is invalid. The MPI standard requires this to be a positive integer (1 or greater, where 1 means byte displacements). The disp argument to MPI_Get (1) is also a little odd (no pun intended). It almost looks like you might want to exchange these, window disp_unit of 1 (byte displacements) and get disp of 0 (start at the beginning of the buffer).
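The point about `disp_unit` can be made concrete with a little arithmetic: for a one-sided operation, the MPI standard defines the target address as the window base plus `disp_unit` times the target displacement. Below is a minimal sketch of that calculation in plain C, with no MPI calls. The helper name `toy_target_offset` is hypothetical, and rejecting values below 1 mirrors the requirement stated above rather than the behavior of any particular MPI library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Computes the byte offset into the target window as
 * disp_unit * target_disp. Returns 0 on success and -1 when disp_unit
 * is not a positive integer.
 */
static int toy_target_offset(int disp_unit, size_t target_disp,
                             size_t *offset_out)
{
    assert(offset_out != NULL);
    if (disp_unit < 1)
        return -1;
    *offset_out = (size_t)disp_unit * target_disp;
    return 0;
}
```

For example, with a `disp_unit` of 4 (a 4-byte `int`, which is an assumption about the platform), a target displacement of 2033 lands 8132 bytes into the window, while a `disp_unit` of 1 with a displacement of 0 starts at the first byte.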

stoffelj commented 6 years ago

I get the same failure with the following code, which includes the corrections you were concerned about. It shows that a change of 1 int in the transfer size can cause the failure. Again, it only fails with RAILS=2.

    /* Headers required by the code below (the header names in the
       post's six #include lines did not survive). */
    #include <assert.h>
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define BUFSZ (1024 * 1024)

    static int rank, size;
    static int *data;

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Win win;
        int *loc;
        int num = 2033;

        data = (int*)calloc(BUFSZ, sizeof(int));
        assert(data);

        if (0 == rank) {printf("Doing \n");}

        loc = (int*)malloc(4096 * 4 * sizeof(int));
        assert(loc);

        printf("data:%p loc:%p \n", (void*)data, (void*)loc);

        MPI_Win_create(data, BUFSZ * sizeof(int), sizeof(int), MPI_INFO_NULL,
                         MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);

        if (0 == rank) {
            // Passes
            MPI_Get(&loc[0 * num], num-1, MPI_INT, 1, 0, num-1, MPI_INT, win);
            /* Fails: replace the above get with this one to get failure.
            MPI_Get(&loc[0 * num], num, MPI_INT, 1, 0, num, MPI_INT, win);
            */
            MPI_Get(&loc[1 * num], num, MPI_INT, 1, 1 * num, num, MPI_INT, win);
            MPI_Get(&loc[2 * num], 1, MPI_INT, 1, 2 * num, 1, MPI_INT, win);
        }

        MPI_Win_fence(0, win);

        free(loc);

        MPI_Finalize();
        return 0;
    }
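As a sanity check on the reproducer itself, the three gets read element ranges starting at 0, num, and 2 * num, and all of them should sit inside both the local buffer (4096 * 4 ints) and the window (BUFSZ ints). The sketch below checks that arithmetic in plain C. The helper names `toy_range_fits` and `toy_int_bytes` are hypothetical, and nothing here explains why the one-int size change matters.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if [start, start + count) lies within a buffer of
   `capacity` elements, else 0. Written to avoid overflow in the sum. */
static int toy_range_fits(size_t start, size_t count, size_t capacity)
{
    assert(capacity > 0);
    return start <= capacity && count <= capacity - start;
}

/* Size in bytes of `count` ints on the current platform. */
static size_t toy_int_bytes(size_t count)
{
    return count * sizeof(int);
}
```

With num = 2033, the last element touched is index 2 * 2033 = 4066, well inside both buffers, so the test program does not overrun its own memory. Assuming a 4-byte `int`, the passing and failing first gets transfer 8128 and 8132 bytes respectively.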

stoffelj commented 6 years ago

Would a PSM trace of a failing and passing run be helpful in determining what is happening here?

stoffelj commented 6 years ago

Any feedback related to Michael's entry on March 8? Do you need anything else from us?

jdinan commented 6 years ago

I'm just a bystander, perhaps @rwmcguir can help?

rwmcguir commented 6 years ago

@stoffelj I sent you an email; can we work this offline? I think there might be some details that are larger than GitHub can share (like MPI runtime binaries).

aravindksg commented 6 years ago

Issue fixed with commit 8a12e84