score-p / scorep_binding_python

Allows tracing of python code using Score-P

Hanging with mpi4py.future #106

Open jychoi-hpc opened 4 years ago

jychoi-hpc commented 4 years ago

I am trying to use this Score-P Python binding with mpi4py.futures, without success. My code just hangs without making progress.

Here is my example code (future_hello.py):

from mpi4py import MPI
from mpi4py.futures import MPICommExecutor

def hello(x):
    name = MPI.Get_processor_name()
    me = MPI.COMM_WORLD.Get_rank()
    nproc = MPI.COMM_WORLD.Get_size()
    print (me, nproc, name, 'arg', x)

if __name__ == '__main__':
    nproc = MPI.COMM_WORLD.Get_size()
    with MPICommExecutor(MPI.COMM_WORLD, root=0) as executor:
        if executor is not None:
            for i in range(nproc*2):
                future = executor.submit(hello, i)
            print ("Done.")

I am trying to run as follows:

srun -n 4 python -m scorep --mpp=mpi --thread=pthread future_hello.py 

Without Score-P tracing, I expect output something like this:

1 4 nid02401 arg 0
2 4 nid02401 arg 1
3 4 nid02401 arg 2
2 4 nid02401 arg 3
3 4 nid02401 arg 4
1 4 nid02401 arg 5
2 4 nid02401 arg 6
3 4 nid02401 arg 7

I am wondering if this is an expected error or if there is any fix I can try.

Thanks in advance for any advice.

AndreasGocht commented 4 years ago

Hey,

I can reproduce the issue. Which MPI implementation do you use?

Best,

Andreas

jychoi-hpc commented 4 years ago

Thank you for looking at this.

The above example is from Cori at NERSC, which uses MPICH. I also tested with OpenMPI on my local desktop and saw the same problem.

AndreasGocht commented 4 years ago

It took me a while to dig through mpi4py to understand what happens. It looks like a Score-P bug, as I was able to write a pure MPI C program with the same behaviour.

I'll raise a ticket with the Score-P developers, but I am not sure how fast this issue can be solved.

Basically, it seems to relate to the way MPI_Comm_create and then MPI_Intercomm_create are used. The related lines are:

https://bitbucket.org/mpi4py/mpi4py/src/6ad13434227f5afcdeb1d733e4eb121d17b50ed1/src/mpi4py/futures/_lib.py#lines-242
https://bitbucket.org/mpi4py/mpi4py/src/6ad13434227f5afcdeb1d733e4eb121d17b50ed1/src/mpi4py/MPI/Comm.pyx#lines-165
https://bitbucket.org/mpi4py/mpi4py/src/6ad13434227f5afcdeb1d733e4eb121d17b50ed1/src/mpi4py/MPI/Comm.pyx#lines-1394
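
For illustration, here is a minimal mpi4py sketch of that pattern (my approximation of what the linked lines do, not the actual mpi4py.futures code). With Score-P attached and at least three ranks, measurement hangs around the inter-communicator creation:

from mpi4py import MPI

world = MPI.COMM_WORLD
rank = world.Get_rank()
world_group = world.Get_group()

# Rank 0 forms a group of its own; all remaining ranks form a second group.
group = world_group.Incl([0]) if rank == 0 else world_group.Excl([0])

# MPI_Comm_create: every rank obtains an intra-communicator for its group.
groupcomm = world.Create(group)

# MPI_Intercomm_create: connect the two groups via their leaders.
remote_leader = 1 if rank == 0 else 0
intercomm = groupcomm.Create_intercomm(0, world, remote_leader, 0)

print(rank, "inter-communicator created")

intercomm.Free()
groupcomm.Free()
group.Free()
world_group.Free()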

Best,

Andreas

AndreasGocht commented 4 years ago

It turns out that this is a known Score-P open issue:

  • If an application uses MPI inter-communicators, Score-P measurement will hang during the creation of the communicator.

There is nothing I can do from the Score-P Python bindings side. It might be possible to patch the mpi4py code. However, I'll document my findings here and leave the ticket open until a solution is found.

Sorry that I do not have more positive news.

Best,

Andreas

AndreasGocht commented 4 years ago

Example source code

Below is a basic C example that reproduces the issue. It hangs only with three or more processes; without Score-P the code works fine.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) 
{ 
    MPI_Group MPI_GROUP_WORLD, group; 
    MPI_Comm groupcomm;
    MPI_Comm intercomm;

    static int list_a[] = {0}; 
    int global_rank = -1;     
    int size_list_a = sizeof(list_a)/sizeof(int); 

    MPI_Init(&argc, &argv);     
    MPI_Comm_rank(MPI_COMM_WORLD, &global_rank); 
    MPI_Comm_group(MPI_COMM_WORLD, &MPI_GROUP_WORLD); 

    if (global_rank == 0)
    {    
        MPI_Group_incl(MPI_GROUP_WORLD, size_list_a, list_a, &group);
    }
    else
    {
        MPI_Group_excl(MPI_GROUP_WORLD, size_list_a, list_a, &group);
    }

    int remote_leader;
    if (global_rank == 0)
    {
        remote_leader = 1;
    }
    else
    {
        remote_leader = 0;
    }

    fprintf(stderr,"RANK %d begin MPI_Comm_create\n", global_rank);
    MPI_Comm_create(MPI_COMM_WORLD, group, &groupcomm); 
    fprintf(stderr,"RANK %d end MPI_Comm_create\n", global_rank);

    fprintf(stderr,"RANK %d begin MPI_Intercomm_create\n", global_rank);
    MPI_Intercomm_create(groupcomm, 0, MPI_COMM_WORLD, remote_leader, 0, &intercomm);
    fprintf(stderr,"RANK %d end MPI_Intercomm_create\n", global_rank);

    int local_rank = -1;
    MPI_Comm_rank(MPI_COMM_WORLD, &local_rank);

    printf("my_global_rank %d, my_local_rank %d \n",global_rank, local_rank);

    MPI_Comm_free(&groupcomm); 
    MPI_Comm_free(&intercomm); 

    MPI_Group_free(&group); 
    MPI_Group_free(&MPI_GROUP_WORLD); 
    MPI_Finalize(); 
} 

Execute

scorep mpicc mpi_comm_create_example.c -o test
mpiexec -n 3 ./test

Program description

The program does the following:

  • Splits MPI_COMM_WORLD into two groups: rank 0 alone (MPI_Group_incl) and all remaining ranks (MPI_Group_excl).
  • Creates an intra-communicator for each group with MPI_Comm_create.
  • Connects the two groups with MPI_Intercomm_create, using world ranks 0 and 1 as the respective group leaders.
  • Prints each rank and frees the communicators and groups.

Analysis

Looking into the Score-P code (6.0) with DDT, it turns out that rank 0 stops at the PMPI_Barrier() in MPI_Finalize() (SCOREP_Mpi_Env.c:314), while the other ranks (the non-root group) wait for a PMPI_Bcast() in scorep_mpi_comm_create_id() (scorep_mpi_communicator_mgmt.c:267) that never completes.

maximilian-tech commented 1 year ago

Tracing of MPI inter-communicators is now supported in Score-P 8.0, so the example no longer hangs.

However, the following code

from mpi4py import MPI
from mpi4py.futures import MPICommExecutor

def hello(x):
    name = MPI.Get_processor_name()
    me = MPI.COMM_WORLD.Get_rank()
    nproc = MPI.COMM_WORLD.Get_size()
    print (me, nproc, name, 'arg', x)

if __name__ == '__main__':
    nproc = MPI.COMM_WORLD.Get_size()
    with MPICommExecutor(MPI.COMM_WORLD, root=0) as executor:
        if executor is not None:
            futures = []
            for i in range(nproc*2):
                future = executor.submit(hello, i)
                futures.append(future)
            for future in futures:
                future.result()
            print ("Done.")

leads to

$ mpirun -np 3 python  -m scorep --mpp=mpi --thread=pthread future_hello.py
...
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/maxi/.local/lib/python3.10/site-packages/scorep/__main__.py", line 142, in <module>
    scorep_main()
  File "/home/maxi/.local/lib/python3.10/site-packages/scorep/__main__.py", line 119, in scorep_main
    tracer.run(code, globs, globs)
  File "/home/maxi/.local/lib/python3.10/site-packages/scorep/_instrumenters/scorep_instrumenter.py", line 55, in run
    exec(cmd, globals, locals)
  File "future_hello.py", line 19, in <module>
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
_pickle.PicklingError: Can't pickle <function hello at 0x7f5ede003400>: attribute lookup hello on __main__ failed

This only occurs when using Score-P.
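
A possible workaround for this kind of pickling failure (just a sketch, not verified with Score-P): move the worker function out of __main__ into an importable module, e.g. a hypothetical tasks.py, so that pickle resolves it as tasks.hello instead of __main__.hello:

# tasks.py (hypothetical helper module)
from mpi4py import MPI

def hello(x):
    name = MPI.Get_processor_name()
    me = MPI.COMM_WORLD.Get_rank()
    nproc = MPI.COMM_WORLD.Get_size()
    print(me, nproc, name, 'arg', x)

# future_hello.py
from mpi4py import MPI
from mpi4py.futures import MPICommExecutor
from tasks import hello  # resolved as tasks.hello, which pickle can look up

if __name__ == '__main__':
    nproc = MPI.COMM_WORLD.Get_size()
    with MPICommExecutor(MPI.COMM_WORLD, root=0) as executor:
        if executor is not None:
            futures = [executor.submit(hello, i) for i in range(nproc * 2)]
            for future in futures:
                future.result()
            print("Done.")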

NanoNabla commented 1 year ago

This seems to be a different issue. Moreover, the code no longer hangs but raises an exception. I can reproduce the error without an MPI environment.

I will look into this in #157.