baidu-research / tensorflow-allreduce

Apache License 2.0

MPI_Init fails with some openMPI variants #4

Open carlos-pinto-coelho-microsoft opened 7 years ago

carlos-pinto-coelho-microsoft commented 7 years ago

Getting the following error:

[phlrr4019:30856] mca: base: component_find: unable to open /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_shmem_sysv: /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_shmem_sysv.so: undefined symbol: opal_show_help (ignored)
[phlrr4019:30856] mca: base: component_find: unable to open /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_shmem_posix: /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_shmem_posix.so: undefined symbol: opal_shmem_base_framework (ignored)
[phlrr4019:30856] mca: base: component_find: unable to open /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_shmem_mmap: /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_shmem_mmap.so: undefined symbol: opal_show_help (ignored)

It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

opal_shmem_base_select failed --> Returned value -1 instead of OPAL_SUCCESS


It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

opal_init failed --> Returned value Error (-1) instead of ORTE_SUCCESS


It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

ompi_mpi_init: ompi_rte_init failed --> Returned "Error" (-1) instead of "Success" (0)

An error occurred in MPI_Init on a NULL communicator
MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, and potentially your MPI job)
[phlrr4019:30856] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!

carlos-pinto-coelho-microsoft commented 7 years ago

I was searching around and found

https://svn.open-mpi.org/trac/ompi/wiki/Linkers

To debug the problem, I reduced it to a simple plugin foo that just calls MPI_Init inside, built with:

/usr/local/mpi/bin/mpic++ -std=c++11 -shared foo.cpp -o foo.so -fPIC -I $TF_INC -O2
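A minimal sketch of what such a plugin could look like (hypothetical; the actual foo.cpp is not shown in this thread):

// foo.cpp: does nothing except initialize MPI when its entry point is called.
#include <mpi.h>

extern "C" void foo() {
  // On the affected OpenMPI builds, this call is where the mca_shmem_* components
  // fail to resolve the opal_* symbols and opal_init/orte_init abort.
  MPI_Init(nullptr, nullptr);
}

Loading foo.so from Python (for example with ctypes) and calling foo() exercises the same MPI_Init-inside-a-plugin path that fails above.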

Presumably this is similar to what mpi4py does but mpi4py has some dlopen “stuff” in https://bitbucket.org/mpi4py/mpi4py/src/eaf4f475857ec2330ef4781289328d1d44068460/src/dynload.c?at=master&fileviewer=file-view-default

Borrowing the following from mpi4py:

static void PyMPI_OPENMPI_dlopen_libmpi(void)
{
  void *handle = 0;
  int mode = RTLD_NOW | RTLD_GLOBAL;
  /* GNU/Linux and others */
#ifdef RTLD_NOLOAD
  mode |= RTLD_NOLOAD;
#endif
  if (!handle) handle = dlopen("libmpi.so.20", mode);
  if (!handle) handle = dlopen("libmpi.so.12", mode);
  if (!handle) handle = dlopen("libmpi.so.1", mode);
  if (!handle) handle = dlopen("libmpi.so.0", mode);
  if (!handle) handle = dlopen("libmpi.so", mode);
}

static int PyMPI_OPENMPI_MPI_Init(int *argc, char ***argv)
{
  PyMPI_OPENMPI_dlopen_libmpi();
  return MPI_Init(argc, argv);
}
#undef  MPI_Init
#define MPI_Init PyMPI_OPENMPI_MPI_Init

static int PyMPI_OPENMPI_MPI_Init_thread(int *argc, char ***argv,
                                         int required, int *provided)
{
  PyMPI_OPENMPI_dlopen_libmpi();
  return MPI_Init_thread(argc, argv, required, provided);
}
#undef  MPI_Init_thread
#define MPI_Init_thread PyMPI_OPENMPI_MPI_Init_thread

This "fixes" the issue for my build, but perhaps it would be useful to make the two libraries coexist a bit better.

qianglan commented 7 years ago

@carlos-pinto-coelho-microsoft hi, I also hit this problem, but I don't fully understand your solution. First, the reason the problem happens is that the MPI_Init function is not called in allreduce-test.py, am I right? So you created a new file, foo.cpp, that does the MPI_Init. After you compile foo.cpp into a library, how do you call foo.so from allreduce-test.cpp?

qianglan commented 7 years ago

It seems the reason is that I installed two versions of MPI and did not update environment variables like LD_LIBRARY_PATH accordingly. It also seems that I need to pass --disable-dlopen during the configure stage of OpenMPI.

carlos-pinto-coelho-microsoft commented 7 years ago

@qianglan MPI_Init is called in the background thread, but it fails for some OpenMPI builds. The code I added above was taken from mpi4py and "fixes" the issue so I don't have to mess around with my OpenMPI build, which I don't even control on our cluster.

The other issue I had with the code is that the way it calls MPI_Init in the background thread prevents this from working with other libraries that also call MPI_Init, such as mpi4py. Locally, I ended up modifying the code to do the MPI_Init explicitly (or not at all, in the case where I was using this and mpi4py in the same program).

plegresl commented 7 years ago

The typical way to write a library that uses MPI is to first call MPI_Initialized(), to check if MPI_Init() has already been called, and skip the MPI_Init() call within that library if it has already been called somewhere else.
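A minimal sketch of that pattern (illustrative only, not the actual tensorflow-allreduce code):

#include <mpi.h>

// Initialize MPI only if nobody else (e.g. mpi4py) has already done so.
static void ensure_mpi_initialized(int *argc, char ***argv) {
  int initialized = 0;
  MPI_Initialized(&initialized);
  if (!initialized) {
    MPI_Init(argc, argv);
  }
}

A library written this way can share a process with mpi4py or anything else that initializes MPI first.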

qianglan commented 7 years ago

OK, so the problem happened because MPI_Init was executed twice. Thanks

qianglan commented 7 years ago

A simple solution is to add the following to the allreduce-test.py file:

import ctypes
ctypes.CDLL("libmpi.so", mode=ctypes.RTLD_GLOBAL)

This preloads libmpi.so with its symbols made globally visible, so the MCA components that Open MPI dlopens later can resolve symbols such as opal_show_help.