adrn / schwimmbad

A common interface to processing pools.
MIT License

MPIPool is not working in python 3 environment #31

Closed bikashd18 closed 4 years ago

bikashd18 commented 4 years ago

I am using exactly the same code as mpi-demo.py and running it in a Python 3 environment with the command: mpiexec -n 2 python3 mpi-demo.py (a rough sketch of the script is shown below), but I get the error in the terminal output that follows:
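For reference, mpi-demo.py follows the standard schwimmbad MPIPool pattern; the sketch below is only an approximation and may differ from the actual demo in small details:

    import sys
    from schwimmbad import MPIPool

    def worker(task):
        # Toy computation on a pair of numbers.
        a, b = task
        return a**2 + b**2

    def main(pool):
        tasks = list(zip(range(16), range(16)))
        results = pool.map(worker, tasks)
        print(results)

    if __name__ == "__main__":
        pool = MPIPool()

        # Worker ranks block here, receive tasks from the master rank, and exit.
        if not pool.is_master():
            pool.wait()
            sys.exit(0)

        main(pool)
        pool.close()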

bikash@b:~$ cd Documents/data_simulation/
bikash@b:~/Documents/data_simulation$ mpiexec -n 2 python3 mpi-demo.py
[b:04791] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_mmap: /usr/lib/openmpi/lib/openmpi/mca_shmem_mmap.so: undefined symbol: opal_show_help (ignored)
[b:04791] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_posix: /usr/lib/openmpi/lib/openmpi/mca_shmem_posix.so: undefined symbol: opal_shmem_base_framework (ignored)
[b:04791] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_sysv: /usr/lib/openmpi/lib/openmpi/mca_shmem_sysv.so: undefined symbol: opal_show_help (ignored)
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_init failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[b:4791] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[37877,1],0]
  Exit code:    1
--------------------------------------------------------------------------
bikash@b:~/Documents/data_simulation$ 

Can anyone please suggest how to resolve this issue? Thanks!

adrn commented 4 years ago

Hi @bikashd18 - a few questions: what operating system and Python version are you using, which versions of OpenMPI and mpi4py do you have installed, and could you also paste the full output of ompi_info?

bikashd18 commented 4 years ago

I am using Ubuntu 16.04.6 LTS (xenial).

The Python version is 3.5.2. I also have Python 2.7.12 installed simultaneously; I hope that is not an issue here.

The version of OpenMPI is 1.10.2.

The version of mpi4py is 1.3.1.

Finally, the full details of OpenMPI (from the command "ompi_info") are given below:

Package: Open MPI buildd@lgw01-57 Distribution
Open MPI: 1.10.2 (repo revision v1.10.1-145-g799148f, release date Jan 21, 2016)
Open RTE: 1.10.2 (repo revision v1.10.1-145-g799148f, release date Jan 21, 2016)
OPAL: 1.10.2 (repo revision v1.10.1-145-g799148f, release date Jan 21, 2016)
MPI API: 3.0.0
Ident string: 1.10.2
Prefix: /usr
Configured architecture: x86_64-pc-linux-gnu
Configured by: buildd on host lgw01-57, Thu Feb 25 16:33:01 UTC 2016
Built by: buildd on host lgw01-57, Thu Feb 25 16:40:59 UTC 2016
C bindings: yes; C++ bindings: yes
Fort mpif.h: yes (all); Fort use mpi: yes (full: ignore TKR); Fort use mpi size: deprecated-ompi-info-value; Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to limitations in the gfortran compiler, does not support the following: array subsections, direct passthru (where possible) to underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: gcc 5.3.1 (GNU), /usr/bin/gcc
C++ compiler: g++, /usr/bin/g++
Fort compiler: gfortran, /usr/bin/gfortran
Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
Fortran features (all yes): 08 assumed shape, optional args, INTERFACE, ISO_FORTRAN_ENV, STORAGE_SIZE, BIND(C) (all), ISO_C_BINDING, SUBROUTINE BIND(C), TYPE,BIND(C), T,BIND(C,name="a"), PRIVATE, PROTECTED, ABSTRACT, ASYNCHRONOUS, PROCEDURE, USE...ONLY, C_FUNLOC, f08 using wrappers, MPI_SIZEOF
Profiling (all yes): C, C++, Fort mpif.h, Fort use mpi, Fort use mpi_f08
C++ exceptions: no
Thread support: posix (MPI_THREAD_MULTIPLE: no, OPAL support: yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: runtime
Memory profiling support: no; Memory debugging support: no
dl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
MPI I/O support: yes
MPI_WTIME support: gettimeofday
Symbol vis. support: yes
Host topology support: yes
MPI extensions:
FT Checkpoint support: no (checkpoint thread: no)
C/R Enabled Debugging: no
VampirTrace support: no
MPI_MAX_PROCESSOR_NAME: 256; MPI_MAX_ERROR_STRING: 256; MPI_MAX_OBJECT_NAME: 64; MPI_MAX_INFO_KEY: 36; MPI_MAX_INFO_VAL: 256; MPI_MAX_PORT_NAME: 1024; MPI_MAX_DATAREP_STRING: 128

MCA components (all MCA v2.0.0, Component v1.10.2; framework API version in parentheses):
  backtrace (v2.0.0): execinfo
  compress (v2.0.0): bzip, gzip
  crs (v2.0.0): none
  db (v1.0.0): hash, print
  dl (v1.0.0): dlopen
  event (v2.0.0): libevent2021
  hwloc (v2.0.0): external
  if (v2.0.0): posix_ipv4, linux_ipv6
  installdirs (v2.0.0): env, config
  memory (v2.0.0): linux
  pstat (v2.0.0): linux
  sec (v1.0.0): basic
  shmem (v2.0.0): mmap, posix, sysv
  timer (v2.0.0): linux
  dfs (v1.0.0): orted, test, app
  errmgr (v3.0.0): default_hnp, default_app, default_orted, default_tool
  ess (v3.0.0): tool, slurm, hnp, env, singleton
  filem (v2.0.0): raw
  grpcomm (v2.0.0): bad
  iof (v2.0.0): mr_orted, mr_hnp, orted, hnp, tool
  odls (v2.0.0): default
  oob (v2.0.0): tcp
  plm (v2.0.0): isolated, slurm, rsh
  ras (v2.0.0): simulator, slurm, loadleveler, gridengine
  rmaps (v2.0.0): round_robin, seq, mindist, staged, ppr, rank_file, resilient
  rml (v2.0.0): oob
  routed (v2.0.0): radix, debruijn, direct, binomial
  state (v1.0.0): dvm, hnp, staged_hnp, staged_orted, tool, novm, app, orted
  allocator (v2.0.0): basic, bucket
  bcol (v2.0.0): basesmuma, ptpcoll
  bml (v2.0.0): r2
  btl (v2.0.0): self, vader, openib, sm, tcp
  coll (v2.0.0): sm, self, libnbc, ml, basic, inter, tuned, hierarch
  dpm (v2.0.0): orte
  fbtl (v2.0.0): posix
  fcoll (v2.0.0): two_phase, static, individual, ylib, dynamic
  fs (v2.0.0): ufs
  io (v2.0.0): ompio, romio
  mpool (v2.0.0): grdma, sm
  osc (v3.0.0): sm, pt2pt
  pml (v2.0.0): v, cm, bfo, ob1
  pubsub (v2.0.0): orte
  rcache (v2.0.0): vma
  rte (v2.0.0): orte
  sbgp (v2.0.0): basesmuma, p2p, basesmsocket
  sharedfp (v2.0.0): individual, lockedfile, sm
  topo (v2.1.0): basic
  vprotocol (v2.0.0): pessimist

Thanks!

adrn commented 4 years ago

Not sure exactly what the issue is, but could you look at the first answer here and see if this could be the problem? https://stackoverflow.com/questions/36156822/error-when-starting-open-mpi-in-mpi-init-via-python
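If it is the same problem, the telltale sign is exactly these "undefined symbol: opal_..." messages from the MCA plugins, which show up when the Open MPI core library gets loaded by Python without global symbol visibility. A commonly suggested workaround (just a sketch, untested here, and the library name libmpi.so may differ on your system) is to load it with RTLD_GLOBAL before importing mpi4py:

    import ctypes

    # Load the Open MPI core library with its symbols made globally visible,
    # so the MCA plugin .so files can resolve symbols such as opal_show_help.
    ctypes.CDLL("libmpi.so", mode=ctypes.RTLD_GLOBAL)

    from mpi4py import MPI  # import only after the CDLL call above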

bikashd18 commented 4 years ago

Thanks, Adrian! I shall look into this.

bikashd18 commented 4 years ago

Hi Adrian, that's excellent. Your suggestion above solved the problem. Thanks!

adrn commented 4 years ago

Great!