JohannesBuchner / PyMultiNest

Pythonic Bayesian inference and visualization for the MultiNest Nested Sampling Algorithm and PyCuba's cubature algorithms.
http://johannesbuchner.github.io/PyMultiNest/

Problems enabling MPI : (Abort trap: 6) #45

Open tfish13 opened 9 years ago

tfish13 commented 9 years ago

Hello!

PyMultiNest is running fine on my Mac laptop. However, when I try enabling MPI, I run into problems. Running the pymultinest_demo_minimal.py script results in the following:

bash-3.2$ mpirun -np 2 python pymultinest_demo_minimal.py

...

Acceptance Rate: 0.722698 Replacements: 1350 Total Samples: 1868 Nested Sampling ln(Z): 148.573654 Importance Nested Sampling ln(Z): 235.196277 +/- 0.411084

python(44441,0x7fff73f6a310) malloc: *** error for object 0x10104ea08: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug

Acceptance Rate: 0.716113 Replacements: 1400 Total Samples: 1955 Nested Sampling ln(Z): 155.709924 Importance Nested Sampling ln(Z): 235.109719 +/- 0.393148

...

Acceptance Rate: 0.675898 Replacements: 2050 Total Samples: 3033 Nested Sampling ln(Z): 214.719858 Importance Nested Sampling ln(Z): 235.306598 +/- 0.175543

mpirun noticed that process rank 0 with PID 44441 on node my-computer exited on signal 6 (Abort trap: 6).

I've checked the previous MPI issues here to see if others have experienced this, but found nothing similar. Additionally, the original MultiNest examples (eggboxC, etc.) run successfully under mpirun.

JohannesBuchner commented 9 years ago

Hi @tfish13,

Thank you for reporting the issue. It seems to me that the problem occurs at the moment MultiNest exits. Perhaps MPI is being finalized twice?

JohannesBuchner commented 9 years ago

Dear tfish, I was wondering if you still have this problem. I have now released an update to PyMultiNest, which includes some MPI-related fixes (setting init_MPI to False by default). If you can, please upgrade and test it. Thank you, Johannes
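For anyone hitting this later, here is a minimal sketch of a run with the new default spelled out explicitly. The toy prior/likelihood follow pymultinest_demo_minimal.py; init_MPI=False tells MultiNest not to initialize MPI itself, leaving MPI setup and teardown to mpirun/mpi4py, which should avoid the double initialization/finalization suspected above:

```python
import math
import os

import pymultinest

def prior(cube, ndim, nparams):
    # map the unit hypercube onto the parameter range, in place
    for i in range(ndim):
        cube[i] = cube[i] * 10 * math.pi

def loglike(cube, ndim, nparams):
    # toy multimodal likelihood, as in the demo
    chi = 1.0
    for i in range(ndim):
        chi *= math.cos(cube[i] / 2.0)
    return (2.0 + chi) ** 5

if not os.path.exists("chains"):
    os.makedirs("chains")

# init_MPI=False: do not let MultiNest call MPI_Init itself
pymultinest.run(loglike, prior, 2,
                outputfiles_basename="chains/demo-",
                resume=False, verbose=True, init_MPI=False)
```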

avadhesh3824 commented 8 years ago

Dear Sir, I am facing a problem running an MPI program on my Mac laptop. When I run the program with mpirun -np 2, I get output on screen, but it runs on only one processor. I am not able to run the program on all cores/processors. Please suggest how to resolve this issue. Thanks

JohannesBuchner commented 6 years ago

Please reopen if the problems still exist. You need to install mpi4py to use MPI functionality.
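A quick way to verify that is a hypothetical check_mpi.py like the one below; if every process reports "rank 0 of 1", mpi4py is missing or was built against a different MPI than your mpirun (which also matches the "only one processor" symptom above):

```python
# check_mpi.py -- run with: mpirun -np 2 python check_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
```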

exowanderer commented 6 years ago

Dear @JohannesBuchner, I was previously able to run MultiNest on my Mac for many months. Then I stepped away from the project and came back to re-install it on three new systems.

Now, OpenMPI (mpiexec -n 4) works great on the normal MultiNest installation examples. But it crashes when running pymultinest_demo_minimal.py -- and for my own code that uses pymultinest.

Could we please re-open this issue? Or do you know of a workaround for this?

Thank you,

Jonathan

exowanderer commented 6 years ago

Update: I previously installed pymultinest with pip install pymultinest, which crashed often (same error as above); but now I have installed it from the GitHub repo (here) instead.

All small tests ran successfully; and I am currently running a long integration that has not yet crashed.

It may be a difference between the version on PyPI and this GitHub repo.

I hope that helps anyone else who may be having these problems.

JohannesBuchner commented 6 years ago

Did you use version 2.6? I don't recall changing anything recently w.r.t. the multinest calls, but maybe it doesn't like wheel packages. What happens if you run with "mpiexec -np 1"?

exowanderer commented 6 years ago

I installed:

(1) OpenMPI version 2.0.4 (manually, via the command line)
(2) MultiNest version 3.10 (via GitHub)
(3) PyMultiNest version 2.6 (via GitHub)

That combination seemed to work best. If I installed PyMultiNest via pip, or OpenMPI via brew, then I got the above error:

mpirun noticed that process rank 0 with PID 44441 on node my-computer exited on signal 6 (Abort trap: 6).

kevinea42 commented 6 years ago

I have the same issue after installing OpenMPI, MultiNest, and PyMultiNest as per the instructions at http://astrobetter.com/wiki/MultiNest+Installation+Notes.

exowanderer commented 6 years ago

I was able to get everything to work together, but I had to combine two sets of instructions. Here is the combination of versions that was successful for me:

Installing MPI with Fortran support

Download the OpenMPI source from https://www.open-mpi.org/software/ompi/v2.0/. I then followed the instructions in FAQ #7 (https://www.open-mpi.org/faq/?category=osx).
I'm installing into $HOME/openmpi (you don't want to replace your system install in /usr/).
So, from inside the source directory, I did:

./configure --prefix=$HOME/openmpi LD_LIBRARY_PATH=$DYLD_FALLBACK_LIBRARY_PATH 2>&1 | tee config.out

make -j 4 2>&1 | tee make.out

make install 2>&1 | tee install.out

export PATH=$HOME/openmpi/bin:$PATH 

echo "export PATH=$HOME/openmpi/bin:$PATH" >> ~/.profile 

## OR, if you use 'bash_profile'

echo "export PATH=$HOME/openmpi/bin:$PATH" >> ~/.bash_profile

If you get an error telling you that configure was "unable to run and compile a simple Fortran program", check the configure output to confirm it is using the Fortran compiler you expect, and that the associated libgfortran.3.dylib is in your DYLD_FALLBACK_LIBRARY_PATH.
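A hypothetical one-off check for that last condition (the filename libgfortran.3.dylib matches the error above; adjust it if your gfortran ships a different version):

```python
# verify that libgfortran.3.dylib is visible on DYLD_FALLBACK_LIBRARY_PATH
import os

dirs = os.environ.get("DYLD_FALLBACK_LIBRARY_PATH", "").split(":")
hits = [d for d in dirs
        if d and os.path.exists(os.path.join(d, "libgfortran.3.dylib"))]
if hits:
    print("found libgfortran.3.dylib in:", ", ".join(hits))
else:
    print("libgfortran.3.dylib not found on DYLD_FALLBACK_LIBRARY_PATH")
```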

Check that you successfully built the OpenMPI Fortran compiler with

bash$ mpif90 -v

If there is no error at the end of the output you should be fine.


I think that "brew install openmpi" and "port install openmpi" are having trouble with MultiNest; (and I am guessing here) a manual, local OpenMPI install may communicate better with MultiNest.

Everything else I followed from the AstroBetter website [http://astrobetter.com/wiki/MultiNest+Installation+Notes]

The following are copied directly from the AstroBetter website

pip install mpi4py

git clone https://github.com/JohannesBuchner/MultiNest.git
cd MultiNest/build/
cmake ..
make
sudo make install
git clone https://github.com/JohannesBuchner/PyMultiNest.git 
cd PyMultiNest 
python setup.py install
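After those steps, a quick smoke test: as far as I can tell, importing pymultinest already tries to load the MultiNest shared library, so a clean import (before any mpirun) means the library was found:

```python
# post-install smoke test: a failing import usually means libmultinest
# (.so/.dylib) is not on your (DY)LD_LIBRARY_PATH
import pymultinest
print("pymultinest imported; MultiNest library found")
```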
JohannesBuchner commented 6 years ago

I would be glad if someone could help improve the documentation for macOS. Since I don't own a Mac, it is difficult for me to debug these issues.

Improving the documentation here would be great: https://github.com/JohannesBuchner/PyMultiNest/blob/master/doc/install.rst as well as updating AstroBetter. (btw, pymultinest can also just be installed with pip)

I suspect there are several configurations and OS versions that need to be considered?

exowanderer commented 6 years ago

Indeed, my versions are:

Python 3.6.4 (Anaconda3-5.1.0)
macOS 10.13.3 ('High Sierra'), x86_64

exowanderer commented 6 years ago

Moreover, I did try to install with pip install pymultinest, but I think I remember that it did not work. That is a very loose recollection, though.

bhaskar-astro commented 2 years ago

I am running a parameter-estimation (PE) job using pymultinest on a Slurm cluster. For some reason the MPI job is getting killed at the end, after completing all the calculations! This is the error I am getting:

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 11837 RUNNING AT c04n05
=   EXIT CODE: 9
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================

YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

Could you please help me out with this?