JohannesBuchner / PyMultiNest

Pythonic Bayesian inference and visualization for the MultiNest Nested Sampling Algorithm and PyCuba's cubature algorithms.
http://johannesbuchner.github.io/PyMultiNest/

Problems enabling MPI : (Abort trap: 6) #45

Open tfish13 opened 10 years ago

tfish13 commented 10 years ago

Hello!

PyMultiNest runs fine on my Mac laptop. However, when I enable MPI, I run into problems. Running the pymultinest_demo_minimal.py script results in the following:

bash-3.2$ mpirun -np 2 python pymultinest_demo_minimal.py

...

Acceptance Rate: 0.722698 Replacements: 1350 Total Samples: 1868 Nested Sampling ln(Z): 148.573654 Importance Nested Sampling ln(Z): 235.196277 +/- 0.411084

python(44441,0x7fff73f6a310) malloc: *** error for object 0x10104ea08: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug

Acceptance Rate: 0.716113 Replacements: 1400 Total Samples: 1955 Nested Sampling ln(Z): 155.709924 Importance Nested Sampling ln(Z): 235.109719 +/- 0.393148

...

Acceptance Rate: 0.675898 Replacements: 2050 Total Samples: 3033 Nested Sampling ln(Z): 214.719858 Importance Nested Sampling ln(Z): 235.306598 +/- 0.175543

mpirun noticed that process rank 0 with PID 44441 on node my-computer exited on signal 6 (Abort trap: 6).

I've checked previous MPI issues here to see if others have experienced this, but found nothing similar. Additionally, the original MultiNest examples (eggboxC, etc.) run successfully under mpirun.

JohannesBuchner commented 10 years ago

Hi @tfish13,

thank you for reporting the issue. It seems to me that the problem occurs at the moment MultiNest exits. Perhaps MPI is terminated twice?

JohannesBuchner commented 9 years ago

Dear tfish, I was wondering if you still have this problem. I have now released an update to PyMultiNest, which includes some MPI-related fixes (setting init_MPI to False by default). If you can, please upgrade and test it. Thank you, Johannes

avadhesh3824 commented 9 years ago

Dear Sir, I am having a problem running an MPI program on a Mac laptop. When I run the program with mpirun -np 2, I get the output on screen, but it runs on only one processor. I am not able to run the program on all cores or processors. Please suggest how to resolve this issue. Thanks

JohannesBuchner commented 7 years ago

Please reopen if the problems still exist. You need to install mpi4py to use MPI functionality.
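A minimal sketch of how one might check this, assuming mpi4py is the only extra dependency: each process reports its rank and the communicator size. The helper name mpi_status and the file name check_mpi.py are hypothetical, not part of PyMultiNest.

```python
def mpi_status():
    """Return (rank, size) for this process, or None if mpi4py is unavailable."""
    try:
        # Importing mpi4py.MPI initializes MPI by default.
        from mpi4py import MPI
    except ImportError:
        return None
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


if __name__ == "__main__":
    status = mpi_status()
    if status is None:
        print("mpi4py could not be imported")
    else:
        rank, size = status
        print(f"rank {rank} of {size}")
```

Run it as mpirun -np 2 python check_mpi.py. Two lines that report a size of 2 suggest that mpirun and mpi4py agree. Two lines that both report a size of 1 usually mean that mpi4py was built against a different MPI installation than the mpirun being used.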

exowanderer commented 6 years ago

Dear @JohannesBuchner, I was previously able to run MultiNest on my Mac for many months. Then I stepped away from the project and came back to re-install it on three new systems.

Now, openmpi (mpiexec -n 4) works great on the normal MultiNest installation examples. But it crashes when running pymultinest_demo_minimal.py, and also with my own code that uses pymultinest.

Could we please re-open this issue? Or do you know of a workaround for this?

Thank you,

Jonathan

exowanderer commented 6 years ago

Update: I previously installed pymultinest with pip install pymultinest, which crashed often (same error as above); I have now installed it from this GitHub repo instead.

All small tests ran successfully; and I am currently running a long integration that has not yet crashed.

It may be a difference between the version on PyPI and the one in this GitHub repo.

I hope that helps anyone else who may be having these problems.

JohannesBuchner commented 6 years ago

Did you use version 2.6? I don't recall changing anything recently w.r.t. the multinest calls, but maybe it doesn't like wheel packages. What happens if you run with "mpiexec -np 1"?
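To compare a pip install with a GitHub checkout, it can help to see which version and which files Python actually picks up. A minimal sketch using only the standard library (Python 3.8 or later); the helper name describe_install is hypothetical, and it deliberately avoids importing pymultinest itself.

```python
from importlib import metadata, util


def describe_install(name):
    """Return (version, location) for a package, with None for unknown parts."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec locates a top-level package without executing it.
    spec = util.find_spec(name)
    location = spec.origin if spec is not None else None
    return version, location


for package in ("pymultinest", "mpi4py"):
    version, location = describe_install(package)
    print(f"{package}: version={version}, location={location}")
```

Posting this output together with the crash report makes it easier to tell which installation route was actually in use.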

exowanderer commented 6 years ago

I installed:

(1) openmpi version 2.0.4 (manually; via command line)
(2) MultiNest version 3.10 (via github)
(3) PyMultiNest version 2.6 (via github)

That seemed to work the best. If I install PyMultiNest via pip or openmpi via brew, then I get the error above:

mpirun noticed that process rank 0 with PID 44441 on node my-computer exited on signal 6 (Abort trap: 6).

kevinea42 commented 6 years ago

I have the same issue after installing openmpi, MultiNest and PyMultiNest as per the instructions on [http://astrobetter.com/wiki/MultiNest+Installation+Notes].

exowanderer commented 6 years ago

I was able to get everything to work together, but I had to mix 2 streams of instruction sets. Here is how my versioning was successful:

Installing MPI with Fortran Support

Now, download the OpenMPI source from https://www.open-mpi.org/software/ompi/v2.0/. I then followed the instructions of FAQ #7 (https://www.open-mpi.org/faq/?category=osx).
I'm installing into $HOME/openmpi (you don't want to replace your system install in /usr/)
So from inside the source directory I did:

./configure --prefix=$HOME/openmpi LD_LIBRARY_PATH=$DYLD_FALLBACK_LIBRARY_PATH 2>&1 | tee config.out

make -j 4 2>&1 | tee make.out

make install 2>&1 | tee install.out

export PATH=$HOME/openmpi/bin:$PATH 

echo 'export PATH=$HOME/openmpi/bin:$PATH' >> ~/.profile

## OR, if you use 'bash_profile'

echo 'export PATH=$HOME/openmpi/bin:$PATH' >> ~/.bash_profile

If you get an error saying that configure was "unable to run and compile a simple Fortran program", check the output to confirm that it is using the Fortran compiler you expect and that the associated libgfortran.3.dylib is in your DYLD_FALLBACK_LIBRARY_PATH.
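The library check can be sketched in a few lines of Python, assuming DYLD_FALLBACK_LIBRARY_PATH is a colon-separated list of directories. The helper name find_in_search_path is hypothetical; if the variable is unset, the sketch simply prints None.

```python
import os


def find_in_search_path(filename, search_path):
    """Return the first existing path to filename in search_path, or None."""
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


search_path = os.environ.get("DYLD_FALLBACK_LIBRARY_PATH", "")
print(find_in_search_path("libgfortran.3.dylib", search_path))
```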

Check that you successfully built the OpenMPI Fortran compiler with

bash$ mpif90 -v

If there is no error at the end of the output you should be fine.
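Since a Homebrew or MacPorts OpenMPI may still be installed alongside the manual build, it is also worth confirming which executables are found first on PATH. A minimal sketch; the helper name locate_tools is hypothetical, and the expected location ($HOME/openmpi/bin) is the install prefix chosen above.

```python
import shutil


def locate_tools(names):
    """Map each tool name to the path found on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


for tool, path in locate_tools(["mpirun", "mpiexec", "mpif90"]).items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

If any of these resolve to a different directory than the others, the shell is probably mixing two MPI installations.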


I think that "brew install openmpi" and "port install openmpi" have trouble with MultiNest, whereas (and I am guessing) a manual, local install of openmpi may work better with it.

Everything else I followed from the AstroBetter website [http://astrobetter.com/wiki/MultiNest+Installation+Notes]

The following are copied directly from the AstroBetter website

pip install mpi4py

git clone https://github.com/JohannesBuchner/MultiNest.git
cd MultiNest/build/
cmake ..
make
sudo make install
git clone https://github.com/JohannesBuchner/PyMultiNest.git 
cd PyMultiNest 
python setup.py install

JohannesBuchner commented 6 years ago

I would be glad if someone could help improve the documentation for macOS. Since I don't own a Mac, it is difficult for me to debug issues.

Improving the documentation here would be great: https://github.com/JohannesBuchner/PyMultiNest/blob/master/doc/install.rst as well as updating astrobetter. (btw, pymultinest can just be installed with pip too)

I suspect there are several configurations and OS versions that need to be considered?

exowanderer commented 6 years ago

Indeed, my versions are:

Python 3.6.4
Anaconda3-5.1.0
MacOSX x86_64 version 10.13.3 ('High Sierra')

exowanderer commented 6 years ago

Moreover, I did try to install with pip install pymultinest, but I think I remember that it did not work. That is a very loose recollection.

bhaskar-astro commented 2 years ago

I am running a PE job using pymultinest on a Slurm cluster. For some reason the MPI job gets killed at the end, after doing all the calculations! This is the error I am getting:

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 11837 RUNNING AT c04n05
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

Could you please help me out with this?
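For crashes like this one, Python's standard faulthandler module can at least record which Python call was active when the signal arrived. A minimal sketch, assuming one log file per process is acceptable; the helper name enable_crash_traceback and the crash_<pid>.log naming are hypothetical, and this only helps to locate the crash, it does not fix it.

```python
import faulthandler
import os


def enable_crash_traceback(log_dir="."):
    """Write a Python traceback to a per-process file on fatal signals."""
    path = os.path.join(log_dir, f"crash_{os.getpid()}.log")
    log_file = open(path, "w")
    # The file must stay open for as long as the handler is enabled.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Call enable_crash_traceback() at the top of the script, before pymultinest is used, and keep the returned file object alive for the whole run.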