headmyshoulder / odeint-v2

odeint - solving ordinary differential equations in c++ v2
http://headmyshoulder.github.com/odeint-v2/

MPI example fails for more than one process (if not bjamed) #152

Closed pranavcode closed 9 years ago

pranavcode commented 9 years ago

Not sure if this is the correct place to report this (apologies, and please direct me to the correct location if not!).

I am trying to run the MPI (phase chain) example, odeint-v2/examples/mpi/phase_chain.cpp. With two or more processes it crashes with a segmentation fault. As far as I understand, that usually points to an invalid memory access, and I went through the code, but the problem is not apparent.

The code is exactly as given in odeint-v2/examples/mpi/phase_chain.cpp; the failing mpirun execution with two or more processes is shown in the linked error log.

My setup:

Ubuntu 14.04 and Boost version 1.56.0

$ mpicc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.2-19ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) 

$ mpirun --version
mpirun (Open MPI) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/

$ uname -mrso
Linux 3.13.0-45-generic x86_64 GNU/Linux

$ lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 55
Stepping:              8
CPU MHz:               1494.000
BogoMIPS:              4333.50
Virtualization:        VT-x
L1d cache:             24K
L1i cache:             32K
L2 cache:              1024K
NUMA node0 CPU(s):     0,1

With all the Boost libraries installed under /usr/lib/x86_64-linux-gnu/libboost_*, the compilation is as follows:

mpic++ phase_chain.cpp -o phase_chain -lboost_mpi -lboost_serialization -lboost_system -lboost_timer

Execution with one process succeeds:

$ mpirun -n 1 ./odeint_mpi
28.6798s

But with two or more processes it throws a segmentation fault.
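
As a sanity check (a sketch on my side, not part of the example), a minimal Boost.MPI program compiled with the same command line should tell whether the crash comes from the odeint code or already from the Boost.MPI / Open MPI setup:

#include <iostream>
#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>

int main( int argc , char **argv )
{
    // initialize MPI and let every rank report itself; if this already
    // crashes with two processes, the problem is in the MPI setup, not in odeint
    boost::mpi::environment env( argc , argv );
    boost::mpi::communicator world;
    std::cout << "rank " << world.rank() << " of " << world.size() << std::endl;
    return 0;
}

$ mpic++ check_mpi.cpp -o check_mpi -lboost_mpi -lboost_serialization
$ mpirun -n 2 ./check_mpi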

Am I doing something wrong here? Is this program failing for anyone else, or is it just me? Am I expected to change the program for my environment before compiling and running it?

Really appreciate the help.

mariomulansky commented 9 years ago

Thanks for reporting this problem. This code is supposed to run without any modifications. Unfortunately, the parallel implementations are not tested as thoroughly as the rest of the library. I will try to reproduce the problem and see if I can find a solution.

mariomulansky commented 9 years ago

I tried on my machine with gcc 4.8.2, intel 15.0.0 and clang 3.4 and I could not reproduce your problem.

My only suggestion for now is to build the example using bjam; maybe that gives the correct result...
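
For reference, a rough sketch of such a bjam build, assuming a working Boost.Build setup and the Jamfile shipped in the example directory (path and toolset are assumptions):

$ cd odeint-v2/examples/mpi
$ bjam toolset=gcc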

pranavcode commented 9 years ago

@mariomulansky, building the example with bjam helped. Thanks.

Yet I am not sure what is missing from the command-line compilation with mpic++ above. My build configuration is similar to the Jamfile, and still execution fails for more than one process.

Also, I tried two other boxes: clang 3.5 (on OS X 10.10.2 and Ubuntu 14.04) and gcc 4.8.2 (on Ubuntu 14.04). Boost was installed with bjam in both cases, and I compiled with mpic++. The example ran successfully with clang but failed with g++, while building with bjam worked on both boxes.

I will spend some more time trying to reach a justifiable conclusion. Meanwhile, any hints in this regard would be appreciated. Thanks!

I have updated the gist with results from a Xeon E5-2680 box (gcc 4.8.2 and clang 3.5, Ubuntu 14.04).

mariomulansky commented 9 years ago

If you compile with bjam -d2 you can see the exact compilation commands used by bjam. Maybe that helps to create an equivalent makefile?

pranavcode commented 9 years ago

@mariomulansky, I will try doing that. Thanks.

Any thoughts on the failure with more than one process when compiling with mpic++? What was the environment on which you could not reproduce my error?

mariomulansky commented 9 years ago

I'm not sure what exactly you mean by "more than one process execution issue" now. Does everything work if you compile with bjam, or are you still having problems even there?

My test system is Kubuntu 14.04 with an Intel Core i5-3210M.

pranavcode commented 9 years ago

@mariomulansky, thanks for your patience and keeping this issue alive.

Compilation with bjam is as smooth as butter. bjam -d2 does a wonderful job of showing verbose output of everything it is doing in the background. I am trying to replicate it with make, and that will take a little more time than I presumed. Until then it would be great if we can keep this issue open.

My bad on the "more than one process execution issue" wording. What I meant is: mpic++ compiles the code, but the resulting binary fails only when more than one process is involved. Here is what I think (and I might be wrong): the build script that compiles the example passes some compiler flags that are missing from a plain mpic++ compilation, and so the binary built with plain mpic++ fails when run with more than one process. I am not sure how to validate this (yet). Any thoughts?

/cc @headmyshoulder @neapel Apart from the issue above, the MPI example (like the OpenMP and CUDA examples) only reports the result after integrating through all the steps and does not demonstrate how to observe intermediate states. How would one implement an observer efficiently to report all the intermediate states?

mariomulansky commented 9 years ago

It might be that the problem arises because some of the MPI libraries you link against were compiled with specific flags, while your code is not if you simply use mpicc. With bjam, however, all binaries are compiled with compatible flags and no problem occurs.
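
One way to check this (a suggestion, assuming the Open MPI compiler wrappers reported above are in use) is to let the wrapper print the underlying compile and link command and compare it against the commands shown by bjam -d2:

$ mpic++ --showme phase_chain.cpp -o phase_chain -lboost_mpi -lboost_serialization -lboost_system -lboost_timer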

pranavcode commented 9 years ago

@mariomulansky, Thanks. One has to take these compiler flags into consideration.

pranavcode commented 9 years ago

Hi,

The MPI example (also the OpenMP and CUDA examples) only reports the result after integrating through all the steps and does not demonstrate observation of intermediate states. The code delivers little value if intermediate states are not observed or recorded.

How would one go about implementing an observer efficiently to report the intermediate states?
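
For reference, a minimal sketch of how an odeint observer reports intermediate states, using a plain std::vector<double> state and a made-up right-hand side; gathering a distributed MPI/OpenMP/CUDA state inside the observer is left out:

#include <iostream>
#include <vector>
#include <boost/numeric/odeint.hpp>

using namespace boost::numeric::odeint;

typedef std::vector<double> state_type;

// hypothetical right-hand side: dx_i/dt = -x_i
void rhs( const state_type &x , state_type &dxdt , double /*t*/ )
{
    for( std::size_t i = 0 ; i < x.size() ; ++i )
        dxdt[i] = -x[i];
}

// observer: odeint calls this after every step with the current state and time
struct print_observer
{
    void operator()( const state_type &x , double t ) const
    {
        std::cout << t << '\t' << x[0] << '\n';
    }
};

int main()
{
    state_type x( 8 , 1.0 );
    runge_kutta4< state_type > stepper;
    // the last argument is the observer that records the intermediate states
    integrate_const( stepper , rhs , x , 0.0 , 1.0 , 0.01 , print_observer() );
    return 0;
}

In the distributed case the observer would additionally have to gather the local pieces of the state (e.g. with Boost.MPI) before writing them out on rank 0.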

pranavcode commented 9 years ago

Closing this issue as resolved to my satisfaction; I will open a new issue with the observer-specific question so that it gets addressed separately. Thanks.
