SmileiPIC / Smilei

Particle-in-cell code for plasma simulation
https://smileipic.github.io/Smilei

Compilation with noopenmp and without Intel compiler #237

Closed egelfer closed 4 years ago

egelfer commented 4 years ago

Dear Smilei team, I'm trying to compile Smilei on our cluster (ELI Beamlines). The cluster does not support several threads per process. Previously I compiled with the command "make config=noopenmp", and it worked at least up to version 4.2. I also set export OMP_NUM_THREADS=1 when launching a simulation.

However, with the new version, SMILEI 4.4, "make config=noopenmp" leads to an error, since we don't have the Intel compiler. I also tried "make config=no_mpi_tm": the compilation finished without errors, but simulations then crash if I try to use more than one core. Is it possible to compile with the noopenmp option and the gcc compiler?

mccoys commented 4 years ago

This should work. Could you please specify the errors?

jderouillat commented 4 years ago

There is indeed an Intel MPI-specific option in the noopenmp mode. Could you remove lines 149 and 150:

else
    LDFLAGS += -mt_mpi # intelmpi only

egelfer commented 4 years ago

Dear mccoys and jderouillat, thank you very much for the replies! In fact, commenting out the lines "else" and "LDFLAGS += -mt_mpi # intelmpi only" worked for version 4.3. But now the simulation crashes after such a compilation, with the following error:

 Pre-processing LaserOffset
 --------------------------------------------------------------------------------
         LaserOffset #0
Stack trace (most recent call last):
#6    Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#5    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x4558f0, in
#4    Object "/usr/lib64/libc.so.6", at 0x7f181c208af4, in __libc_start_main
#3    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x45380f, in main
#2    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x5c7337, in Params::Params(SmileiMPI*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)
#1    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x4f25ba, in LaserPropagator::operator()(std::vector<_object*, std::allocator<_object*> >, std::vector<int, std::allocator<int> >, double, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, double)
#0 Stack trace (most recent call last):
   Object "#6    Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#5    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x4558f0, in
#4    Object "/usr/lib64/libc.so.6", at 0x7fc7511d7af4, in __libc_start_main
#3    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x45380f, in main
#2    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x5c7337, in Params::Params(SmileiMPI*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)
#1    Object "/gpfs/home/egelfer/smilei-v4.4/smilei", at 0x4f25ba, in LaserPropagator::operator()(std::vector<_object*, std::allocator<_object*> >, std::vector<int, std::allocator<int> >, double, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, double)
#0    Object "/gpfs/apps/tools/modulefiles/software/Python/2.7.11-GCC-7.2.0/lib/python2.7/site-packages/numpy/core/multiarray.so", at 0x7fc74e452ccd, in
Unknown signal 1317074848
watf? exit
/gpfs/apps/tools/modulefiles/software/Python/2.7.11-GCC-7.2.0/lib/python2.7/site-packages/numpy/core/multiarray.so", at 0x7f181959cccd, in
Unknown signal 429234080
watf? exit
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[1846,1],9]
  Exit code:    1

The input file of the simulation is attached: namelist.txt. It works without errors with SMILEI 4.3 (compiled with lines 149 and 150 removed).

mccoys commented 4 years ago

There is another issue on this LaserOffset topic. Let me investigate.

mccoys commented 4 years ago

@jderouillat concerning the installation issue, is there something to change in the makefile?

jderouillat commented 4 years ago

To me, it seems that there is no absolute solution. This is a problem with an Intel environment (at least with some versions) which requires a dedicated option, but the same kind of problem can occur with other environments.
We can choose to manage the options in the makefile; at one point we parsed the output of mpirun --version to do so. But we will never be exhaustive.
We can choose to just remove the option; in most cases it will work.
Or we can assume that this Intel environment is our main development environment on supercomputers and leave it in.
In all cases, we can document it on the install page.
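
For illustration, the first option (choosing the flag from the output of mpirun --version) could look roughly like the sketch below. It is not taken from the Smilei makefile: the function names are hypothetical, and it assumes that Intel MPI identifies itself with the word "Intel" in its version banner while other implementations (for example Open MPI) do not. As noted above, such a check cannot be exhaustive.

```python
import subprocess
from typing import List


def is_intel_mpi(version_text: str) -> bool:
    """Guess whether `mpirun --version` output comes from Intel MPI.

    Assumption: the Intel MPI banner contains the word "Intel".
    """
    return "intel" in version_text.lower()


def extra_ldflags(version_text: str) -> List[str]:
    """Return the extra linker flags for the detected MPI environment.

    -mt_mpi is the Intel MPI-only flag from the two makefile lines
    discussed in this thread; every other environment gets nothing.
    """
    return ["-mt_mpi"] if is_intel_mpi(version_text) else []


def query_mpirun_version(mpirun: str = "mpirun") -> str:
    """Run `mpirun --version`; return "" if mpirun cannot be started."""
    try:
        result = subprocess.run(
            [mpirun, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return ""
    # Some launchers print the banner on stderr, so keep both streams.
    return result.stdout + result.stderr


if __name__ == "__main__":
    # Print the flags so that a makefile could capture them with $(shell ...).
    print(" ".join(extra_ldflags(query_mpirun_version())))
```

With gcc and a non-Intel MPI the script prints an empty line, so nothing is appended to LDFLAGS.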

mccoys commented 4 years ago

The two lines have been removed in the latest push.

The segfault has been fixed.