juliencelia closed this issue 3 years ago
This environment does not try to load libmpi.so.20 but libmpi.so.40.
Can you post the output of ldd smilei?
/ccc/work/cont003/ra5390/bonvalej/Smilei/Smilei_hub/smileirome: error while loading shared libraries: libhdf5.so.10: cannot open shared object file: No such file or directory
I can't access your directory (ask the hotline to add my login to your project if you want), and even if I could, the result depends on the environment that is set when the command is executed.
Could you answer the question? If the question is not clear, tell me.
Here is what I get for the binary generated by the hotline:
linux-vdso.so.1 => (0x00007ffe715cc000)
libhdf5.so.10 => not found
libpython3.7m.so.1.0 => not found
libm.so.6 => /lib64/libm.so.6 (0x00002b989a4cd000)
libmpi_cxx.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libmpi_cxx.so.40 (0x00002b989a7cf000)
libmpi.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libmpi.so.40 (0x00002b989a9eb000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002b989ad26000)
libiomp5.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libiomp5.so (0x00002b989b02d000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b989b422000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b989b638000)
libc.so.6 => /lib64/libc.so.6 (0x00002b989b854000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b989bc22000)
/lib64/ld-linux-x86-64.so.2 (0x00002b989a2a9000)
libopen-rte.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libopen-rte.so.40 (0x00002b989be26000)
libopen-pal.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libopen-pal.so.40 (0x00002b989c0eb000)
librt.so.1 => /lib64/librt.so.1 (0x00002b989c3b0000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002b989c5b8000)
libz.so.1 => /lib64/libz.so.1 (0x00002b989c7bb000)
libhwloc.so.15 => /ccc/products/hwloc-2.0.4/system/default/lib/libhwloc.so.15 (0x00002b989c9d1000)
libudev.so.1 => /lib64/libudev.so.1 (0x00002b989cc1c000)
libpciaccess.so.0 => /lib64/libpciaccess.so.0 (0x00002b989ce32000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00002b989d03c000)
libevent-2.0.so.5 => /lib64/libevent-2.0.so.5 (0x00002b989d3a6000)
libevent_pthreads-2.0.so.5 => /lib64/libevent_pthreads-2.0.so.5 (0x00002b989d5ee000)
libimf.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libimf.so (0x00002b989d7f1000)
libirng.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libirng.so (0x00002b989de76000)
libcilkrts.so.5 => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libcilkrts.so.5 (0x00002b989e1e1000)
libintlc.so.5 => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libintlc.so.5 (0x00002b989e41e000)
libsvml.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libsvml.so (0x00002b989e690000)
libcap.so.2 => /lib64/libcap.so.2 (0x00002b98a011c000)
libdw.so.1 => /lib64/libdw.so.1 (0x00002b98a0321000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00002b98a0572000)
libattr.so.1 => /lib64/libattr.so.1 (0x00002b98a0798000)
libelf.so.1 => /lib64/libelf.so.1 (0x00002b98a099d000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00002b98a0bb5000)
And for my version:
ldd smileirome
linux-vdso.so.1 => (0x00007ffeb3483000)
libhdf5.so.10 => not found
libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00002b77bbd5e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b77bc12a000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b77bc346000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002b77bc54a000)
libm.so.6 => /lib64/libm.so.6 (0x00002b77bc74d000)
libmpi_cxx.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libmpi_cxx.so.40 (0x00002b77bca4f000)
libmpi.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libmpi.so.40 (0x00002b77bcc6b000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002b77bcfa6000)
libiomp5.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libiomp5.so (0x00002b77bd2ad000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b77bd6a2000)
libc.so.6 => /lib64/libc.so.6 (0x00002b77bd8b8000)
/lib64/ld-linux-x86-64.so.2 (0x00002b77bbb3a000)
libopen-rte.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libopen-rte.so.40 (0x00002b77bdc86000)
libopen-pal.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libopen-pal.so.40 (0x00002b77bdf4b000)
librt.so.1 => /lib64/librt.so.1 (0x00002b77be210000)
libz.so.1 => /lib64/libz.so.1 (0x00002b77be418000)
libhwloc.so.15 => /ccc/products/hwloc-2.0.4/system/default/lib/libhwloc.so.15 (0x00002b77be62e000)
libudev.so.1 => /lib64/libudev.so.1 (0x00002b77be879000)
libpciaccess.so.0 => /lib64/libpciaccess.so.0 (0x00002b77bea8f000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00002b77bec99000)
libevent-2.0.so.5 => /lib64/libevent-2.0.so.5 (0x00002b77bf003000)
libevent_pthreads-2.0.so.5 => /lib64/libevent_pthreads-2.0.so.5 (0x00002b77bf24b000)
libimf.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libimf.so (0x00002b77bf44e000)
libirng.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libirng.so (0x00002b77bfad3000)
libcilkrts.so.5 => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libcilkrts.so.5 (0x00002b77bfe3e000)
libintlc.so.5 => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libintlc.so.5 (0x00002b77c007b000)
libsvml.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libsvml.so (0x00002b77c02ed000)
libcap.so.2 => /lib64/libcap.so.2 (0x00002b77c1d79000)
libdw.so.1 => /lib64/libdw.so.1 (0x00002b77c1f7e000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00002b77c21cf000)
libattr.so.1 => /lib64/libattr.so.1 (0x00002b77c23f5000)
libelf.so.1 => /lib64/libelf.so.1 (0x00002b77c25fa000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00002b77c2812000)
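Long ldd listings like these are easy to misread, so here is a minimal Python sketch that pulls out only the entries reported as "not found". The function names missing_libraries and ldd_missing are made up for illustration; the sample text is copied from the first listing in this thread. Since the result depends on the environment, ldd_missing would have to be called from the same shell (same modules loaded, same LD_LIBRARY_PATH) as the job itself.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # A dependency line looks like: "<name> => <path or 'not found'>"
        name, sep, target = line.strip().partition("=>")
        if sep and target.strip() == "not found":
            missing.append(name.strip())
    return missing


def ldd_missing(binary_path):
    """Run ldd on a binary in the current environment (hypothetical helper)."""
    result = subprocess.run(["ldd", binary_path], capture_output=True, text=True)
    return missing_libraries(result.stdout)


# Sample lines taken from the listing of the hotline binary
sample = """\
linux-vdso.so.1 => (0x00007ffe715cc000)
libhdf5.so.10 => not found
libpython3.7m.so.1.0 => not found
libm.so.6 => /lib64/libm.so.6 (0x00002b989a4cd000)
"""
print(missing_libraries(sample))
```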
You need to reinstall mpi4py in the targeted environment.
(You can also try to do without it; the problem that you observed on KNL could be less critical on a more classical architecture.)
It seems to work now ;) I am happy! Just a general question: what is Smilei doing during "parsing input.py"?
It reads the namelist!
Great!
More precisely, it runs the namelist as a Python script. In your case, it reads the hydro file and interpolates the read quantities.
It can take time.
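To illustrate why "parsing" can be slow: because the namelist is run as a Python script, any top-level computation in it is carried out during that step. The sketch below is only an analogy in plain Python, not how Smilei does it internally; NAMELIST_SOURCE and run_namelist are hypothetical names, and the loop merely stands in for a costly interpolation.

```python
import time

# Hypothetical stand-in for a namelist: this is ordinary Python, so the
# comprehensions below run in full while the file is being "parsed".
NAMELIST_SOURCE = """
grid = [0.1 * i for i in range(10000)]
density = [x * x for x in grid]   # stands in for a costly interpolation
"""


def run_namelist(source):
    """Execute the source as a script; return its variables and the time spent."""
    namespace = {}
    start = time.perf_counter()
    exec(compile(source, "<namelist>", "exec"), namespace)
    elapsed = time.perf_counter() - start
    return namespace, elapsed


namespace, elapsed = run_namelist(NAMELIST_SOURCE)
print("defined:", sorted(k for k in namespace if not k.startswith("__")))
print("parsing took %.4f s" % elapsed)
```

Timing the heavy parts of the real namelist in the same way (outside Smilei, with plain python) may help to tell a slow interpolation from a real hang.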
Yes, it does seem long. Wait and see. Currently, Smilei works on IRENE. The environment used is:
module purge
module load intel/19.0.5.281
module load mpi/openmpi/4.0.2
module load flavor/hdf5/parallel hdf5/1.8.20
export HDF5_ROOT_DIR=${HDF5_ROOT}
export PYTHONEXE=${PYTHON3_EXEDIR}
module load python3/3.7.5
To compile, I used the no_mpi_tm config option, as you advised.
To use SciPy, the hotline added this line before ccc_mprun:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PYTHON3_ROOT/lib
I am afraid that I have a problem again: in my output, I have been stuck on "python preprocess function does not exist" for 1 hour... Could that be normal?
You could try to use the binary I compiled, with a script inspired by mine, especially concerning the modules used (with mpi4py installed in this environment).
Both are available in /ccc/work/cont003/smilei/derouilj/Issue291.
In this directory you will also find a namelist derived from yours and the outputs of a 10-minute single-node run (the idea was to check that the Python functions are executed). Not all interpolations complete within these 10 minutes: 52 interp_prof are printed while 56 are expected, but they operate on a very large grid (25600 x 20480) without being distributed. To speed this up, one MPI process could do the first interpolation while another MPI process does another one, and so on.
But first, can you confirm whether this test gets further than yours?
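The idea of spreading the interpolations over MPI processes could look roughly like the round-robin assignment below. This is only a sketch: interpolations_for_rank is a made-up helper, and 4 ranks is an arbitrary choice for the demonstration. In a real namelist, rank and size would presumably come from mpi4py (MPI.COMM_WORLD.Get_rank() and Get_size()), and each process would still need to receive the profiles computed by the others, which is not shown here.

```python
def interpolations_for_rank(n_interpolations, rank, size):
    """Round-robin split: rank r handles r, r + size, r + 2 * size, ..."""
    return list(range(rank, n_interpolations, size))


# 56 interpolations (the number expected in the test run) over 4 hypothetical ranks
n_interpolations, size = 56, 4
for rank in range(size):
    mine = interpolations_for_rank(n_interpolations, rank, size)
    print("rank %d does %d interpolations: %s" % (rank, len(mine), mine))
```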
I copied your folder and your binary. I added the hydro.txt file to the folder.
I still have this issue: ImportError: libmpi.so.20: cannot open shared object file: No such file or directory
linux-vdso.so.1 => (0x00007ffc8b7ad000)
/opt/selfie-1.0.2/lib64/selfie.so (0x00002ac55ce98000)
libhdf5.so.10 => /ccc/products/hdf5-1.8.20/intel--19.0.5.281__openmpi--4.0.1/parallel/lib/libhdf5.so.10 (0x00002ac55d11f000)
libpython2.7.so.1.0 => /ccc/products/python-2.7.14/intel--17.0.4.196__openmpi--2.0.2/default/lib/libpython2.7.so.1.0 (0x00002ac55d6de000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ac55dda0000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002ac55dfbc000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002ac55e1c0000)
libm.so.6 => /lib64/libm.so.6 (0x00002ac55e3c3000)
libmpi_cxx.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libmpi_cxx.so.40 (0x00002ac55e6c5000)
libmpi.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libmpi.so.40 (0x00002ac55e8e1000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002ac55ec1c000)
libiomp5.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libiomp5.so (0x00002ac55ef23000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002ac55f318000)
libc.so.6 => /lib64/libc.so.6 (0x00002ac55f52e000)
libyaml-0.so.2 => /lib64/libyaml-0.so.2 (0x00002ac55f8fc000)
libz.so.1 => /ccc/products/python-2.7.14/intel--17.0.4.196__openmpi--2.0.2/default/lib/libz.so.1 (0x00002ac55fb1c000)
libimf.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libimf.so (0x00002ac55fe4b000)
libsvml.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libsvml.so (0x00002ac5604d0000)
libirng.so => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libirng.so (0x00002ac561f5c000)
libintlc.so.5 => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libintlc.so.5 (0x00002ac5622c7000)
libirc.so => /ccc/products2/ifort-17.0.4.196/Atos_7__x86_64/system/default/lib/intel64/libirc.so (0x00002ac562539000)
/lib64/ld-linux-x86-64.so.2 (0x00002ac55cc74000)
libopen-rte.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libopen-rte.so.40 (0x00002ac5627a3000)
libopen-pal.so.40 => /ccc/products/openmpi-4.0.2/intel--19.0.5.281/default/lib/libopen-pal.so.40 (0x00002ac562a68000)
librt.so.1 => /lib64/librt.so.1 (0x00002ac562d2d000)
libhwloc.so.15 => /ccc/products/hwloc-2.0.4/system/default/lib/libhwloc.so.15 (0x00002ac562f35000)
libudev.so.1 => /lib64/libudev.so.1 (0x00002ac563180000)
libpciaccess.so.0 => /lib64/libpciaccess.so.0 (0x00002ac563396000)
libxml2.so.2 => /ccc/products/python-2.7.14/intel--17.0.4.196__openmpi--2.0.2/default/lib/libxml2.so.2 (0x00002ac5635a0000)
libevent-2.0.so.5 => /lib64/libevent-2.0.so.5 (0x00002ac563c60000)
libevent_pthreads-2.0.so.5 => /lib64/libevent_pthreads-2.0.so.5 (0x00002ac563ea8000)
libcilkrts.so.5 => /ccc/products/ifort-19.0.5.281/system/default/19.0.5.281/lib/intel64/libcilkrts.so.5 (0x00002ac5640ab000)
libcap.so.2 => /lib64/libcap.so.2 (0x00002ac5642e8000)
libdw.so.1 => /lib64/libdw.so.1 (0x00002ac5644ed000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00002ac56473e000)
libattr.so.1 => /lib64/libattr.so.1 (0x00002ac564964000)
libelf.so.1 => /lib64/libelf.so.1 (0x00002ac564b69000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00002ac564d81000)
(Smilei ASCII-art banner) Version : v4.4-784-gc3f8cc81-master
HDF5 version 1.8.20
Python version 2.7.14
Parsing pyinit.py
Parsing v4.4-784-gc3f8cc81-master
Parsing pyprofiles.py
Parsing BNH2d.py
On rank 12 [Python] ImportError: libmpi.so.20: cannot open shared object file: No such file or directory
ERROR src/Params/Params.cpp:1283 (runScript) error parsing BNH2d.py
This morning you were using another Python environment. Can you confirm that you reinstalled the mpi4py module in this environment?
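One way to check this, as a rough sketch: run the few lines below with the same Python and the same modules loaded as for the job. describe_mpi4py is a made-up helper name; MPI.Get_library_version() is the mpi4py call that reports the MPI library actually loaded. If mpi4py was built against another MPI (for example one providing libmpi.so.20), the import itself is expected to fail with the ImportError seen in the log.

```python
def describe_mpi4py():
    """Report which MPI library mpi4py uses, or why it cannot be loaded."""
    try:
        # Importing MPI loads the MPI shared library mpi4py was built against;
        # a missing libmpi.so.* shows up here as an ImportError.
        from mpi4py import MPI
    except ImportError as err:
        return "mpi4py not usable in this environment: %s" % err
    return MPI.Get_library_version().strip()


print(describe_mpi4py())
```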
No, I did not. It is the hotline that compiled Smilei with the OpenMPI environment and python/3.7.
I tried a simple run with a simulation of a 2D Gaussian laser in an empty box. Smilei works with this configuration.
Hi Julien, I know that the situation has not completely stabilized since this issue was opened, but the problem has evolved a lot (KNL, Rome, MPI, Python, deadlocks ...) and it now runs. I suggest closing this issue and, if necessary, opening a new one dedicated to any new problem.
Dear Smilei experts,
Hope all of you are fine!
I have a simulation that does not start. I am afraid the simulation may be too big (25000 * 20000 cells in 2D), but I am not sure. The error message is:
Invalid knl_memoryside_cache header, expected "version: 1".
[irene3354][[26206,0],315][btl_portals4_component.c:1115] mca_btl_portals4_component_progress_event() ERROR 0: PTL_EVENT_ACK with ni_fail_type 10 (PTL_NI_TARGET_INVALID) with target (nid=508,pid=73) and initator (nid=507,pid=73) found
Stack trace (most recent call last):
14 Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
13 Object "./smileiKNL", at 0x458568, in
12 Object "/lib64/libc.so.6", at 0x2b3e9d86f544, in __libc_start_main
11 Object "./smileiKNL", at 0x8f379f, in main
10 Object "./smileiKNL", at 0x6e93ab, in Params::Params(SmileiMPI*, std::vector<std::string, std::allocator<std::string> >)
9 Object "/opt/selfie-1.0.2/lib64/selfie.so", at 0x2b3e9b907ab7, in MPI_Barrier
8 Object "/ccc/products/openmpi-2.0.4/intel--17.0.6.256/default/lib/libmpi.so.20", at 0x2b3e9ccdaea0, in MPI_Barrier
7 Object "/ccc/products/openmpi-2.0.4/intel--17.0.6.256/default/lib/libmpi.so.20", at 0x2b3e9cd15a82, in ompi_coll_base_barrier_intra_bruck
6 Object "/opt/mpi/openmpi-icc/2.0.4.5.10.xcea/lib/openmpi/mca_pml_ob1.so", at 0x2b3ea7b527a6, in mca_pml_ob1_send
5 Object "/opt/mpi/openmpi-icc/2.0.4.5.10.xcea/lib/libopen-pal.so.20", at 0x2b3e9ff69330, in opal_progress
4 Object "/opt/mpi/openmpi-icc/2.0.4.5.10.xcea/lib/openmpi/mca_btl_portals4.so", at 0x2b3ea5fd384d, in mca_btl_portals4_component_progress
3 Object "/opt/mpi/openmpi-icc/2.0.4.5.10.xcea/lib/openmpi/mca_btl_portals4.so", at 0x2b3ea5fd3a59, in mca_btl_portals4_component_progress_event
2 Object "/opt/mpi/openmpi-icc/2.0.4.5.10.xcea/lib/libmca_common_portals4.so.20", at 0x2b3ea61defd8, in common_ptl4_printf_error
1 Object "/lib64/libc.so.6", at 0x2b3e9d884a67, in abort
0 Object "/lib64/libc.so.6", at 0x2b3e9d883377, in gsignal
Aborted (Signal sent by tkill() 150381 35221)
The simulation stops at:
HDF5 version 1.8.20
Python version 2.7.14
Parsing pyinit.py
Parsing v4.4-706-gb5c12a5a-master
Parsing pyprofiles.py
Parsing BNH2d.py
Parsing pycontrol.py
Check for function preprocess()
python preprocess function does not exist
The version of Smilei is: v4.4-706-gb5c12a5a-master
Thanks for your help. Here is the input:
BNH2d.txt
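As a rough check on the size concern, the memory taken by one full-grid array can be estimated in a few lines. This assumes 8-byte (double precision) values and a single array holding the whole grid, which is an assumption about how the namelist stores its profiles; array_bytes is a made-up helper.

```python
def array_bytes(nx, ny, bytes_per_value=8):
    """Memory of one nx x ny array, assuming 8-byte values by default."""
    return nx * ny * bytes_per_value


nx, ny = 25000, 20000  # grid size quoted in this post
n_bytes = array_bytes(nx, ny)
print("%d cells, %.2f GB per full-grid array" % (nx * ny, n_bytes / 1e9))
```

If every MPI process builds such an array while running the namelist, the figure has to be multiplied by the number of processes per node and by the number of arrays kept alive at once.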