Closed: JiamingZhuge closed this issue 7 months ago.
@JiamingZhuge could you provide a bit more info on which compiler you're using, and whether you are using parallel HDF5? One way to check is to ensure that h5pfc exists and is being used by the compiler. Another issue might be that you're using the wrong version of MPI (not the one that parallel HDF5 was compiled with).
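A quick check from a shell prompt might look like the following (a sketch; the exact output depends on how HDF5 was built):

```shell
# Does the parallel HDF5 Fortran wrapper exist at all?
if command -v h5pfc >/dev/null 2>&1; then
  # -show prints the underlying compile line without running it; for a
  # parallel build it should invoke an MPI wrapper such as mpif90/mpifort
  h5pfc -show
else
  echo "h5pfc not found: this HDF5 is likely a serial build"
fi
```

If h5pfc -show reports a plain gfortran rather than an MPI wrapper, the loaded HDF5 module is probably a serial build.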
Is this on a cluster?
What does module list show? Or was it installed via conda, apt, brew, etc.?

Of course! Let me provide that information. I run on a cluster and load these modules:
module list
Currently Loaded Modules:
  1) shared                   3) cpu/0.17.3b (c)       5) ucx/1.10.1/dnpjjuc      7) openjdk/11.0.12_7/27cv2ps   9) anaconda3/2021.05/q4munrg
  2) slurm/expanse/21.08.8    4) gcc/10.2.0/npcyll4    6) openmpi/4.1.3/oq3qvsv   8) hdf5/1.10.7/5o4oibc

  Where:
   c:  built natively for AMD Rome
Here the hdf5 module is the OpenMPI version:
module spider hdf5/1.10.7/5o4oibc

  hdf5/1.10.7: hdf5/1.10.7/5o4oibc

    You will need to load all module(s) on any one of the lines below before the "hdf5/1.10.7/5o4oibc" module is available to load.

      cpu/0.17.3b  gcc/10.2.0/npcyll4  openmpi/4.1.3/oq3qvsv

    Help:
      HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.
As for the gcc:
gcc -v

Reading specs from /cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/lib/gcc/x86_64-pc-linux-gnu/10.2.0/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /scratch/spack_cpu/job_21694812/spack-stage/spack-stage-gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/spack-src/configure --prefix=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd --with-pkgversion='Spack GCC' --with-bugurl=https://github.com/spack/spack/issues --disable-multilib --enable-languages=c,c++,fortran --disable-nls --with-system-zlib --with-zstd-include=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/zstd-1.5.0-ixhjq2kjkwwiubjqtzompy3ovx3xskjy/include --with-zstd-lib=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/zstd-1.5.0-ixhjq2kjkwwiubjqtzompy3ovx3xskjy/lib --disable-bootstrap --with-mpfr-include=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpfr-4.1.0-2gn43ksz5mn4l2ydhukvmf2hc5n6lsu2/include --with-mpfr-lib=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpfr-4.1.0-2gn43ksz5mn4l2ydhukvmf2hc5n6lsu2/lib --with-gmp-include=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gmp-6.2.1-6d5recuzoijnpzdmyuyatwr32y6e756r/include --with-gmp-lib=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gmp-6.2.1-6d5recuzoijnpzdmyuyatwr32y6e756r/lib --with-mpc-include=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpc-1.1.0-7brtlqfdvz2iwdzeyd23igqlwz3fq4d5/include --with-mpc-lib=/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpc-1.1.0-7brtlqfdvz2iwdzeyd23igqlwz3fq4d5/lib --without-isl
--with-stage1-ldflags='-Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/lib64 -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gmp-6.2.1-6d5recuzoijnpzdmyuyatwr32y6e756r/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpc-1.1.0-7brtlqfdvz2iwdzeyd23igqlwz3fq4d5/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpfr-4.1.0-2gn43ksz5mn4l2ydhukvmf2hc5n6lsu2/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/zlib-1.2.11-bmchsimapzrndjqxvin7wptdiiwoxdqq/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/zstd-1.5.0-ixhjq2kjkwwiubjqtzompy3ovx3xskjy/lib' --with-boot-ldflags='-Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gcc-10.2.0-npcyll4gxjhf4tejksmdzlsl3d3usqpd/lib64 -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/gmp-6.2.1-6d5recuzoijnpzdmyuyatwr32y6e756r/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpc-1.1.0-7brtlqfdvz2iwdzeyd23igqlwz3fq4d5/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/mpfr-4.1.0-2gn43ksz5mn4l2ydhukvmf2hc5n6lsu2/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/zlib-1.2.11-bmchsimapzrndjqxvin7wptdiiwoxdqq/lib -Wl,-rpath,/cm/shared/apps/spack/0.17.3/cpu/b/opt/spack/linux-rocky8-zen/gcc-8.5.0/zstd-1.5.0-ixhjq2kjkwwiubjqtzompy3ovx3xskjy/lib -static-libstdc++ -static-libgcc'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.0 (Spack GCC)
Hello Hayk, I tried to use intel-mpi, but things still do not go well. Something goes wrong when I compile the code:
ifort: command line warning #10006: ignoring unknown option '-ffree-line-length-512'
ifort: command line warning #10006: ignoring unknown option '-J'
ifort: warning #10145: no action performed for file 'build/'
I load
module load cpu/0.15.4 intel/19.1.1.217 intel-mpi/2019.8.254 hdf5/1.10.6
and used the default settings:
python3 configure.py -mpi08 -hdf5 --user=user_rad_shock -2d
Is there anything I need to reset when compiling? Thanks!
I solved the problem and ran the code.
Hi @JiamingZhuge, how did you solve the problem? I think it is not related to multiple cores. Sometimes the first run is OK with multiple cores, but the second run gets stuck.
Thanks in advance.
Hi @akbwyfc, I changed the MPI from OpenMPI to another MPI, and it reported more details. You are right, the problem I met was not related to multiple cores. At which step did it get stuck? Did you change anything when running it the second time?
Hello Hayk, recently I found that the latest version of tristan-mp-v2 somehow gets stuck at the first output when I run on multiple CPUs (mpirun -np [>1]). It gets stuck after this output:

in the terminal. The diag.log output shows that something happens after writing the parameters:

(stuck here)

This problem only occurs when I run on multiple cores on my computer or on multiple CPUs on the cluster; everything goes well when I run on 1 CPU/core (mpirun -np 1). I also find that the old version (which I cloned maybe a month before) works on multiple cores. I run the code with this command (openmpi):
I tried to add the debug flag:
python3 configure.py -mpi08 -hdf5 --user=user_rad_shock -2d --debug=2
which shows (after modifying configure.py line 270 from "debug" to "d0ebug"):

Since I don't use the Intel compiler, maybe the "compiler" flag also needs to be set? And which document should I check for the debug information, if it works?
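One way to tell whether the hang lives in the MPI/parallel-HDF5 stack rather than in tristan-mp-v2 itself is a minimal standalone program that creates an HDF5 file collectively. The sketch below assumes h5pfc wraps an MPI Fortran compiler; the file name phdf5_smoke.f90 is made up:

```shell
# Write a tiny Fortran program that creates one HDF5 file collectively.
cat > phdf5_smoke.f90 <<'EOF'
program phdf5_smoke
  use mpi
  use hdf5
  implicit none
  integer :: ierr, rank
  integer(hid_t) :: fapl_id, file_id
  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call h5open_f(ierr)
  ! file-access property list using the MPI-IO driver
  call h5pcreate_f(H5P_FILE_ACCESS_F, fapl_id, ierr)
  call h5pset_fapl_mpio_f(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL, ierr)
  ! collective file creation: a broken MPI/HDF5 pairing typically hangs here
  call h5fcreate_f('smoke.h5', H5F_ACC_TRUNC_F, file_id, ierr, access_prp=fapl_id)
  call h5fclose_f(file_id, ierr)
  call h5pclose_f(fapl_id, ierr)
  call h5close_f(ierr)
  if (rank == 0) print *, 'parallel HDF5 OK'
  call MPI_Finalize(ierr)
end program phdf5_smoke
EOF
# Compile with the parallel wrapper and run on 2 ranks:
#   h5pfc phdf5_smoke.f90 -o phdf5_smoke
#   mpirun -np 2 ./phdf5_smoke
```

If this small program also hangs with more than one rank, the problem is in the loaded MPI/HDF5 modules rather than in the simulation code.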