Closed by ekluzek 2 months ago
Tested standalone with these modules:
1) ncarenv/23.09 (S)
2) craype/2.7.23
3) cmake/3.26.3
4) intel-oneapi/2024.0.2
5) hdf5/1.14.3
6) netcdf/4.9.2
7) cray-mpich/8.1.27
8) parallel-netcdf/1.12.3
9) ncarcompilers/1.0.0
Compiled under sandbox_mizuRoute/route/build/ with:
gmake FC=intel FC_EXE=mpif90 F_MASTER=$BLDDIR NCDF_PATH=$NETCDF PNETCDF_PATH=$PNETCDF MODE=fast EXE=test
Then ran a test case under /glade/work/mizukami/test_mizuRoute/HDMA_global:
./test settings/HDMA_CLM5-runoff.control
forrtl: error (65): floating invalid
Image PC Routine Line Source
libpthread-2.31.s 00007FEFAE3EF8C0 Unknown Unknown Unknown
libhdf5.so.310.3. 00007FEFA800C654 H5T__init_native_ Unknown Unknown
libhdf5.so.310.3. 00007FEFA7F3FE96 H5T_init Unknown Unknown
libhdf5.so.310.3. 00007FEFA802A679 H5VL_init_phase2 Unknown Unknown
libhdf5.so.310.3. 00007FEFA7D26141 H5_init_library Unknown Unknown
libhdf5.so.310.3. 00007FEFA7DC348C H5Eset_auto2 Unknown Unknown
libnetcdf.so.19 00007FEFAF6A1F6C nc4_hdf5_initiali Unknown Unknown
libnetcdf.so.19 00007FEFAF6AA497 NC_HDF5_initializ Unknown Unknown
libnetcdf.so.19 00007FEFAF62F428 nc_initialize Unknown Unknown
libnetcdf.so.19 00007FEFAF6323C6 NC_open Unknown Unknown
libnetcdf.so.19 00007FEFAF6322B4 nc_open Unknown Unknown
libnetcdff.so.7 00007FEFAFA49511 nf_open_ Unknown Unknown
libnetcdff.so.7 00007FEFAFB0F6EB Unknown Unknown Unknown
libnetcdff.so.7 00007FEFAFAA2725 netcdf_mp_nf90_op Unknown Unknown
test 0000000000417DC3 Unknown Unknown Unknown
test 00000000005072BB Unknown Unknown Unknown
test 0000000000506B43 Unknown Unknown Unknown
test 00000000005067BB Unknown Unknown Unknown
test 0000000000512078 Unknown Unknown Unknown
test 000000000041264D Unknown Unknown Unknown
libc-2.31.so 00007FEFAA03E29D __libc_start_main Unknown Unknown
test 000000000041257A Unknown Unknown Unknown
Aborted (core dumped)
The error occurs when the code tries to open the river input netCDF file.
Compiling in debug mode produces even less clear output. Maybe some compilation flag is not correct?
Uninitialized bytes in strlen at offset 0 inside [0x7010000003a0, 1)
==43662==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7fb62149ae2b in MPIDI_CRAY_collopt_process_env (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x1dc5e2b) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a)
#1 0x7fb62149b45c in MPIDI_Cray_coll_init (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x1dc645c) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a)
#2 0x7fb6217f1de4 in MPID_Init (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x211cde4) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a)
#3 0x7fb61fe1ed84 in MPIR_Init_thread (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x749d84) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a)
#4 0x7fb61fe1eb53 in MPI_Init (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x749b53) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a)
#5 0x7fb6224995de in pmpi_init__ (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpifort_intel.so.12+0x4d5de) (BuildId: 63521a851ceb7a35393a775072a346557973adee)
#6 0x89c05b in mpi_utils_mp_shr_mpi_init_ /glade/u/home/mizukami/sandbox_mizuRoute/route/build/../build/src/mpi_utils.f90:919:10
#7 0x18db479 in model_setup_mp_init_mpi_ /glade/u/home/mizukami/sandbox_mizuRoute/route/build/../build/src/standalone/model_setup.f90:37:8
#8 0x1989315 in MAIN__ /glade/u/home/mizukami/sandbox_mizuRoute/route/build/../build/src/standalone/route_runoff.f90:57:6
#9 0x418d38 in main (/glade/u/home/mizukami/sandbox_mizuRoute/route/bin/test+0x418d38) (BuildId: 4c39516a27ef6b37be82ead224660dbb57c7fd59)
#10 0x7fb61e35829c in __libc_start_main (/lib64/libc.so.6+0x3529c) (BuildId: c8417d767baccfadb39b474e484d46947915cd8f)
#11 0x418c19 in _start /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120
Uninitialized value was created by a heap allocation
#0 0x426616 in malloc (/glade/u/home/mizukami/sandbox_mizuRoute/route/bin/test+0x426616) (BuildId: 4c39516a27ef6b37be82ead224660dbb57c7fd59)
#1 0x7fb62149a0e4 in MPIDI_CRAY_collopt_process_env (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x1dc50e4) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a)
SUMMARY: MemorySanitizer: use-of-uninitialized-value (/opt/cray/pe/mpich/8.1.27/ofi/intel/2022.1/lib/libmpi_intel.so.12+0x1dc5e2b) (BuildId: 9050a3fd8814e8f4645b0e5108ad020e92954f4a) in MPIDI_CRAY_collopt_process_env
Exiting
When I use intel-oneapi/2023.2.1, which is the default on Derecho now, I cannot get it to compile. The compilation error is below. I don't see what is wrong with the code.
#0 0x00000000021c51e2
#1 0x0000000002228e97
#2 0x0000000002228e66
#3 0x00000000022b34bd
#4 0x0000000002299cd3
#5 0x00000000022b4a57
#6 0x00000000022a844c
#7 0x00000000022a8060
#8 0x00000000022c92eb
#9 0x00000000022c6602
#10 0x00000000022c5d4b
#11 0x0000000002278113
#12 0x000000000226ee01
#13 0x000000000226cf83
#14 0x0000000002277705
#15 0x0000000002277ce4
#16 0x00000000021fec79
#17 0x00000000021fe8a0
#18 0x00000000021fea4d
#19 0x00000000021ff14c
#20 0x0000000002277705
#21 0x0000000002277ce4
#22 0x00000000021fec79
#23 0x00000000021fe8a0
#24 0x00000000021fea4d
#25 0x00000000021ff14c
#26 0x0000000002277705
#27 0x0000000002277ce4
#28 0x0000000002274ce6
#29 0x0000000002277705
#30 0x0000000002277ce4
#31 0x000000000227a159
#32 0x0000000002277705
#33 0x0000000002277ce4
#34 0x00000000022752b2
#35 0x0000000002277705
#36 0x000000000227495a
#37 0x0000000002277705
#38 0x0000000002111c05
#39 0x00000000021115bd
#40 0x00000000022e13ce
#41 0x00007fd08610329d __libc_start_main + 239
#42 0x0000000001f51aa9
/glade/u/home/mizukami/sandbox_mizuRoute/route/build/../build/src/csv_data.f90(321): error #5623: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report. Note: File and line given may not be explicit cause of this error.
csv_data(i,j) = this%csv_data(i,j)%str
----------^
compilation aborted for /glade/u/home/mizukami/sandbox_mizuRoute/route/build/../build/src/csv_data.f90 (code 3)
Looks like the code compiles with intel-oneapi/2023.2.1. These modules are loaded for compiling and running the executable:
module load intel-oneapi
module load cray-mpich
module load craype
module load ncarcompilers
module load netcdf
module load parallel-netcdf
This compiler version does not like the do concurrent loops in [csv_data.f90](https://github.com/ESCOMP/mizuRoute/blob/a9da911a8d9e88ddc7e3713bd451d2c13cc1b173/route/build/src/csv_data.f90#L319), which causes the compilation error I posted above. If I change them to regular do loops, it compiles. I am not sure whether this is a compiler bug.
If I use intel-oneapi/2024.0.2, I am not able to link netCDF correctly: it compiles, but fails at run time (cannot open netCDF).
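One way to narrow down a link problem like this is to check which netCDF and HDF5 shared libraries the executable actually resolves at run time. The snippet below is only a sketch: the function names `filter_io_libs` and `linked_io_libs` are made up for illustration, and the executable path in the usage comment is an assumption based on the paths in the tracebacks above.

```shell
#!/bin/sh
# Keep only the lines that mention netCDF or HDF5 libraries
# (matches libnetcdf, libnetcdff, libpnetcdf, libhdf5, ...).
# "|| true" keeps the exit status zero when nothing matches.
filter_io_libs() {
    grep -Ei 'netcdf|hdf5' || true
}

# List the shared libraries an executable resolves and filter them.
# $1: path to the executable (hypothetical helper, not part of mizuRoute).
linked_io_libs() {
    ldd "$1" 2>/dev/null | filter_io_libs
}

# Usage (path assumed from the tracebacks above):
#   linked_io_libs ~/sandbox_mizuRoute/route/bin/test
```

If the paths printed belong to a different netcdf or hdf5 module than the one loaded at compile time, the run-time environment probably does not match the build environment.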
We need to transition the mizuRoute test list on Derecho from intel to intel-oneapi.
See this CTSM issue for more details: https://github.com/ESCOMP/CTSM/issues/2476