Open rsdunlapiv opened 5 years ago
The following test fails while CLM is trying to write history files:
/glade/scratch/dunlap/ERS_Vnuopc_Ln5.f45_f45_mg37.I2000Clm50SpNuopc.cheyenne_intel.clm-nuopc_cap.GC.20190423_082340_rewub1/run

73:cesm.exe: ad_gpfs_wrcoll.c:834: ADIOI_Exch_and_write: Assertion `(off + size - req_off) == (int)(off + size - req_off)' failed.
73:MPT ERROR: Rank 73(g:73) received signal SIGABRT/SIGIOT(6).
73: Process ID: 65813, Host: r9i6n2, Program: /gpfs/fs1/scratch/dunlap/ERS_Vnuopc_Ln5.f45_f45_mg37.I2000Clm50SpNuopc.cheyenne_intel.clm-nuopc_cap.GC.20190423_082340_rewub1/bld/cesm.exe
73: MPT Version: HPE MPT 2.19 12/07/18 05:31:15
73:
73:MPT: --------stack traceback-------
73:MPT: Attaching to program: /proc/65813/exe, process 65813
73:MPT: done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=3d290be00d48b823d3b71df2249e80d881bc473d"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=0ea764119690f32c98faae9a63a73f35ed8b1099"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=79264652a62453da222372a430cd9351d4bbcbde"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=5409c48fdb15e90649c1407e444fbe31d6dc8ec1"
73:MPT: (no debugging symbols found)...done.
73:MPT: [Thread debugging using libthread_db enabled]
73:MPT: Using host libthread_db library "/glade/u/apps/ch/os/lib64/libthread_db.so.1".
73:MPT: Try: zypper install -C "debuginfo(build-id)=3a453a18f06ae88bd1b8146bf2ae8fcae5c4c203"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=f43d7754940a14ffe3d9bd8fc9472ffbbfead544"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=e97cfdb062d6f0c41073f2109a7605d0ae991c03"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=15916519d9dbaea26ec88427460b4cedb9c0a6ab"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=4c08f43bb18e99a7df4bad5c4a52bac67ddf9b8d"
73:MPT: (no debugging symbols found)...done.
73:MPT: Try: zypper install -C "debuginfo(build-id)=3ae04b58bd81ea7745dba789d89937e719309568"
73:MPT: (no debugging symbols found)...done.
73:MPT: 0x00002aaab83c841c in waitpid () from /glade/u/apps/ch/os/lib64/libpthread.so.0
73:MPT: Missing separate debuginfos, use: zypper install glibc-debuginfo-2.19-35.1.x86_64
73:MPT: (gdb) #0 0x00002aaab83c841c in waitpid ()
73:MPT: from /glade/u/apps/ch/os/lib64/libpthread.so.0
73:MPT: #1 0x00002aaab8b08e66 in mpi_sgi_system (
73:MPT: #2 MPI_SGI_stacktraceback (
73:MPT: header=header@entry=0x7ffffffd85c0 "MPT ERROR: Rank 73(g:73) received signal SIGABRT/SIGIOT(6).\n\tProcess ID: 65813, Host: r9i6n2, Program: /gpfs/fs1/scratch/dunlap/ERS_Vnuopc_Ln5.f45_f45_mg37.I2000Clm50SpNuopc.cheyenne_intel.clm-nuopc_c"...) at sig.c:340
73:MPT: #3 0x00002aaab8b09062 in first_arriver_handler (signo=signo@entry=6,
73:MPT: stack_trace_sem=stack_trace_sem@entry=0x2aaac5be0080) at sig.c:489
73:MPT: #4 0x00002aaab8b093fb in slave_sig_handler (signo=6, siginfo=<optimized out>,
73:MPT: extra=<optimized out>) at sig.c:564
73:MPT: #5 <signal handler called>
73:MPT: #6 0x00002aaab93a30c7 in raise () from /glade/u/apps/ch/os/lib64/libc.so.6
73:MPT: #7 0x00002aaab93a4478 in abort () from /glade/u/apps/ch/os/lib64/libc.so.6
73:MPT: #8 0x00002aaab939c146 in __assert_fail_base ()
73:MPT: from /glade/u/apps/ch/os/lib64/libc.so.6
73:MPT: #9 0x00002aaab939c1f2 in __assert_fail ()
73:MPT: from /glade/u/apps/ch/os/lib64/libc.so.6
73:MPT: #10 0x00002aaab8b435be in ADIOI_Exch_and_write (error_code=0x7ffffffd96b0,
73:MPT: buf_idx=0x8280000, fd_end=0x8270000, fd_start=0x8260000, fd_size=8231664,
73:MPT: min_st_offset=156944, contig_access_count=12582, len_list=0x19690000,
73:MPT: offset_list=0x19610000, others_req=0x196f0000, myrank=0, nprocs=2,
73:MPT: datatype=27, buf=0x8558270, fd=0x95bd140) at ad_gpfs_wrcoll.c:834
73:MPT: #11 ADIOI_GPFS_WriteStridedColl (fd=0x95bd140, buf=0x8558270, count=2007552,
73:MPT: datatype=27, file_ptr_type=<optimized out>, offset=<optimized out>,
73:MPT: status=0x7ffffffd9750, error_code=0x7ffffffd96b0) at ad_gpfs_wrcoll.c:468
73:MPT: #12 0x00002aaab8b76d63 in MPIOI_File_write_all (fh=<optimized out>,
73:MPT: offset=62976, file_ptr_type=file_ptr_type@entry=100, buf=<optimized out>,
73:MPT: count=2007552, datatype=27,
73:MPT: myname=myname@entry=0x2aaab8dba3c0 <myname.15116> "MPI_FILE_WRITE_AT_ALL",
73:MPT: status=0x7ffffffd9750) at write_all.c:125
73:MPT: #13 0x00002aaab8b77397 in PMPI_File_write_at_all (fh=<optimized out>,
73:MPT: offset=<optimized out>, buf=<optimized out>, count=<optimized out>,
73:MPT: datatype=<optimized out>, status=<optimized out>) at write_atall.c:64
73:MPT: #14 0x00000000010b9385 in ncmpio_read_write ()
73:MPT: #15 0x0000000001093384 in req_aggregation ()
73:MPT: #16 0x0000000001090dc6 in wait_getput ()
73:MPT: #17 0x000000000108eccc in req_commit ()
73:MPT: #18 0x0000000000ffbbd2 in ncmpi_wait_all ()
73:MPT: #19 0x0000000000f30901 in flush_output_buffer ()
73:MPT: at /gpfs/u/home/dunlap/UFSCOMP.apr16/cime/src/externals/pio2/src/clib/pio_darray_int.c:1751
73:MPT: #20 0x0000000000f2a32a in PIOc_write_darray_multi ()
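A note on reading the assertion in the trace: it compares `off + size - req_off` with the same value cast to `int`, which in C holds only when the value fits in an `int`. Below is a minimal Python sketch of that check, assuming a 32-bit C `int`. The helper names `fits_in_c_int` and `exchange_span` are hypothetical, and the offsets are made-up illustrations, not values taken from this run.

```python
import ctypes

def fits_in_c_int(value: int) -> bool:
    """Emulate the C test `value == (int)value`, assuming a 32-bit int.

    ctypes.c_int32 wraps out-of-range values instead of raising, which
    stands in for the narrowing cast here.
    """
    return ctypes.c_int32(value).value == value

def exchange_span(off: int, size: int, req_off: int):
    """Return the asserted quantity and whether it would pass the check."""
    span = off + size - req_off
    return span, fits_in_c_int(span)

# Illustrative offsets only (assumptions, not taken from the failing run).
print(exchange_span(62976, 8231664, 0))   # small span: passes
print(exchange_span(0, 3_000_000_000, 0)) # exceeds 2**31 - 1: would fail
```

If this reading is right, the abort means the span computed on rank 73 fell outside the range of `int` during the collective write; the trace alone does not show the actual values of `off`, `size`, or `req_off`.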