Closed pkufubo closed 1 year ago
My sbatch script:
#!/bin/bash
#SBATCH -J FB_GC
#SBATCH -p C032M0128G
#SBATCH --qos=high
#SBATCH -N 1
#SBATCH -n 32
#SBATCH -o log.%j
#SBATCH -e log.%j
#SBATCH --mail-type=END
#SBATCH --mail-user=pkufubo@pku.edu.cn
#SBATCH -o job.%j.out
ulimit -s unlimited
export OMP_STACKSIZE=1G
export OMP_NUM_THREADS=64
module purge
module load intel/2013.1
module load cmake/3.16.0
module load impi/2017.1
module load netcdf/c/4.6.1-intel-2013.1
module load netcdf/fortran/4.4.4-intel-2013.1
module load netcdf/4.4.1-intel-2013.0
module load hdf5/1.8.19-intel-2013.1
module load gcc/7.2.0
#modify by user
#srun -N 1 -n $OMP_NUM_THREADS -t 30-24:00 -p cpu_single ./gcclassic >> GC.log
mpirun -n 1 -env OMP_NUM_THREADS=64 ./gcclassic >> GC.log
Thanks for writing @pkufubo. You should not use mpirun to run GEOS-Chem Classic; that is only needed if you are running GCHP. Use the srun command instead (the one you have listed above). Also use #SBATCH -c 32 and export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK in the run script to set the number of cores properly.
Also I am going to move this comment to geoschem/geos-chem, as this issue tracker is for issues pertaining to the GCClassic superproject wrapper.
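The binding between the Slurm allocation and the OpenMP thread count suggested above can be sketched as follows. This is an illustration only: in a real job Slurm itself exports SLURM_CPUS_PER_TASK to match #SBATCH -c, so here we simulate it by assigning the variable by hand.

```shell
# Illustration only: simulate the variable Slurm would set for "#SBATCH -c 32"
SLURM_CPUS_PER_TASK=32

# Derive the OpenMP thread count from the allocation instead of hard-coding it,
# so the two can never disagree (the original script set 64 threads on 32 cores)
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK}"
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"   # -> OMP_NUM_THREADS=32
```

Deriving OMP_NUM_THREADS from SLURM_CPUS_PER_TASK avoids oversubscribing the node, which is what happens when the thread count (64) exceeds the requested cores (32).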
Thank you for your response @yantosca. I have updated the run script following your suggestion.
#!/bin/bash
#SBATCH -J FB_GC
#SBATCH -p C032M0128G
#SBATCH --qos=high
#SBATCH -N 1
#SBATCH -c 32
#SBATCH -o log.%j
#SBATCH -e log.%j
#SBATCH --mail-type=END
#SBATCH --mail-user=pkufubo@pku.edu.cn
#SBATCH -o job.%j.out
ulimit -s unlimited
export OMP_STACKSIZE=1G
export OMP_NUM_THREADS=64
module purge
module load intel/2013.1
module load cmake/3.16.0
module load impi/2017.1
module load netcdf/c/4.6.1-intel-2013.1
module load netcdf/fortran/4.4.4-intel-2013.1
module load netcdf/4.4.1-intel-2013.0
module load hdf5/1.8.19-intel-2013.1
module load gcc/7.2.0
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
#modify by user
srun ./gcclassic >> GC.log
Now there is no error in GC.log. However, the run does not seem to end cleanly. The last several lines of GC.log are listed below:
---> DATE: 2005/07/31 UTC: 23:50 X-HRS: 743.833313
### MAIN: a SET_CURRENT_TIME
### MAIN: a HEMCO PHASE 1
### MAIN: a INTERP, etc
### MAIN: a DO_TRANSPORT
### MAIN: a SETUP_WETSCAV
### MAIN: a COMPUTE_PBL_HEIGHT
### Species Unit Conversion: kg/kg dry -> v/v dry ###
### Species Unit Conversion: v/v dry -> kg/kg dry ###
### MAIN: a Compute_Sflx_For_Vdiff
### Species Unit Conversion: kg/kg dry -> v/v dry ###
### DO_PBL_MIX_2: after VDIFFDR
### DO_PBL_MIX_2: after AIRQNT
### Species Unit Conversion: v/v dry -> kg/kg dry ###
### Species Unit Conversion: kg/kg dry -> kg/m2 ###
### Species Unit Conversion: kg/m2 -> kg/kg dry ###
### MAIN: a TURBDAY:2
### MAIN: a CONVECTION
- Updating collection: CH4
- Updating collection: SpeciesConc
- Updating collection: Metrics
- Updating collection: Restart
### MAIN: after SET_ELAPSED_SEC
### MAIN: after Planeflight
### MAIN: after COPY_I3_FIELDS
- Creating file for CH4; reference = 20050701 000000
with filename = OutputDir/GEOSChem.CH4.20050701_0000z.nc4
- Writing data to CH4; timestamp = 0.0000
- Creating file for SpeciesConc; reference = 20050701 000000
with filename = OutputDir/GEOSChem.SpeciesConc.20050701_0000z.nc4
- Writing data to SpeciesConc; timestamp = 0.0000
- Creating file for Metrics; reference = 20050701 000000
with filename = OutputDir/GEOSChem.Metrics.20050701_0000z.nc4
- Writing data to Metrics; timestamp = 0.0000
- Creating file for Restart; reference = 20050801 000000
with filename = ./Restarts/GEOSChem.Restart.20050801_0000z.nc4
- Writing data to Restart; timestamp = 0.0000
---> DATE: 2005/08/01 UTC: 00:00 X-HRS: 744.000000
### MAIN: a SET_CURRENT_TIME
### MAIN: a HEMCO PHASE 1
### MAIN: a CLOSE_FILES
"GC.log" 162546L, 7081465C
And if I run metrics.py, it reports:
GEOS-Chem METHANE SIMULATION METRICS
Simulation start : 2005-07-01 00:00:00z
Simulation end : 2005-08-01 00:00:00z
==============================================================================
Mass-weighted mean OH concentration = nan x 10^5 molec cm-3
CH3CCl3 lifetime w/r/t tropospheric OH = nan years
CH4 lifetime w/r/t tropospheric OH = nan years
CH4 total lifetime (full atmosphere) = nan years
CH4 total lifetime (troposphere only) = nan years
Running cdo info GEOSChem.CH4.20050701_0000z.nc4 gives:
Warning (cdf_check_variables): Number of time steps undefined, skipped variable LossCH4inStrat!
Warning (cdf_check_variables): Number of time steps undefined, skipped variable LossCH4byOHinTrop!
Warning (cdf_check_variables): Number of time steps undefined, skipped variable LossCH4byClinTrop!
Warning (cdf_check_variables): Number of time steps undefined, skipped variable OHconcAfterChem!
-1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter ID
1 : 0000-00-00 00:00:00 0 3312 0 : 9.9692e+36 9.9692e+36 9.9692e+36 : -1
cdo info: Processed 3312 values from 1 variable over 1 timestep [0.07s 22MB]
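The cdo warnings above ("Number of time steps undefined") and the 0000-00-00 timestamp both point to a time axis that was never written, which would also explain the NaN metrics. As a quick independent check, ncdump (shipped with netCDF) can show the declared length of the time dimension. A minimal sketch; the filename is the one from the log above, and the file-existence guard is just so the snippet degrades gracefully outside the run directory:

```shell
# Hypothetical check: inspect the time axis of the output file from the log above.
# Assumes the netCDF "ncdump" utility is on PATH; run inside the run directory.
f="OutputDir/GEOSChem.CH4.20050701_0000z.nc4"
if [ -f "$f" ]; then
  # Print the declared length of the time dimension and its units attribute;
  # "time = 0" or a missing units attribute would confirm the broken axis
  ncdump -h "$f" | grep -E 'time = |time:units'
else
  echo "file not found: $f"
fi
```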
*** Error in `/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic': double free or corruption (!prev): 0x00000000145cc9c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81679)[0x2af4fbdac679]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic(for_dealloc_allocatable+0xfc)[0xe0b97c]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0xb43010]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0x40d5ea]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0x40af56]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2af4fbd4d505]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0x40ae59]
======= Memory map: ========
00400000-0104b000 r-xp 00000000 00:28 880203251 /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
0124a000-0124d000 r--p 00c4a000 00:28 880203251 /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
0124d000-01399000 rw-p 00c4d000 00:28 880203251 /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
01399000-12453000 rw-p 00000000 00:00 0
13902000-285e9000 rw-p 00000000 00:00 0 [heap]
2af4f78db000-2af4f78fd000 r-xp 00000000 fd:00 66930 /usr/lib64/ld-2.17.so
2af4f78fd000-2af4f79e2000 rw-p 00000000 00:00 0
2af4f79e2000-2af4f79e3000 ---p 00000000 00:00 0
2af4f79e3000-2af4f7af4000 rw-p 00000000 00:00 0
2af4f7afc000-2af4f7afd000 r--p 00021000 fd:00 66930 /usr/lib64/ld-2.17.so
2af4f7afd000-2af4f7afe000 rw-p 00022000 fd:00 66930 /usr/lib64/ld-2.17.so
2af4f7afe000-2af4f7aff000 rw-p 00000000 00:00 0
2af4f7aff000-2af4f7bbd000 r-xp 00000000 00:28 222406936 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7bbd000-2af4f7dbc000 ---p 000be000 00:28 222406936 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7dbc000-2af4f7dbd000 r--p 000bd000 00:28 222406936 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7dbd000-2af4f7dbe000 rw-p 000be000 00:28 222406936 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7dbe000-2af4f8125000 rw-p 00000000 00:00 0
2af4f8125000-2af4f828e000 r-xp 00000000 00:28 222406724 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f828e000-2af4f848d000 ---p 00169000 00:28 222406724 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f848d000-2af4f84e0000 r--p 00168000 00:28 222406724 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f84e0000-2af4f84e4000 rw-p 001bb000 00:28 222406724 /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f84e4000-2af4fb4f5000 rw-p 00000000 00:00 0
2af4fb4f5000-2af4fb5f6000 r-xp 00000000 fd:00 74177 /usr/lib64/libm-2.17.so
2af4fb5f6000-2af4fb7f5000 ---p 00101000 fd:00 74177 /usr/lib64/libm-2.17.so
2af4fb7f5000-2af4fb7f6000 r--p 00100000 fd:00 74177 /usr/lib64/libm-2.17.so
2af4fb7f6000-2af4fb7f7000 rw-p 00101000 fd:00 74177 /usr/lib64/libm-2.17.so
2af4fb7f7000-2af4fb8dc000 r-xp 00000000 00:28 175641870 /gpfs/share/software/intel/2013.1/composer_xe_2013_sp1.2.144/compiler/lib/intel64/libiomp5.so
2af4fb8dc000-2af4fbadc000 ---p 000e5000 00:28 175641870 /gpfs/share/software/intel/2013.1/composer_xe_2013_sp1.2.144/compiler/lib/intel64/libiomp5.so
2af4fbadc000-2af4fbae7000 rw-p 000e5000 00:28 175641870 /gpfs/share/software/intel/2013.1/composer_xe_2013_sp1.2.144/compiler/lib/intel64/libiomp5.so
2af4fbae7000-2af4fbb0f000 rw-p 00000000 00:00 0
2af4fbb0f000-2af4fbb26000 r-xp 00000000 fd:00 74195 /usr/lib64/libpthread-2.17.so
2af4fbb26000-2af4fbd25000 ---p 00017000 fd:00 74195 /usr/lib64/libpthread-2.17.so
2af4fbd25000-2af4fbd26000 r--p 00016000 fd:00 74195 /usr/lib64/libpthread-2.17.so
2af4fbd26000-2af4fbd27000 rw-p 00017000 fd:00 74195 /usr/lib64/libpthread-2.17.so
2af4fbd27000-2af4fbd2b000 rw-p 00000000 00:00 0
2af4fbd2b000-2af4fbeee000 r-xp 00000000 fd:00 66937 /usr/lib64/libc-2.17.so
2af4fbeee000-2af4fc0ee000 ---p 001c3000 fd:00 66937 /usr/lib64/libc-2.17.so
2af4fc0ee000-2af4fc0f2000 r--p 001c3000 fd:00 66937 /usr/lib64/libc-2.17.so
Name and Institution (Required)
Name: Bo Fu Institution: Peking University
Confirm you have reviewed the following documentation
Description of your issue or question
GCClassic version number: 14.0.1
Hello. I am running a CH4 simulation driven by GCAP meteorology data. However, I ran into an error (Error in `./gcclassic': double free or corruption (!prev)). NetCDF outputs such as GEOSChem.CH4.20050701_0000z.nc4 appear in OutputDir, but the time axis is 0. I did not change any source code or configuration files; I just downloaded the external data with download_data.py and submitted the job to the HPC of Peking University. I don't know why this bug occurs. Thanks for your support in advance!
geoschem_config.yml