geoschem / geos-chem

GEOS-Chem "Science Codebase" repository. Contains GEOS-Chem science routines, run directory generation scripts, and interface code. This repository is used as a submodule within the GCClassic and GCHP wrappers, as well as in other modeling contexts (external ESMs).
http://geos-chem.org

[question/bug] CH4 simulation with GCAP run #1734

Closed pkufubo closed 1 year ago

pkufubo commented 1 year ago

Name and Institution (Required)

Name: Bo Fu
Institution: Peking University

Description of your issue or question

GCClassic version number: 14.0.1

Hello. I am running a CH4 simulation driven by GCAP meteorology, and the run fails with this error: `Error in ./gcclassic': double free or corruption (!prev)`. NetCDF outputs such as GEOSChem.CH4.20050701_0000z.nc4 appear in OutputDir, but their time axis has length 0. I have not changed any source code or configuration files. I only downloaded ExtData with download_data.py and submitted the job to the HPC cluster at Peking University. I don't know what is causing this error. Thanks in advance for your support!


geoschem_config.yml

simulation:
  name: CH4
  start_date: [20050701, 000000]
  end_date: [20050801, 000000]
  root_data_dir: /gpfs/share/home/2106393205/ExtData
  met_field: ModelE2.1
  species_database_file: ./species_database.yml
  species_metadata_output_file: OutputDir/geoschem_species_metadata.yml
  debug_printout: true
  use_gcclassic_timers: false
GC.log
---> DATE: 2005/07/31  UTC: 23:50  X-HRS:    743.833313
     ### MAIN: a SET_CURRENT_TIME
     ### MAIN: a HEMCO PHASE 1
     ### MAIN: a INTERP, etc
     ### MAIN: a DO_TRANSPORT
     ### MAIN: a SETUP_WETSCAV
     ### MAIN: a COMPUTE_PBL_HEIGHT
     ### Species Unit Conversion: kg/kg dry -> v/v dry ###
     ### Species Unit Conversion: v/v dry -> kg/kg dry ###
     ### MAIN: a Compute_Sflx_For_Vdiff
     ### Species Unit Conversion: kg/kg dry -> v/v dry ###
     ### DO_PBL_MIX_2: after VDIFFDR
     ### DO_PBL_MIX_2: after AIRQNT
     ### Species Unit Conversion: v/v dry -> kg/kg dry ###
     ### Species Unit Conversion: kg/kg dry -> kg/m2 ###
     ### Species Unit Conversion: kg/m2 -> kg/kg dry ###
     ### MAIN: a TURBDAY:2
     ### MAIN: a CONVECTION
     - Updating collection: CH4
     - Updating collection: SpeciesConc
     - Updating collection: Metrics
     - Updating collection: Restart
     ### MAIN: after SET_ELAPSED_SEC
     ### MAIN: after Planeflight
     ### MAIN: after COPY_I3_FIELDS
     - Creating file for CH4; reference = 20050701 000000
        with filename = OutputDir/GEOSChem.CH4.20050701_0000z.nc4
     - Writing data to CH4; timestamp =        0.0000
     - Creating file for SpeciesConc; reference = 20050701 000000
        with filename = OutputDir/GEOSChem.SpeciesConc.20050701_0000z.nc4
     - Writing data to SpeciesConc; timestamp =        0.0000
     - Creating file for Metrics; reference = 20050701 000000
        with filename = OutputDir/GEOSChem.Metrics.20050701_0000z.nc4
     - Writing data to Metrics; timestamp =        0.0000
     - Creating file for Restart; reference = 20050801 000000
        with filename = ./Restarts/GEOSChem.Restart.20050801_0000z.nc4
     - Writing data to Restart; timestamp =        0.0000
---> DATE: 2005/08/01  UTC: 00:00  X-HRS:    744.000000
     ### MAIN: a SET_CURRENT_TIME
     ### MAIN: a HEMCO PHASE 1
*** Error in `./gcclassic': double free or corruption (!prev): 0x0000000013c959c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81679)[0x2b1c80c6a679]
./gcclassic(for_dealloc_allocatable+0xfc)[0xe0b97c]
./gcclassic[0xb43010]
./gcclassic[0x40d5ea]
./gcclassic[0x40af56]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b1c80c0b505]
./gcclassic[0x40ae59]
======= Memory map: ========
00400000-0104b000 r-xp 00000000 00:28 880203251                          /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
0124a000-0124d000 r--p 00c4a000 00:28 880203251                          /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
0124d000-01399000 rw-p 00c4d000 00:28 880203251                          /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
01399000-12453000 rw-p 00000000 00:00 0
12fcb000-27d9b000 rw-p 00000000 00:00 0                                  [heap]
2b1c7c799000-2b1c7c7bb000 r-xp 00000000 fd:00 66930                      /usr/lib64/ld-2.17.so
2b1c7c7bb000-2b1c7c7bd000 r-xp 00000000 00:00 0                          [vdso]
2b1c7c7bd000-2b1c7c8a2000 rw-p 00000000 00:00 0
2b1c7c8a2000-2b1c7c8a3000 ---p 00000000 00:00 0
2b1c7c8a3000-2b1c7c9b4000 rw-p 00000000 00:00 0
2b1c7c9ba000-2b1c7c9bb000 r--p 00021000 fd:00 66930                      /usr/lib64/ld-2.17.so
2b1c7c9bb000-2b1c7c9bc000 rw-p 00022000 fd:00 66930                      /usr/lib64/ld-2.17.so
----
pkufubo commented 1 year ago

My sbatch script:

#!/bin/bash
#SBATCH -J FB_GC
#SBATCH -p C032M0128G
#SBATCH --qos=high
#SBATCH -N 1
#SBATCH -n 32
#SBATCH -o log.%j
#SBATCH -e log.%j
#SBATCH --mail-type=END
#SBATCH --mail-user=pkufubo@pku.edu.cn
#SBATCH -o job.%j.out

ulimit -s unlimited
export OMP_STACKSIZE=1G
export OMP_NUM_THREADS=64
module purge
module load intel/2013.1
module load cmake/3.16.0
module load impi/2017.1

module load netcdf/c/4.6.1-intel-2013.1
module load netcdf/fortran/4.4.4-intel-2013.1
module load netcdf/4.4.1-intel-2013.0
module load hdf5/1.8.19-intel-2013.1

module load gcc/7.2.0

#modify by user
#srun -N 1 -n $OMP_NUM_THREADS -t 30-24:00 -p cpu_single ./gcclassic >> GC.log
mpirun -n 1 -env OMP_NUM_THREADS=64 ./gcclassic >> GC.log
yantosca commented 1 year ago

Thanks for writing @pkufubo. You should not use mpirun to run GEOS-Chem Classic; it is only needed if you are running GCHP. Use the srun command instead (the one you have listed above).

Also use #SBATCH -c 32 and export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK in the run script to set the number of cores properly.

yantosca commented 1 year ago

Also I am going to move this comment to geoschem/geos-chem, as this issue tracker is for issues pertaining to the GCClassic superproject wrapper.

pkufubo commented 1 year ago

Thank you for your response @yantosca. I have updated the run script following your suggestion.

#!/bin/bash
#SBATCH -J FB_GC
#SBATCH -p C032M0128G
#SBATCH --qos=high
#SBATCH -N 1
#SBATCH -c 32
#SBATCH -o log.%j
#SBATCH -e log.%j
#SBATCH --mail-type=END
#SBATCH --mail-user=pkufubo@pku.edu.cn
#SBATCH -o job.%j.out

ulimit -s unlimited
export OMP_STACKSIZE=1G
export OMP_NUM_THREADS=64
module purge
module load intel/2013.1
module load cmake/3.16.0
module load impi/2017.1

module load netcdf/c/4.6.1-intel-2013.1
module load netcdf/fortran/4.4.4-intel-2013.1
module load netcdf/4.4.1-intel-2013.0
module load hdf5/1.8.19-intel-2013.1

module load gcc/7.2.0
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

#modify by user
srun   ./gcclassic >> GC.log

Now there is no error in GC.log. However, the run does not seem to finish cleanly. The last several lines of GC.log are listed below:

---> DATE: 2005/07/31  UTC: 23:50  X-HRS:    743.833313
     ### MAIN: a SET_CURRENT_TIME
     ### MAIN: a HEMCO PHASE 1
     ### MAIN: a INTERP, etc
     ### MAIN: a DO_TRANSPORT
     ### MAIN: a SETUP_WETSCAV
     ### MAIN: a COMPUTE_PBL_HEIGHT
     ### Species Unit Conversion: kg/kg dry -> v/v dry ###
     ### Species Unit Conversion: v/v dry -> kg/kg dry ###
     ### MAIN: a Compute_Sflx_For_Vdiff
     ### Species Unit Conversion: kg/kg dry -> v/v dry ###
     ### DO_PBL_MIX_2: after VDIFFDR
     ### DO_PBL_MIX_2: after AIRQNT
     ### Species Unit Conversion: v/v dry -> kg/kg dry ###
     ### Species Unit Conversion: kg/kg dry -> kg/m2 ###
     ### Species Unit Conversion: kg/m2 -> kg/kg dry ###
     ### MAIN: a TURBDAY:2
     ### MAIN: a CONVECTION
     - Updating collection: CH4
     - Updating collection: SpeciesConc
     - Updating collection: Metrics
     - Updating collection: Restart
     ### MAIN: after SET_ELAPSED_SEC
     ### MAIN: after Planeflight
     ### MAIN: after COPY_I3_FIELDS
     - Creating file for CH4; reference = 20050701 000000
        with filename = OutputDir/GEOSChem.CH4.20050701_0000z.nc4
     - Writing data to CH4; timestamp =        0.0000
     - Creating file for SpeciesConc; reference = 20050701 000000
        with filename = OutputDir/GEOSChem.SpeciesConc.20050701_0000z.nc4
     - Writing data to SpeciesConc; timestamp =        0.0000
     - Creating file for Metrics; reference = 20050701 000000
        with filename = OutputDir/GEOSChem.Metrics.20050701_0000z.nc4
     - Writing data to Metrics; timestamp =        0.0000
     - Creating file for Restart; reference = 20050801 000000
        with filename = ./Restarts/GEOSChem.Restart.20050801_0000z.nc4
     - Writing data to Restart; timestamp =        0.0000
---> DATE: 2005/08/01  UTC: 00:00  X-HRS:    744.000000
     ### MAIN: a SET_CURRENT_TIME
     ### MAIN: a HEMCO PHASE 1
     ### MAIN: a CLOSE_FILES
"GC.log" 162546L, 7081465C 

And if I run metrics.py, it reports:

GEOS-Chem METHANE SIMULATION METRICS

Simulation start : 2005-07-01 00:00:00z
Simulation end   : 2005-08-01 00:00:00z
==============================================================================

Mass-weighted mean OH concentration    = nan x 10^5 molec cm-3

CH3CCl3 lifetime w/r/t tropospheric OH = nan years

CH4 lifetime w/r/t tropospheric OH     = nan years

CH4 total lifetime (full atmosphere)   = nan years

CH4 total lifetime (troposphere only)  = nan years
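
For readers debugging a similar run, the log tail above can be checked mechanically instead of by eye. The following is a minimal sketch, not part of GEOS-Chem or GCPy: the function name `summarize_log` and the choice of crash markers are assumptions, and the two patterns are taken only from the log excerpts quoted in this thread. It reports the last timestep header seen and any lines that look like the glibc crash message.

```python
import re

# Timestep header format as it appears in the GC.log excerpts in this thread.
TIMESTEP_RE = re.compile(r"^---> DATE: (\d{4}/\d{2}/\d{2})\s+UTC: (\d{2}:\d{2})")

# Crash markers copied from the error output quoted in this thread
# (an assumption: other failures will print different text).
CRASH_MARKERS = ("double free or corruption", "======= Backtrace:")


def summarize_log(lines):
    """Return (last timestep string or None, list of crash-marker lines)."""
    last_step = None
    crashes = []
    for line in lines:
        match = TIMESTEP_RE.match(line)
        if match:
            last_step = f"{match.group(1)} {match.group(2)}"
        if any(marker in line for marker in CRASH_MARKERS):
            crashes.append(line.strip())
    return last_step, crashes


# Sample lines copied from the log excerpts in this thread.
sample = [
    "---> DATE: 2005/07/31  UTC: 23:50  X-HRS:    743.833313",
    "     ### MAIN: a CONVECTION",
    "---> DATE: 2005/08/01  UTC: 00:00  X-HRS:    744.000000",
    "     ### MAIN: a HEMCO PHASE 1",
    "*** Error in `./gcclassic': double free or corruption (!prev): 0x0000000013c959c0 ***",
]
last_step, crashes = summarize_log(sample)
print("last timestep:", last_step)
print("crash lines found:", len(crashes))
```

To scan a real file, pass an open file handle, for example `summarize_log(open("GC.log", errors="replace"))`. Because the run script redirects only standard output to GC.log, the crash message may land in the Slurm output file instead, so it may be worth scanning both.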
pkufubo commented 1 year ago

cdo info GEOSChem.CH4.20050701_0000z.nc4

Warning (cdf_check_variables): Number of time steps undefined, skipped variable LossCH4inStrat!
Warning (cdf_check_variables): Number of time steps undefined, skipped variable LossCH4byOHinTrop!
Warning (cdf_check_variables): Number of time steps undefined, skipped variable LossCH4byClinTrop!
Warning (cdf_check_variables): Number of time steps undefined, skipped variable OHconcAfterChem!
    -1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter ID
     1 : 0000-00-00 00:00:00       0     3312       0 :  9.9692e+36  9.9692e+36  9.9692e+36 : -1
cdo    info: Processed 3312 values from 1 variable over 1 timestep [0.07s 22MB]
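
The "Number of time steps undefined" warnings are consistent with a file whose time axis holds no records. One way to confirm this without cdo is to inspect the header printed by `ncdump -h`, which reports the current length of an unlimited dimension in a comment. The sketch below parses that header text; it assumes the time axis is the file's unlimited (record) dimension, the helper name `unlimited_dim_length` is hypothetical, and the example header is illustrative rather than copied from the file in this thread.

```python
import re

# Matches the unlimited-dimension line that `ncdump -h` prints, e.g.
#     time = UNLIMITED ; // (0 currently)
UNLIMITED_RE = re.compile(
    r"^\s*(\w+)\s*=\s*UNLIMITED\s*;\s*//\s*\((\d+) currently\)",
    re.MULTILINE,
)


def unlimited_dim_length(header_text):
    """Return (name, length) of the unlimited dimension, or None if absent."""
    match = UNLIMITED_RE.search(header_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Illustrative header text; the dimension sizes here are made up.
example_header = """netcdf example {
dimensions:
\ttime = UNLIMITED ; // (0 currently)
\tlat = 46 ;
}
"""
print(unlimited_dim_length(example_header))
```

A length of 0 would mean no time slice was ever written to the file, which would also be consistent with metrics.py reporting nan for every quantity.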
*** Error in `/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic': double free or corruption (!prev): 0x00000000145cc9c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81679)[0x2af4fbdac679]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic(for_dealloc_allocatable+0xfc)[0xe0b97c]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0xb43010]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0x40d5ea]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0x40af56]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2af4fbd4d505]
/gpfs/share/home/2106393205/Output/GCAP_test/./gcclassic[0x40ae59]
======= Memory map: ========
00400000-0104b000 r-xp 00000000 00:28 880203251                          /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
0124a000-0124d000 r--p 00c4a000 00:28 880203251                          /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
0124d000-01399000 rw-p 00c4d000 00:28 880203251                          /gpfs/share/home/2106393205/Output/GCAP_test/gcclassic
01399000-12453000 rw-p 00000000 00:00 0
13902000-285e9000 rw-p 00000000 00:00 0                                  [heap]
2af4f78db000-2af4f78fd000 r-xp 00000000 fd:00 66930                      /usr/lib64/ld-2.17.so
2af4f78fd000-2af4f79e2000 rw-p 00000000 00:00 0
2af4f79e2000-2af4f79e3000 ---p 00000000 00:00 0
2af4f79e3000-2af4f7af4000 rw-p 00000000 00:00 0
2af4f7afc000-2af4f7afd000 r--p 00021000 fd:00 66930                      /usr/lib64/ld-2.17.so
2af4f7afd000-2af4f7afe000 rw-p 00022000 fd:00 66930                      /usr/lib64/ld-2.17.so
2af4f7afe000-2af4f7aff000 rw-p 00000000 00:00 0
2af4f7aff000-2af4f7bbd000 r-xp 00000000 00:28 222406936                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7bbd000-2af4f7dbc000 ---p 000be000 00:28 222406936                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7dbc000-2af4f7dbd000 r--p 000bd000 00:28 222406936                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7dbd000-2af4f7dbe000 rw-p 000be000 00:28 222406936                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdff.so.6.0.1
2af4f7dbe000-2af4f8125000 rw-p 00000000 00:00 0
2af4f8125000-2af4f828e000 r-xp 00000000 00:28 222406724                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f828e000-2af4f848d000 ---p 00169000 00:28 222406724                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f848d000-2af4f84e0000 r--p 00168000 00:28 222406724                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f84e0000-2af4f84e4000 rw-p 001bb000 00:28 222406724                  /gpfs/share/software/netcdf/4.4.1/intel/2013.0.1/lib/libnetcdf.so.11.0.3
2af4f84e4000-2af4fb4f5000 rw-p 00000000 00:00 0
2af4fb4f5000-2af4fb5f6000 r-xp 00000000 fd:00 74177                      /usr/lib64/libm-2.17.so
2af4fb5f6000-2af4fb7f5000 ---p 00101000 fd:00 74177                      /usr/lib64/libm-2.17.so
2af4fb7f5000-2af4fb7f6000 r--p 00100000 fd:00 74177                      /usr/lib64/libm-2.17.so
2af4fb7f6000-2af4fb7f7000 rw-p 00101000 fd:00 74177                      /usr/lib64/libm-2.17.so
2af4fb7f7000-2af4fb8dc000 r-xp 00000000 00:28 175641870                  /gpfs/share/software/intel/2013.1/composer_xe_2013_sp1.2.144/compiler/lib/intel64/libiomp5.so
2af4fb8dc000-2af4fbadc000 ---p 000e5000 00:28 175641870                  /gpfs/share/software/intel/2013.1/composer_xe_2013_sp1.2.144/compiler/lib/intel64/libiomp5.so
2af4fbadc000-2af4fbae7000 rw-p 000e5000 00:28 175641870                  /gpfs/share/software/intel/2013.1/composer_xe_2013_sp1.2.144/compiler/lib/intel64/libiomp5.so
2af4fbae7000-2af4fbb0f000 rw-p 00000000 00:00 0
2af4fbb0f000-2af4fbb26000 r-xp 00000000 fd:00 74195                      /usr/lib64/libpthread-2.17.so
2af4fbb26000-2af4fbd25000 ---p 00017000 fd:00 74195                      /usr/lib64/libpthread-2.17.so
2af4fbd25000-2af4fbd26000 r--p 00016000 fd:00 74195                      /usr/lib64/libpthread-2.17.so
2af4fbd26000-2af4fbd27000 rw-p 00017000 fd:00 74195                      /usr/lib64/libpthread-2.17.so
2af4fbd27000-2af4fbd2b000 rw-p 00000000 00:00 0
2af4fbd2b000-2af4fbeee000 r-xp 00000000 fd:00 66937                      /usr/lib64/libc-2.17.so
2af4fbeee000-2af4fc0ee000 ---p 001c3000 fd:00 66937                      /usr/lib64/libc-2.17.so
2af4fc0ee000-2af4fc0f2000 r--p 001c3000 fd:00 66937                      /usr/lib64/libc-2.17.so