Closed: gopikrishnangs44 closed this issue 1 month ago
I will transfer this issue to the GEOS-Chem "science codebase" repository. The GCClassic issue tracker is for issues pertaining to the GCClassic wrapper itself.
@yantosca Thank you for the response. I will wait for a solution.
Thanks @gopikrishnangs44. I think your job might have exceeded the available memory on the node. Are you using cropped met field data for the nested-grid simulation? That will reduce both memory and run time. See the Crop netCDF files Chapter on ReadTheDocs for more information.
Also, we have some information about [Segmentation faults and similar errors](https://geos-chem.readthedocs.io/en/latest/geos-chem-shared-docs/supplemental-guides/error-guide.html#segmentation-faults-and-similar-errors) on ReadTheDocs.
Could you attach the following to this issue?
Another way to reduce memory usage is to archive only the species that you need for diagnostics rather than all species. For example, if you only wanted to save out CO and O3 in the `SpeciesConc` collection, you can list the individual fields `SpeciesConcVV_CO`, `SpeciesConcVV_O3` instead of `SpeciesConc_?ADV?`, which would save out all advected species.
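As a rough sketch of what that could look like in `HISTORY.rc`, assuming the default layout of the `SpeciesConc` collection in a 14.x run directory (the other collection settings such as frequency and duration are unchanged and omitted here):

```
# Sketch only: archive two species instead of a wildcard list.
# Assumes the rest of the SpeciesConc collection definition is left as shipped.
  SpeciesConc.fields:         'SpeciesConcVV_CO',
                              'SpeciesConcVV_O3',
::
```

Fewer archived fields means fewer diagnostic arrays held in memory, which is the point of the suggestion above.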
I am trying to save the restart file only. I just tried a re-run. Now the error is
corrupted double-linked list
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x15555363eb4f in ???
#1 0x15555363eacf in ???
#2 0x155553611ea4 in ???
#3 0x15555367fcc6 in ???
#4 0x155553686fcb in ???
#5 0x15555368785b in ???
#6 0x155553688efa in ???
#7 0x5dcec8 in do_window_transport
at /burg/fiore_new/users/gg2995/test1/gc_05x0625_merra2_fullchem/CodeDir/src/GEOS-Chem/GeosCore/transport_mod.F90:578
#8 0x5dcec8 in __transport_mod_MOD_do_transport
at /burg/fiore_new/users/gg2995/test1/gc_05x0625_merra2_fullchem/CodeDir/src/GEOS-Chem/GeosCore/transport_mod.F90:220
#9 0x407a33 in geos_chem
at /burg/fiore_new/users/gg2995/test1/gc_05x0625_merra2_fullchem/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:1164
#10 0x405556 in main
at /burg/fiore_new/users/gg2995/test1/gc_05x0625_merra2_fullchem/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:32
real 145.72
user 545.23
sys 16.11
srun: error: g123: task 0: Exited with exit code 134
Attaching the files for your reference
Thanks @gopikrishnangs44. It very much seems like the second error described in this chapter on ReadTheDocs:
But I'm curious as you have the stacksize limits maxed out. What type of system are you using?
I have tried increasing the stack size as described at the link.
export OMP_NUM_THREADS=32
export F_UFMTENDIAN=big
export OMP_STACKSIZE=3000m
ulimit -s unlimited
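A quick way to confirm that these settings are in effect inside the job is to print them from the batch script just before the executable is launched. This is only a minimal sketch; the function name `report_stack_settings` is made up for illustration and is not part of GEOS-Chem or Slurm.

```shell
#!/usr/bin/env bash
# Report the stack-related settings visible to the current shell.
# report_stack_settings is a hypothetical helper name used for illustration.
report_stack_settings() {
  # Per-process stack limit as seen by this shell ("unlimited" or a size in kB)
  echo "stack limit (ulimit -s): $(ulimit -s)"
  # Per-thread stack size requested for OpenMP threads
  echo "OMP_STACKSIZE: ${OMP_STACKSIZE:-unset}"
  # Number of OpenMP threads requested
  echo "OMP_NUM_THREADS: ${OMP_NUM_THREADS:-unset}"
}

report_stack_settings
```

If the values printed in the job log differ from the ones in the environment file, the file was probably not sourced in the shell that launches the model.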
I am using the Ginsburg cluster at Columbia: https://columbiauniversity.atlassian.net/wiki/spaces/rcs/pages/62141888/Ginsburg+-+Technical+Information
PS: I uploaded the wrong files in my previous comment, which I have now edited.
I am also attaching the Slurm script and environment file for your reference: gc_spack.txt, sbatch_script.txt
Hi @yantosca,
does the Slurm script look okay for submitting the job?
@yantosca
I had set Bufferzone_NSEW to [1,1,1,1], but it should be [3,3,3,3]. The smaller buffer zone was breaking the TPCORE advection scheme for a global nested grid.
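For anyone hitting the same crash, a minimal sketch for checking the buffer zone before submitting a nested-grid run follows. It assumes that in a 14.x run directory the setting is stored in `geoschem_config.yml` on a single line of the form `buffer_zone_NSEW: [3, 3, 3, 3]`. The helper name `show_buffer_zone` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the buffer-zone setting from a GEOS-Chem configuration file.
# Assumptions: the key is spelled buffer_zone_NSEW and sits on one line;
# show_buffer_zone is a hypothetical helper name used for illustration.
show_buffer_zone() {
  local config_file="${1:-geoschem_config.yml}"
  # Find the setting and strip the leading YAML indentation
  grep -E '^[[:space:]]*buffer_zone_NSEW:' "$config_file" \
    | sed -E 's/^[[:space:]]+//'
}
```

Running `show_buffer_zone` from the run directory prints the line as configured, so a value such as `[1, 1, 1, 1]` can be spotted before the job is queued.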
Thank you for the responses.
Your name
Gopikrishnan
Your affiliation
Columbia University, NY
Please provide a clear and concise description of your question or discussion topic.
I am running GCClassic version 14.4.3. GEOS-Chem runs fine for the MERRA-2 4x5 simulation, and I saved the boundary conditions (BC) to the output directory.
The error occurs when I try to run the nested-grid version of the model. The time stepping begins and the model then stops in TPCORE.
Both my Slurm script and the environment file have:
And this is the memory requested on the shared system:
The Slurm output is:
Please see the issue.