Open bcraig99 opened 2 months ago
Thanks for writing @bcraig99. The HEMCO/SAMPLE_BCs/GC_14.3.0/fullchem/ folder contains a single boundary condition file that we use for running integration tests on the nested-grid models. I believe it only contains 1 day of data (or maybe even less; I haven't checked it in a while). What is probably happening is that your simulation has moved beyond the last time in the boundary conditions file, and thus has thrown an error.
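To make the "moved beyond the last time in the file" failure concrete, here is a minimal sketch of the coverage check. It does not read any real file: `bc_times` and `first_uncovered_time` are hypothetical helpers, and the sample is assumed to hold one day of 3-hourly timestamps starting 2019-07-01, which is an illustration rather than a statement about the actual contents of the sample file.

```python
from datetime import datetime, timedelta


def bc_times(start, hours, step_hours=3):
    """Timestamps a boundary-condition file would hold, assuming even spacing."""
    count = hours // step_hours
    return [start + timedelta(hours=step_hours * i) for i in range(count)]


def first_uncovered_time(sim_start, sim_end, available, step_hours=3):
    """Return the first timestamp the run needs but the files lack, else None.

    Assumption: the run reads boundary conditions every `step_hours`,
    including at the final timestamp of the simulation.
    """
    have = set(available)
    t = sim_start
    while t <= sim_end:
        if t not in have:
            return t
        t += timedelta(hours=step_hours)
    return None


# Hypothetical one-day sample file versus a one-month simulation.
sample = bc_times(datetime(2019, 7, 1), hours=24)
print(first_uncovered_time(datetime(2019, 7, 1), datetime(2019, 8, 1), sample))
```

If the function returns a timestamp, that is roughly where a run relying on those files would be expected to stop.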
If you plan on doing a nested-grid simulation, you must first run a global simulation in order to save out boundary conditions (frequency: 3hrs, duration: 24hrs) that will be applied at the edges of your nested domain. We have instructions on how to do this on ReadTheDocs:
BTW, a Segmentation Fault means the program tried to access a memory element that does not exist. It can be a side-effect of exiting a simulation with an error. For more information about this and other types of errors, see our documentation at:
Thanks @yantosca! I can see where it went wrong. I'm new to GEOS-Chem, what is the difference between running a nested-grid model vs something else?
Thanks @bcraig99. A nested-grid model is when you run for a small window of the world in GEOS-Chem Classic, as opposed to a global simulation. With a nested-grid simulation you can run at very fine resolution (0.25 x 0.3125 degree or 0.5 x 0.625 degree). Because it is computationally intensive to run at fine resolution, the trade-off is to only run over a region of the globe that you are interested in.
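As a rough illustration of that trade-off, the sketch below counts horizontal grid columns for a given resolution and lat/lon box. `approx_cells` is a hypothetical helper and the regional box is made up for the example. The count is only approximate: it ignores how the model treats the poles, the number of vertical levels, and the shorter time steps that fine resolution usually needs.

```python
def approx_cells(lat_res, lon_res, lat_range=(-90.0, 90.0), lon_range=(-180.0, 180.0)):
    """Approximate number of horizontal grid columns covering a lat/lon box."""
    n_lat = round((lat_range[1] - lat_range[0]) / lat_res)
    n_lon = round((lon_range[1] - lon_range[0]) / lon_res)
    return n_lat * n_lon


global_coarse = approx_cells(4.0, 5.0)    # global 4 x 5 degree
global_fine = approx_cells(0.5, 0.625)    # global 0.5 x 0.625 degree
# Hypothetical regional window: 10N-70N, 140W-40W at 0.5 x 0.625 degree.
regional_fine = approx_cells(0.5, 0.625, lat_range=(10.0, 70.0), lon_range=(-140.0, -40.0))

print(global_coarse, global_fine, regional_fine)
print("fine/coarse global ratio:", global_fine / global_coarse)
```

Restricting the fine grid to a window keeps the column count much closer to the coarse global run, which is the reason for nesting.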
If you are new to GEOS-Chem, I would recommend reading through the https://geos-chem.readthedocs.io manual, since that goes into great detail about the options you can use with GEOS-Chem.
Thanks again and happy modeling!
I created a gc_4x5_merra2_fullchem simulation. I followed the instructions up to step 5 here https://geos-chem.readthedocs.io/en/latest/supplemental-guides/nested-grid-guide.html. After running the geoschem program I get this
Getting CH4 boundary conditions in GEOS-Chem from :NOAA_GMD_CH4
HEMCO (VOLCANO): Opening /path/to/ExtData/HEMCO/VOLCANO/v2024-04/2019/07/so2_volcanic_emissions_Carns.20190701.rc
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 85032 on node notch081 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Thanks for writing @bcraig99. You should not try to use mpirun to run GEOS-Chem Classic. That might be the cause of your error. mpirun would only be needed if you run GCHP.
What would you recommend using to run GEOS-Chem Classic?
Hi @bcraig99, what command are you using to run GEOS-Chem? Have you tried compiling with debug flags and enabling maximum prints to log? We have a debug guide on ReadTheDocs that goes over all the strategies to figure out what is going wrong. See https://geos-chem.readthedocs.io/en/stable/geos-chem-shared-docs/supplemental-guides/debug-guide.html.
mpirun -np 1 ./gcclassic | tee GC.log
I abandoned the fullchem model and started running just an aerosol model. The global simulation seemed to run fine for my boundary condition files, but my nested-grid simulation works until the last day of the simulation, where it throws the following error:
---> DATE: 2019/07/31 UTC: 23:55
- Creating file for Aerosols; reference = 20190701 000000
with filename = OutputDir/GEOSChem.Aerosols.20190701_0000z.nc4
- Creating file for AerosolMass; reference = 20190701 000000
with filename = OutputDir/GEOSChem.AerosolMass.20190701_0000z.nc4
- Creating file for SpeciesConc; reference = 20190701 000000
with filename = OutputDir/GEOSChem.SpeciesConc.20190701_0000z.nc4
- Creating file for Restart; reference = 20190801 000000
with filename = ./Restarts/GEOSChem.Restart.20190801_0000z.nc4
---> DATE: 2019/08/01 UTC: 00:00
GET_BOUNDARY_CONDITIONS: Done reading BCs at 2019/08/01 00:00 using 0 1
corrupted size vs. prev_size
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x7f3b2ed39171 in ???
#1 0x7f3b2ed38313 in ???
#2 0x7f3b2dd70b4f in ???
#3 0x7f3b2dd70acf in ???
#4 0x7f3b2dd43ea4 in ???
#5 0x7f3b2ddb1cd6 in ???
#6 0x7f3b2ddb8fdb in ???
#7 0x7f3b2ddb9885 in ???
#8 0x7f3b2ddbaf0a in ???
#9 0x90e940 in __phot_container_mod_MOD_cleanup_phot_container
at path/to/gc_05x0625_CU_merra2_aerosol/CodeDir/src/GEOS-Chem/Headers/phot_container_mod.F90:732
#10 0x8968fe in __state_chm_mod_MOD_cleanup_state_chm
at path/to/gc_05x0625_CU_merra2_aerosol/CodeDir/src/GEOS-Chem/Headers/state_chm_mod.F90:3087
#11 0x407863 in geos_chem
at path/to/gc_05x0625_CU_merra2_aerosol/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:1983
#12 0x404566 in main
at path/to/gc_05x0625_CU_merra2_aerosol/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:32
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 3205599 on node notch081 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
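For long logs like the one above, it can help to pull out the last model timestamp reached and any lines that look like a crash. The following is a minimal sketch: `summarize_log` is a hypothetical helper, the date pattern is based on the `---> DATE:` lines as they appear in this thread, and the hint strings are simply messages seen here, not an exhaustive list of GEOS-Chem errors.

```python
import re

# Matches lines such as "---> DATE: 2019/07/31 UTC: 23:55".
DATE_LINE = re.compile(r"--->\s*DATE:\s*(\d{4}/\d{2}/\d{2})\s+UTC:\s*(\d{2}:\d{2})")

# Strings taken from error messages in this thread (assumption: not exhaustive).
ERROR_HINTS = ("Segmentation fault", "SIGABRT", "corrupted size")


def summarize_log(text):
    """Return (last model timestamp seen, list of lines that look like errors)."""
    last_time = None
    hints = []
    for line in text.splitlines():
        match = DATE_LINE.search(line)
        if match:
            last_time = f"{match.group(1)} {match.group(2)}"
        if any(hint in line for hint in ERROR_HINTS):
            hints.append(line.strip())
    return last_time, hints


excerpt = """---> DATE: 2019/07/31 UTC: 23:55
---> DATE: 2019/08/01 UTC: 00:00
GET_BOUNDARY_CONDITIONS: Done reading BCs at 2019/08/01 00:00 using 0 1
corrupted size vs. prev_size
Program received signal SIGABRT: Process abort signal.
"""
print(summarize_log(excerpt))
```

In practice the text would come from the log file written by the run, for example via `open("GC.log").read()`.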
This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent this issue from closing.
Thanks @bcraig99. Sorry for the late reply.
There is a techy description of the corrupted size vs. prev_size error at this Stack Overflow post. TL;DR: It can be caused by an out-of-bounds error in an array that is being deallocated. The out-of-bounds write corrupts the memory allocator's bookkeeping data, which is detected when the array is freed and triggers the abort signal.
You can try reconfiguring with cmake -DCMAKE_BUILD_TYPE=Debug ...etc..., which will turn on array bounds checking (among other debug options). This will stop the run if an array goes out of bounds.
We would also suggest migrating to GEOS-Chem 14.5.0, which uses the most recent version of Cloud-J photolysis (as this is where the error was).
Your name
Broderik Craig
Your affiliation
University of Utah
Please provide a clear and concise description of your question or discussion topic.
I'm running a full chemistry simulation from 2018/12/01-2019/02/01 in longitude range -112.30457 -111.603284 and latitude range 39.97557 41.52831.
My log file contains the following errors:
and my slurm error log returns
and after checking the data download output I have many instances that look like this
I'm solidly stumped, any feedback is appreciated.