geoschem / GCHP

The "superproject" wrapper repository for GCHP, the high-performance instance of the GEOS-Chem chemical-transport model.
https://gchp.readthedocs.io

Full Chemistry Simulation - Soft Error during file loading #380

Closed YvarVliex closed 4 months ago

YvarVliex commented 8 months ago

Name: Yvar Vliex Institution: Delft University of Technology

Dear GCST,

I am trying to run a full-chemistry simulation with GCHP 14.3.0 (I had the same issue with version 14.2.3), using just the basic emission inventories on the CS24. The simulation seems to run into problems while loading or reading some input files. The full verbose debug output shows the following two errors. These errors do not cause the simulation to fail immediately; instead, it hangs until the job's time limit expires (even when I request more than enough time for the job). All my symbolic links seem to point to the right locations and files, and I have the ExtData available. Do you have any suggestions as to where this problem comes from and how to solve it?

Thank you in advance!

log_error1

log_error2

Full output log: gchp.20190101_0000z.log

yantosca commented 8 months ago

Thanks for writing @YvarVliex. We have also seen this happen (see https://github.com/geoschem/MAPL/issues/30). This can be caused by an input file that does not strictly conform to the MAPL input requirements.

I am in the process of bringing in a fix that adds debug printout for each container (i.e. entry in ExtData.rc) that is being read. This can give you some insight into which file MAPL is hanging on. See https://github.com/geoschem/MAPL/pull/31. In the meantime, you can apply the fix shown in this PR, rebuild GCHP, and then set the following in logging.yml:

CAP.EXTDATA:
   level: DEBUG
   root-level: DEBUG

Once you find the container name on which GCHP is hanging, you can check the netCDF coordinates with e.g. ncdump -cts and then see if there are any non-conforming attributes. Then you can use utilities from NCO or CDO to fix the non-conforming attributes.

Also see our guides on ReadTheDocs:

lizziel commented 8 months ago

Hi @YvarVliex, could you check that your restart file is not corrupt? The log indicates there is an issue with the restart file since the error message first appears here:

 Character Resource Parameter: GCHPchem_INTERNAL_RESTART_FILE:gchp_restart.nc4
 Character Resource Parameter: MAPL_ENABLE_BOOTSTRAP:YES
 ESMF_StatePrint: (pet 0):
  State name: GCHPchem_INTERNAL
pe=00000 FAIL at line=04140    NCIO.F90                                 <unknown error>
pe=00000 FAIL at line=05954    MAPL_Generic.F90                         <status=-1>
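
When the full log is long, the first FAIL line can be located programmatically. The following is a minimal sketch, assuming every FAIL line follows the shape of the two lines quoted above; the regular expression is inferred from that excerpt alone, and other MAPL versions may format these lines differently. The function name `find_failures` is hypothetical.

```python
import re

# Pattern inferred from the two FAIL lines quoted in this thread (assumption:
# other MAPL versions may print these lines in a different layout).
FAIL_RE = re.compile(
    r"pe=(?P<pe>\d+)\s+FAIL at line=(?P<line>\d+)\s+(?P<file>\S+)\s+<(?P<detail>[^>]*)>"
)


def find_failures(log_text):
    """Return (pe, line, source_file, detail) tuples in order of appearance."""
    failures = []
    for raw in log_text.splitlines():
        match = FAIL_RE.search(raw)
        if match:
            failures.append((
                int(match.group("pe")),
                int(match.group("line")),
                match.group("file"),
                match.group("detail"),
            ))
    return failures


# Sample text copied from the log excerpt in this thread.
sample = """\
 Character Resource Parameter: GCHPchem_INTERNAL_RESTART_FILE:gchp_restart.nc4
pe=00000 FAIL at line=04140    NCIO.F90                                 <unknown error>
pe=00000 FAIL at line=05954    MAPL_Generic.F90                         <status=-1>
"""
failures = find_failures(sample)
```

For a real run, the same function can be given the contents of the full output log (for example, the text of gchp.20190101_0000z.log); the first tuple returned is the earliest failure, and the log lines just before it show what was being read.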

Line 4140 in NCIO.F90 indicates a problem checking the file type.

   ! Attempt to identify as fortran binary
   cwrd = transfer(TwoWords(1:4), irec)
   ! check if divisible by 4
   irec = cwrd/4
   filetype = irec
   if (cwrd /= 4*irec) then
      _RETURN(ESMF_FAILURE)
   end if

Where are you getting the restart file from?
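
A quick way to rule out a badly damaged restart file before launching a run is to look at its first bytes. The sketch below is not what NCIO.F90 does; it is a stand-alone check, assuming the file is expected to be netCDF, that compares the leading bytes against the standard signatures (HDF5 for netCDF-4, `CDF` plus a version byte for the classic formats). The function name `sniff_netcdf` and the demo file names are hypothetical. A correct signature does not prove the file is intact, since a truncated download keeps its header, but a wrong or missing one is a clear sign of a bad file.

```python
import os
import tempfile

# Leading bytes of the on-disk formats a netCDF file can use.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # netCDF-4 files are HDF5 containers
CLASSIC_MAGICS = (b"CDF\x01", b"CDF\x02", b"CDF\x05")  # classic, 64-bit offset, CDF-5


def sniff_netcdf(path):
    """Classify a file by its first bytes: 'netcdf4', 'netcdf3', 'empty' or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if not head:
        return "empty"
    if head == HDF5_MAGIC:
        return "netcdf4"
    if head[:4] in CLASSIC_MAGICS:
        return "netcdf3"
    return "unknown"


def _demo():
    """Write three small hypothetical files and classify each one."""
    payloads = {
        "good.nc4": HDF5_MAGIC + b"\x00" * 8,  # starts with the HDF5 signature
        "bad.nc4": b"<html>Not Found",         # e.g. an error page saved as data
        "empty.nc4": b"",                      # zero-byte download
    }
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, payload in payloads.items():
            path = os.path.join(tmp, name)
            with open(path, "wb") as handle:
                handle.write(payload)
            results[name] = sniff_netcdf(path)
    return results


demo_results = _demo()
```

Calling `sniff_netcdf("gchp_restart.nc4")` in the run directory applies the same check to the restart file named in the log excerpt above.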

YvarVliex commented 8 months ago

Dear @yantosca and @lizziel,

Thank you very much for your responses and suggestions! Indeed, the error seems to be in the restart file (which I downloaded from here http://geoschemdata.wustl.edu/ExtData/GEOSCHEM_RESTARTS/). Redownloading the restart files for version 14.3.0 seemed to at least solve this error such that GEOS-Chem now starts. I've also tried it for 14.2.3, where the error remains for the 20190101.c24 file but seems to be fixed for the 20190701.c24 file.
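
To confirm that a first download really differed from a fresh one, the two copies can be compared byte for byte. This is a minimal sketch using only the Python standard library; it assumes both copies are still on disk and does not rely on any published checksums for the restart files. The function names and the demo file names are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large restart files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True only if both files have the same size and the same SHA-256 digest."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return sha256_of(path_a) == sha256_of(path_b)


def _demo():
    """Compare an intact copy and a truncated copy against a reference file."""
    payload = b"restart-data" * 1000
    with tempfile.TemporaryDirectory() as tmp:
        reference = os.path.join(tmp, "fresh_download.nc4")
        intact = os.path.join(tmp, "intact_copy.nc4")
        truncated = os.path.join(tmp, "truncated_copy.nc4")
        for path, data in (
            (reference, payload),
            (intact, payload),
            (truncated, payload[: len(payload) // 2]),  # simulates a cut-off download
        ):
            with open(path, "wb") as handle:
                handle.write(data)
        return same_content(reference, intact), same_content(reference, truncated)


demo_result = _demo()
```

A mismatch between the old and the new copy would point to a corrupted or interrupted download rather than to a problem in the model itself.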

I'll continue working with GCHP version 14.3.0, for which the restart file (the 20190101.c24 one) seems to work, and try to fix the other issues that I have now.

lizziel commented 8 months ago

Hi @YvarVliex, this is strange since all of the restarts should work. We use the c24 ones for benchmarking (Jan 1 for 1-year and July 1 for 1-month). Are you hitting exactly the same error as your original error for the runs that are still having problems?

YvarVliex commented 8 months ago

Hi @lizziel, yes for the Jan 1 restart file for GCHP version 14.2.3 I still get the same error as initially. However, GCHP version 14.3.0 is working fine now.

lizziel commented 8 months ago

I am able to reproduce this issue when running GCHP 14.2.3 with the 01Jan2019 fullchem restart file stored in GEOSCHEM_RESTARTS/GC_14.2. Since 14.3.0 is working fine, I won't prioritize this right now, but I am going to keep the issue open in case there is time to look into it further.

lizziel commented 4 months ago

I will close this issue now as we are no longer supporting version 14.2.