geoschem / geos-chem

GEOS-Chem "Science Codebase" repository. Contains GEOS-Chem science routines, run directory generation scripts, and interface code. This repository is used as a submodule within the GCClassic and GCHP wrappers, as well as in other modeling contexts (external ESMs).
http://geos-chem.org

KPP integration error in 14.4.0 #2291

Closed ktravis213 closed 1 month ago

ktravis213 commented 6 months ago

Name and Institution (Required)

Name: Katie Travis Institution: NASA LaRC

Description of your issue or question

Hi support team. When I try to run 14.4.0-rc.0 at 0.25x0.3125 resolution to test it against KORUS-AQ observations, I get an immediate KPP convergence error. I changed RTOL to as low as RTOL = 0.5e-4_dp (see attached log file) and as high as 1e-02, and I still can’t get past the first 30 minutes of simulation time. Are there any other tricks for addressing this issue?

geoschem_config.yml.txt test4Q.log HEMCO_Config.rc.txt

yantosca commented 6 months ago

Hi @ktravis213, thanks for writing. Looking at your log file it seems like there are several species and reaction rates going negative. Here is what I would recommend:

  1. Turn off KORUS and KORUS_SHIP emissions, and see if you get past 30 minutes into the run without the error.
  2. If the simulation succeeds, then turn off KORUS_SHIP, and:
    a. Comment out all KORUS data entries in HEMCO_Config.rc except the first one.
    b. Turn on KORUS emissions and see if the simulation gets past 30 mins.
    c. Uncomment the next entry (or a group of a few entries if you want to speed it up).
    d. Repeat until you find the line that causes the error.
  3. Turn on KORUS_SHIP emissions and repeat steps a, b, c, d above.
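
For anyone who wants to script the search in steps a to d, here is a minimal sketch of the loop. It is only an illustration: `run_ok` is a hypothetical callback that you would have to write yourself (for example, rewrite HEMCO_Config.rc with only the listed entries uncommented, run 30 minutes, and check the log for the KPP error), and the entry names in the demo are made up.

```python
from typing import Callable, List, Optional, Sequence


def find_first_bad_entry(
    entries: Sequence[str],
    run_ok: Callable[[List[str]], bool],
    group_size: int = 1,
) -> Optional[str]:
    """Enable entries cumulatively; return the first one that breaks the run.

    `run_ok(enabled)` is a hypothetical user-supplied check that returns True
    when a short run with exactly `enabled` entries active finishes cleanly.
    """
    enabled: List[str] = []
    for start in range(0, len(entries), group_size):
        group = list(entries[start:start + group_size])
        if run_ok(enabled + group):
            # The whole group is fine, so keep it switched on and move on.
            enabled += group
            continue
        # The group failed: add its members one at a time to isolate the culprit.
        for entry in group:
            if not run_ok(enabled + [entry]):
                return entry
            enabled.append(entry)
    return None  # every entry could be enabled without an error


# Demo with a fake check in which the (made-up) entry "KORUS_D" is the bad one.
demo_entries = ["KORUS_A", "KORUS_B", "KORUS_C", "KORUS_D", "KORUS_E"]
culprit = find_first_bad_entry(
    demo_entries, lambda enabled: "KORUS_D" not in enabled, group_size=2
)
print("first failing entry:", culprit)
```

Using a larger `group_size` cuts the number of test runs when most entries are fine, at the cost of a few extra runs once a failing group is found.
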
ktravis213 commented 6 months ago

Hi @yantosca Unfortunately turning off KORUS and KORUS_SHIP did not work. Any other things I can try? test4Q.log

yantosca commented 6 months ago

Hi @ktravis213. I noticed this in your log file:

Min and Max of each species in restart file [mol/mol]:
===============================================================================
R E S T A R T   F I L E   I N P U T

Species   1,     ACET: Min = ***************  Max = 5.885225196E-09  Sum = 1.154265017E-03
Species   2,     ACTA: Min = ***************  Max = 1.056943155E-08  Sum = 1.052050211E-04
Species   3,     AERI: Min = ***************  Max = 1.420597667E-11  Sum = 6.413765163E-07
Species   4,     ALD2: Min = ***************  Max = 5.066523645E-09  Sum = 1.502113882E-04
Species   5,     ALK4: Min = ***************  Max = 9.273107615E-09  Sum = 2.015010396E-04
Species   6,   AONITA: Min = ***************  Max = 3.313867514E-10  Sum = 1.417772182E-05
Species   7,   AROMP4: Min = ***************  Max = 2.661012556E-11  Sum = 1.617183756E-07

From our 1-year benchmark simulation (e.g. 2019/09/01) we see:

===============================================================================
R E S T A R T   F I L E   I N P U T

Min and Max of each species in restart file [mol/mol]:
Species   1,     ACET: Min = 1.239827638E-24  Max = 7.150300796E-09  Sum = 5.773648445E-05
Species   2,     ACTA: Min = 2.193187342E-24  Max = 1.153883922E-08  Sum = 3.664986934E-06
Species   3,     AERI: Min = 6.434664190E-16  Max = 1.539834232E-11  Sum = 8.192326106E-08
Species   4,     ALD2: Min = 2.495752772E-25  Max = 7.401605995E-09  Sum = 6.675795703E-06
Species   5,     ALK4: Min = 0.000000000E+00  Max = 2.556451939E-08  Sum = 2.885505637E-05
Species   6,    ASOA1: Min = 2.649375491E-19  Max = 2.177687371E-11  Sum = 4.151596400E-08
Species   7,    ASOA2: Min = 7.523766664E-21  Max = 2.433376937E-11  Sum = 1.694504803E-08
Species   8,    ASOA3: Min = 3.683365151E-20  Max = 5.107953990E-11  Sum = 2.480202888E-08
Species   9,    ASOAN: Min = 1.022265260E-14  Max = 5.290937133E-11  Sum = 2.470934533E-07
Species  10,    ASOG1: Min = 6.106739620E-17  Max = 4.472621058E-12  Sum = 8.024762366E-08
Species  11,    ASOG2: Min = 2.587870611E-17  Max = 1.020796642E-11  Sum = 5.495506272E-08
Species  12,    ASOG3: Min = 5.554602781E-16  Max = 8.505734311E-11  Sum = 1.128259328E-06
Species  13,   AONITA: Min = 6.299955247E-16  Max = 2.234366164E-10  Sum = 2.397227661E-07
Species  14,   AROMP4: Min = 3.298336416E-36  Max = 1.244319925E-11  Sum = 9.599870898E-10

I wonder if you have small negative concentrations in the restart file. If that is the case, that might be causing the KPP convergence error. Can you open the restart file with a netCDF viewer (ncview, panoply) and check the min & max values of e.g. the ACET species?

If this is a nested simulation you might also want to take a quick look at the min & max values in your GEOSChem.BoundaryConditions*.nc4 files.
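
If a viewer is not handy, the same check can be scripted. The sketch below is an illustration rather than an official tool: it assumes the species are stored in variables prefixed with `SpeciesRst_` in the restart file (and `SpeciesBC_` in the boundary condition files), and that xarray is installed for reading the file. `load_fields` and `negative_species` are names made up for this example.

```python
import numpy as np


def negative_species(fields, prefix="SpeciesRst_"):
    """Return {species: minimum} for every prefixed field whose minimum is < 0.

    `fields` maps variable names to arrays. The default prefix is an
    assumption about how the restart file names its species variables.
    """
    report = {}
    for name, values in fields.items():
        if not name.startswith(prefix):
            continue  # skip coordinates and other non-species variables
        lowest = float(np.nanmin(np.asarray(values, dtype=float)))
        if lowest < 0.0:
            report[name[len(prefix):]] = lowest
    return report


def load_fields(path):
    """Read all data variables of a netCDF file into memory (needs xarray)."""
    import xarray as xr  # imported lazily so negative_species works without it

    with xr.open_dataset(path) as ds:
        return {name: ds[name].values for name in ds.data_vars}


# Demo on synthetic arrays; with a real file you would pass load_fields(path).
demo = {
    "SpeciesRst_ACET": np.array([1.2e-9, -3.0e-15, 5.9e-9]),
    "SpeciesRst_ALK4": np.array([0.0, 9.3e-9]),
    "lev": np.array([-1.0]),  # ignored: name does not carry the species prefix
}
for species, lowest in negative_species(demo).items():
    print(f"{species}: min = {lowest:.3e}")
```

For the boundary condition files, the same function could be called with `prefix="SpeciesBC_"`, again assuming that naming.
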

ktravis213 commented 6 months ago

@yantosca You are right, there are small negative concentrations in my restart file. How is that possible?

yantosca commented 6 months ago

@ktravis213, I wonder if they could have crept in during spinup. It could also be that there is an out-of-bounds or parallelization error somewhere in the code. Did you use an out-of-the-box version or did you add any updates?

ktravis213 commented 5 months ago

Hi @yantosca. I noticed that my restart file, which I regridded from 2x25, had negative values when I used cdo remapbic, but this went away when I used cdo remapcon. However, even with a new restart file and no negatives, I still get crazy chemistry results and a KPP error after two timesteps. See test4Q.log. However I turned off photolysis, and it ran fine. I tried switching to FASTJ instead of CLOUDJ though, and it also crashed (test4R.log). So I am still mystified by what could be going on, other than it is something to do with photolysis.
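
In case it helps others, here is a toy 1-D illustration of why a cubic interpolation can produce negatives while a conservative remap cannot. It is not what cdo does internally: I use a Catmull-Rom cubic as a stand-in for bicubic interpolation, and I assume the fine cells nest exactly inside the coarse cells, so first-order conservative refinement just copies the parent value.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Catmull-Rom cubic between p1 and p2 at fraction t in [0, 1]."""
    return 0.5 * (
        2.0 * p1
        + (p2 - p0) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t ** 2
        + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t ** 3
    )


def refine_cubic(coarse, factor):
    """Sample a cubic interpolant at `factor` points per coarse interval."""
    # Repeat the end values so every interval has four neighbours.
    padded = [coarse[0]] + list(coarse) + [coarse[-1]]
    fine = []
    for i in range(len(coarse) - 1):
        p0, p1, p2, p3 = padded[i:i + 4]
        for k in range(factor):
            fine.append(catmull_rom(p0, p1, p2, p3, k / factor))
    fine.append(coarse[-1])
    return fine


def refine_conservative(coarse, factor):
    """First-order conservative refinement on nested grids.

    Each fine cell overlaps exactly one coarse cell, so its area-weighted
    average is simply the parent's value and no new extrema can appear.
    """
    return [value for value in coarse for _ in range(factor)]


# A sharp gradient next to zero background, as for a short-lived species.
coarse = [0.0, 0.0, 0.0, 1.0e-9, 1.0e-9, 1.0e-9]
print("cubic min:       ", min(refine_cubic(coarse, 4)))
print("conservative min:", min(refine_conservative(coarse, 4)))
```

The cubic undershoots just before the jump, which matches the small negative minima I saw after remapbic, whereas the conservative result stays within the range of the input.
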

test4Q.log test4R.log

yantosca commented 5 months ago

Hi @ktravis213. Do you still get the same error if you don't regrid the restart file for the initial run? HEMCO should be able to regrid to 0.25 x 0.3125. Wondering if there is some weirdness in the regridding, even with remapcon.

There is a known issue with remapbic, see:

ktravis213 commented 5 months ago

Oh wow I did not know I didn't have to regrid the restart file myself. This is great to know! Unfortunately it still crashes though in the same manner. Would you be willing to try a 0.25x0.3125 simulation yourself and see if it also crashes for you? I am sure things are very busy before IGC11, we could also talk there.

yantosca commented 5 months ago

Hi @ktravis213, I could try to replicate your issue. Cannon is down for maintenance this week so I wouldn't be able to get to it for a few days.

ktravis213 commented 5 months ago

Hi @yantosca. I just want to add that a 2x25 simulation runs fine, so it is just a nested issue.

yantosca commented 5 months ago

Hi @ktravis213. Thanks for letting me know. At this point we are all preparing for IGC11, so if you come by during the office hours there we can have a look at your code.

wahababdul638 commented 5 months ago

The log files are from GEOS-Chem v14.3.1, and this is the same issue as my #2323. I hit this error in a global run on v14.2.3.

ktravis213 commented 4 months ago

@wahababdul638 and @yantosca I just thought I would try turning off photolysis of particulate nitrate, since the simulation ran with photolysis turned off entirely. And just turning off photolysis of particulate nitrate seems to resolve the KPP error. Not sure how to fix this, but at least the simulation runs!

yantosca commented 1 month ago

Closing out this issue