geoschem / geos-chem

GEOS-Chem "Science Codebase" repository. Contains GEOS-Chem science routines, run directory generation scripts, and interface code. This repository is used as a submodule within the GCClassic and GCHP wrappers, as well as in other modeling contexts (external ESMs).
http://geos-chem.org
Other
167 stars 162 forks source link

[BUG/ISSUE] volInorg < H2Oinorg in N2O5_InorgOrg for near-zero inorganics #263

Closed LiamBindle closed 4 years ago

LiamBindle commented 4 years ago

Report a GEOS-Chem bug or technical issue

Describe the bug

Recently I updated my stretched-grid test suite to GCHPctm 13.0.0-alpha.3 which uses GEOS-Chem 12.7.2. In my latest suite of tests about half of them crashed in N2O5_InorgOrg() because of floating point exceptions. These floating point exceptions where because H2Oinorg>volInorg (by a very small amount) which caused volDryRatio to be negative: https://github.com/geoschem/geos-chem/blob/c4b68994f9720f246f38a76489bff4913cf489ed/KPP/Standard/gckpp_HetRates.F90#L2127)

This led to a FPE a bit later in N2O5_InorOrg. I added some print lines to N2O5_InorgOrg() to inspect volInorg and H2Oinorg right before the model crashed. That output was

 ERROR: volRatioDry < 0
 <Total volume>
       volInorg:    3.3898305084745676E-023
 +     volOrg:      7.0299460609683461E-022
       ----------
 =     volTotal:    7.3689291118158030E-022

 <Total H2O>
       H2OInorg:    3.3898305084745681E-023
 +     H2Oorg:      1.0263650675763886E-022
       ----------
 =     H2Ototal:    1.3653481184238455E-022

 <Dry Inorganic>
                 volInorg:    3.3898305084745676E-023
 -               H2Oinorg:    3.3898305084745681E-023
               -----------------
 =  numerator_volRatioDry:   -5.8774717541114375E-039

So, volInorg and H2Oinorg are approximately the same but H2Oinorg is slightly bigger.

For this to be the case, this means when N2O5_InorgOrg() is called XH2O(8) (inorganic aerosol water content) > XVOL(8) (inorganic aerosol specific volume). These are calculated here: https://github.com/geoschem/geos-chem/blob/c4b68994f9720f246f38a76489bff4913cf489ed/KPP/Standard/gckpp_HetRates.F90#L598-L608

and AeroArea, AeroRadi, and AeroH2O for sulfate are calculated here: https://github.com/geoschem/geos-chem/blob/c4b68994f9720f246f38a76489bff4913cf489ed/GeosCore/aerosol_mod.F#L1998-L2043

I added some print lines after line 2043 in aerosol_mod.F to inspect VH2O, VDry, RW(1), REFF. Their values right before the model crashed were:

 VH2O=   3.3898305084745746E-023
 VDry=   4.8297717151586559E-039
 RW(1)=  0.10100000000000001     
 REFF=   19337.715529726152     

 SCALER=   191462.52999728863     
 SCALEVOL=   7018614353625229.0     
 ERADIUS(I,J,L,N+NDUST)=   1.9337715529726154     

 WAERSL(I,J,L,N)=   8.2106119157697154E-036
 MSDENS(N)=   1700.0000000000000 

So VDry is 16 orders of magnitude smaller than VH2O which causes REFF to be huge: https://github.com/geoschem/geos-chem/blob/c4b68994f9720f246f38a76489bff4913cf489ed/GeosCore/aerosol_mod.F#L2018-L2019

I tried to traceback the handling of near-zero inorganics, but I got confused. Does anyone know what might be happening here? It almost looks like a precision/truncation problem to me.

I found that if I forced volDryRatio to 0 or greater it fixed the floating-point-exceptions and all my simulations ran to completion. Does anyone have any ideas?

Required information

Please include the following:

Additional context

yantosca commented 4 years ago

Hi Liam. Thanks for writing.

This seems to be a classic instance of floating point error caused by the difference of two variables that are very close to each other.

 <Dry Inorganic>
                 volInorg:    3.3898305084745676E-023
 -               H2Oinorg:    3.3898305084745681E-023
               -----------------
 =  numerator_volRatioDry:   -5.8774717541114375E-039

For reference, here are a few articles about this subject:

  1. https://stackoverflow.com/questions/16281685/c-floating-point-subtraction-error-and-absolute-values
  2. https://stackoverflow.com/questions/25434292/c-float-subtraction-rounding-error
  3. https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

I think it might be OK to set the numerator to zero if it's smaller than e.g. 1e-32 or some other very small number. 1e-39 is essentially zero. Or you could test if the numerator is a certain number of orders of magnitude smaller than the smaller of the values.

LiamBindle commented 4 years ago

Thanks Bob for the help and suggestion! I'll implement a fix like the one you suggest and open a PR.

LiamBindle commented 4 years ago

Just following up on this. I talked with @emcduffie about this last Wednesday and she had some concerns regarding volDryRatio=0. Specifically, I think the concern is water uptake on the inorganic is non-zero despite the inorganic itself being approximately non-zero (or at least, it's a lot less "non-zero"). She suggested I follow up with Chris Holmes so we can determine if this input to N2O5_InorgOrg() is problematic, or whether this is input is valid and N2O5_InorgOrg() is missing some logic.

(@emcduffie, please correct me if I've misexplained anything)

msulprizio commented 4 years ago

@cdholmes Do you have any thoughts on how to resolve this issue?

cdholmes commented 4 years ago

See this comment in aerosol_mod.F https://github.com/geoschem/geos-chem/blob/c4b68994f9720f246f38a76489bff4913cf489ed/GeosCore/aerosol_mod.F#L1994-L1997

This explains why the ISORROPIA water content can be non-zero when there is effectively zero sulfate-nitrate-ammonium. Until someone can provide the optical properties of organic-coated sulfate-nitrate-ammonium-seasalt aerosol, we are stuck using external mixing in aerosol_mod.F. Roundoff error could also contribute, as mentioned above.

Solutions

  1. As suggested already, volDryRatio should not be less than 0.
  2. We should prevent REFF from becoming unreasonably large in aerosol_mod.F, although it's not clear if large values will cause any problems. I suggest setting a maximum value of REFF <= 4 * RW(1)
cdholmes commented 4 years ago

Addressed in pull request #272

LiamBindle commented 4 years ago

Thanks @cdholmes for the update

yantosca commented 4 years ago

I will close out this issue. Feel free to reopen.

cdholmes commented 4 years ago

@yantosca I still recommend integrating my pull request #272 into the GC code. The pull request fixes the negative value issue and excessive growth identified by Liam. There's still a (longstanding) science issue regarding aerosol mixing state, but that shouldn't stop us from fixing these bugs.

yantosca commented 4 years ago

Hi Chris, thanks for writing. Your pull request is slated for 12.9.0. It will go in very soon.