geoschem / geos-chem

GEOS-Chem "Science Codebase" repository. Contains GEOS-Chem science routines, run directory generation scripts, and interface code. This repository is used as a submodule within the GCClassic and GCHP wrappers, as well as in other modeling contexts (external ESMs).
http://geos-chem.org
Other
168 stars 165 forks source link

[BUG/ISSUE] Typo in CloudHet #906

Closed viral211 closed 3 years ago

viral211 commented 3 years ago

There is a typo in CloudHet in gckpp_Hetrates.F90. The line https://github.com/geoschem/geos-chem/blob/ba58da4a6e447e47714fe5e7da488d15432a6fdc/KPP/fullchem/gckpp_HetRates.F90#L1282 should be: ff = min( ff, 1.0e30_dp )

I have run it by Chris Holmes @cdholmes. I think this would change the benchmark output significantly.

yantosca commented 3 years ago

Thanks Viral. Also tagging @msl3v on this. We can try to get this into 13.3.0 and benchmark.

yantosca commented 3 years ago

This has been fixed with commit c6adf4b2f3772caa8d2dedc8acb9990bf03a3016. Awaiting PR review and merge.

yantosca commented 3 years ago

This has now been merged into the 13.3.0 development branch and tagged with 13.3.0-alpha.4.

yantosca commented 3 years ago

Hi @viral211 and @cdholmes. I merged this PR as 13.3.0-alpha.4 but @msulprizio and I noticed that it caused fullchem sims to choke on the 2nd chemistry timestep with a could not integrate error:

---> DATE: 2019/07/01  UTC: 00:20  X-HRS:      0.333333
 Forced exit from Rosenbrock due to the following error:
 --> Step size too small: T + 10*H = T or H < Roundoff
 T=   962.73884987080180      and H=   5.0591750291364415E-013
 ### INTEGRATE RETURNED ERROR AT:           57          39          31
 Forced exit from Rosenbrock due to the following error:
 --> Step size too small: T + 10*H = T or H < Roundoff
 T=   966.10882298003025      and H=   5.1668987654245023E-013
 ## INTEGRATE FAILED TWICE !!! 
===============================================================================
GEOS-CHEM ERROR: Integrator error code : -7
STOP at INTEGRATE_KPP
===============================================================================

I am wondering if that MIN value is even needed. In my new hetrates code I now use a SafeDiv function to set the result at 1e30 if the division cannot be done. So we may be able to get away with this:

    ! Ratio of volume inside to outside cloud
    ! ff has a range [0,+inf], so cap it at 1e30
    ff = SafeDiv( H%CldFr, H%ClearFr, 1.0e+30_dp )
yantosca commented 3 years ago

Also, this was not caught in integration tests because the integration tests only do 1 chemistry timestep, and stopped before the timestep of the error. We should probably extend integration tests to 1hr of simulation runtime.

cdholmes commented 3 years ago

Omitting the the MIN line, as you have done, should be fine. Does that fix the integration error?

yantosca commented 3 years ago

It does not fix the integration error, I just treid.

The original code in the old gckpp_HetRates.F90 was:

https://github.com/geoschem/geos-chem/blob/ba58da4a6e447e47714fe5e7da488d15432a6fdc/KPP/fullchem/gckpp_HetRates.F90#L1276-L1294

If you print out some of the values of ff, xx, and kHet, there are sometimes negative xx and negative kHet:

 @@@ ff, xx, khet:    3.5772598845170236E-004   1.7895701188219704E-004   1.9458553343091576E-016
 @@@ ff, xx, khet:   0.15213769712086028       -2.7886791897719920E-002  -3.4170700869882024E-006
 @@@ ff, xx, khet:    3.8540670748149521E-002  -7.5296433906717475E-003  -1.2375804544089306E-007

I would think that you wouldn't want to have negative reaction rates. Should these be flushed to zero?

yantosca commented 3 years ago

@cdholmes, flushing negative xx to zero gets us past the integration error:

    !------------------------------------------------------------------------
    ! Grid-average loss frequency
    !
    ! EXACT expression for entrainment-limited uptake
    !------------------------------------------------------------------------

    ! Ratio (in cloud) of heterogeneous loss to detrainment, s/s
    kk = kI * tauc

    ! Ratio of volume inside to outside cloud
    ! ff has a range [0,+inf], so cap it at 1e30
    ff = SafeDiv( H%CldFr, H%ClearFr, 1.0e+30_dp )
    !ff = MAX( ff, 1.0e+30_dp )
    ff = MIN( ff, 1.0e+30_dp )

    ! Ratio of mass inside to outside cloud
    ! xx has range [0,+inf], but ff is capped at 1e30, so this shouldn't overflow
    xx =     ( ff - kk - 1.0_dp ) / 2.0_dp +                                 &
         SQRT( 1e0_dp + ff**2 + kk**2 + 2*ff**2 + 2*kk**2 - 2*ff*kk ) / 2.0_dp

    ! Prevent negative xx and negative kHet (bmy, 9/27/21)
    xx = MAX( xx, 0.0_dp )

    ! Overall heterogeneous loss rate, grid average, 1/s
    ! kHet = kI * xx / ( 1d0 + xx )
    !  Since the expression ( xx / (1+xx) ) may behave badly when xx>>1,
    !  use the equivalent 1 / (1 + 1/x) with an upper bound on 1/x
    kHet = kI / ( 1.0_dp + SafeDiv( 1.0_dp, xx, 1.0e+30_dp ) )

    ! Overall loss rate in a particular reaction branch, 1/s
    kHet = kHet * branch
cdholmes commented 3 years ago
  1. As a temporary fix, you can flush xx and khet to zero.
  2. Could you also print out kI and fc in addition to ff, xx, khet? I want to look into this further. (Might be roundoff error and catastrophic cancellation in xx calculation.)
viral211 commented 3 years ago

Hi @yantosca. The calculation of xx in the code you just pasted is from an older version. It had an error that we had fixed sometime back. The file on GitHub has the correct calculation. It should be:

xx = ( ff- kk- 1.0_dp ) / 2.0_dp + sqrt( 1.0_fp + ff*ff + kk*kk + 2.0_dp*ff + 2.0_dp*kk - 2.0_dp*ff*kk ) / 2.0_dp

I wonder if that's he reason for negative values of xx.

yantosca commented 3 years ago

@cdholmes, here is printout from a few locations where xx and khet are zero:

 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    5.8539284960727474E-006
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :    2.1074142585861891E-002
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -9.5786521776447531E-003
 @@@ khet  :   -5.6615040719761695E-008
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    4.3832103667774089E-005
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :   0.15779557320398671     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -6.0017737611452016E-002
 @@@ khet  :   -2.7986737645513250E-006
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    8.6356085558678892E-007
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :    3.1088190801124402E-003
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -9.1027302341539773E-004
 @@@ khet  :   -7.8679234676646512E-010
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    3.9933645909351163E-005
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :   0.14376112527366419     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -5.6063335358846111E-002
 @@@ khet  :   -2.3717834750793083E-006
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    8.8885181978303368E-007
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :    3.1998665512189212E-003
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -9.5542402800685711E-004
 @@@ khet  :   -8.5004253702303078E-010
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    3.9933645909351163E-005
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :   0.14376112527366419     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -5.6063335358846111E-002
 @@@ khet  :   -2.3717834750793083E-006
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    3.9044794089568130E-005
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :   0.14056125872244526     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -5.5124085660023581E-002
 @@@ khet  :   -2.2778743127078089E-006
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    3.9044794089568130E-005
 @@@ fc    :    1.2736767530441284E-003
 @@@ 1-fc  :   0.99872632324695587     
 @@@ kk    :   0.14056125872244526     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.2753010743756928E-003
 @@@ xx    :   -5.5124085660023581E-002
 @@@ khet  :   -2.2778743127078089E-006
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    7.7444165929824418E-005
 @@@ fc    :   0.11744181811809540     
 @@@ 1-fc  :   0.88255818188190460     
 @@@ kk    :   0.27879899734736791     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :   0.13306977435489950     
 @@@ xx    :   -2.2385298085750183E-002
 @@@ khet  :   -1.7733067392980751E-006
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    4.3817520052903658E-005
 @@@ fc    :   0.10060337185859680     
 @@@ 1-fc  :   0.89939662814140320     
 @@@ kk    :   0.15774307219045317     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :   0.11185651436840825     
 @@@ xx    :   -4.0754996429750445E-003
 @@@ khet  :   -1.7930906134716327E-007
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    4.2738640622205602E-005
 @@@ fc    :   0.10060337185859680     
 @@@ 1-fc  :   0.89939662814140320     
 @@@ kk    :   0.15385910623994017     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :   0.11185651436840825     
 @@@ xx    :   -2.7993019017954479E-003
 @@@ khet  :   -1.1997420198567723E-007
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    4.2738640622205602E-005
 @@@ fc    :   0.10060337185859680     
 @@@ 1-fc  :   0.89939662814140320     
 @@@ kk    :   0.15385910623994017     
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :   0.11185651436840825     
 @@@ xx    :   -2.7993019017954479E-003
 @@@ khet  :   -1.1997420198567723E-007
 @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
 @@@ kI    :    8.3063899387568819E-006
 @@@ fc    :    1.3083992525935173E-002
 @@@ 1-fc  :   0.98691600747406483     
 @@@ kk    :    2.9903003779524776E-002
 @@@ tauc  :    3600.0000000000000     
 @@@ ff    :    1.3257452941129854E-002
 @@@ xx    :   -7.7188966513230817E-003
 @@@ khet  :   -6.4614921383144082E-008

I have a larger log file (75 MB) if you'd liek to see the whole thing.

yantosca commented 3 years ago

@viral211 thanks. I was sure that I had fixed that in the new code but I guess it fell thru the cracks. Sorry for the bother.

cdholmes commented 3 years ago

It looks like the bug in the xx definition was causing the negative values. I checked a few examples from the log excerpt that Bob sent. Can you confirm that the negatives are gone when xx is defined correctly?

yantosca commented 3 years ago

I was able to run a 1-day fullchem_benchmark run and I didn't get the integration error. With the negatives the run bombed out before 00:30. So I think we are OK now.

msulprizio commented 3 years ago

I ran an internal 1-month benchmark to evaluate the impact of these changes (tagged as 13.3.0-alpha.4). The benchmark plots and tables may be viewed at http://ftp.as.harvard.edu/gcgrid/geos-chem/validation/InternalBenchmarks/GCC_13.3.0-alpha.4/. The OH metrics are included below for quick review:

###############################################################################
### OH Metrics
### Ref = GCC_13.3.0-alpha.3; Dev = GCC_13.3.0-alpha.4
###############################################################################

------------------------------------------------------------
Global mass-weighted OH concentration [10^5 molec cm^-3]
------------------------------------------------------------
Ref      : 11.05722637197
Dev      : 11.79245633477
Abs diff :  0.73522996280
%   diff :  6.649316

------------------------------------------------------------
CH3CCl3 (aka MCF) lifetime w/r/t tropospheric OH [years]
------------------------------------------------------------
Ref      :  5.634232
Dev      :  5.246796
Abs diff : -0.387436
%   diff : -6.876466

------------------------------------------------------------
CH4 lifetime w/r/t tropospheric OH [years]
------------------------------------------------------------
Ref      :  9.487550
Dev      :  8.841421
Abs diff : -0.646129
%   diff : -6.810283

Changes are considerably large, as suspected by Viral in the original post above. I will close out this issue for now since this will officially be benchmarked in 13.3.0, but @viral211, @cdholmes, or others, please feel free to comment here on any of the changes to model output.

viral211 commented 3 years ago

Thanks @msulprizio. I was expecting a general increase in ozone because NOx loss on clouds is now much slower. The decrease in ozone over the southern ocean was unexpected. It is caused by the increase in Br, but I am not sure what drives it. I spoke with Xuan Wang and we suspect that it is because the loss of Br through the HOBr+S(IV) reaction in clouds is now slower (along with other HOBr cloud reactions) and more HOBr is available to react with HBr/Br-/Cl- on aerosols and release radicals. Xuan is looking into this further.