Closed viral211 closed 3 years ago
Thanks Viral. Also tagging @msl3v on this. We can try to get this into 13.3.0 and benchmark.
This has been fixed with commit c6adf4b2f3772caa8d2dedc8acb9990bf03a3016. Awaiting PR review and merge.
This has now been merged into the 13.3.0 development branch and tagged with 13.3.0-alpha.4.
Hi @viral211 and @cdholmes. I merged this PR as 13.3.0-alpha.4 but @msulprizio and I noticed that it caused fullchem sims to choke on the 2nd chemistry timestep with a could not integrate error:
---> DATE: 2019/07/01 UTC: 00:20 X-HRS: 0.333333
Forced exit from Rosenbrock due to the following error:
--> Step size too small: T + 10*H = T or H < Roundoff
T= 962.73884987080180 and H= 5.0591750291364415E-013
### INTEGRATE RETURNED ERROR AT: 57 39 31
Forced exit from Rosenbrock due to the following error:
--> Step size too small: T + 10*H = T or H < Roundoff
T= 966.10882298003025 and H= 5.1668987654245023E-013
## INTEGRATE FAILED TWICE !!!
===============================================================================
GEOS-CHEM ERROR: Integrator error code : -7
STOP at INTEGRATE_KPP
===============================================================================
I am wondering if that MIN value is even needed. In my new hetrates code I now use a SafeDiv function to set the result at 1e30 if the division cannot be done. So we may be able to get away with this:
! Ratio of volume inside to outside cloud
! ff has a range [0,+inf], so cap it at 1e30
ff = SafeDiv( H%CldFr, H%ClearFr, 1.0e+30_dp )
Also, this was not caught in integration tests because the integration tests only do 1 chemistry timestep, and stopped before the timestep of the error. We should probably extend integration tests to 1hr of simulation runtime.
Omitting the the MIN line, as you have done, should be fine. Does that fix the integration error?
It does not fix the integration error, I just treid.
The original code in the old gckpp_HetRates.F90 was:
If you print out some of the values of ff, xx, and kHet, there are sometimes negative xx and negative kHet:
@@@ ff, xx, khet: 3.5772598845170236E-004 1.7895701188219704E-004 1.9458553343091576E-016
@@@ ff, xx, khet: 0.15213769712086028 -2.7886791897719920E-002 -3.4170700869882024E-006
@@@ ff, xx, khet: 3.8540670748149521E-002 -7.5296433906717475E-003 -1.2375804544089306E-007
I would think that you wouldn't want to have negative reaction rates. Should these be flushed to zero?
@cdholmes, flushing negative xx to zero gets us past the integration error:
!------------------------------------------------------------------------
! Grid-average loss frequency
!
! EXACT expression for entrainment-limited uptake
!------------------------------------------------------------------------
! Ratio (in cloud) of heterogeneous loss to detrainment, s/s
kk = kI * tauc
! Ratio of volume inside to outside cloud
! ff has a range [0,+inf], so cap it at 1e30
ff = SafeDiv( H%CldFr, H%ClearFr, 1.0e+30_dp )
!ff = MAX( ff, 1.0e+30_dp )
ff = MIN( ff, 1.0e+30_dp )
! Ratio of mass inside to outside cloud
! xx has range [0,+inf], but ff is capped at 1e30, so this shouldn't overflow
xx = ( ff - kk - 1.0_dp ) / 2.0_dp + &
SQRT( 1e0_dp + ff**2 + kk**2 + 2*ff**2 + 2*kk**2 - 2*ff*kk ) / 2.0_dp
! Prevent negative xx and negative kHet (bmy, 9/27/21)
xx = MAX( xx, 0.0_dp )
! Overall heterogeneous loss rate, grid average, 1/s
! kHet = kI * xx / ( 1d0 + xx )
! Since the expression ( xx / (1+xx) ) may behave badly when xx>>1,
! use the equivalent 1 / (1 + 1/x) with an upper bound on 1/x
kHet = kI / ( 1.0_dp + SafeDiv( 1.0_dp, xx, 1.0e+30_dp ) )
! Overall loss rate in a particular reaction branch, 1/s
kHet = kHet * branch
Hi @yantosca. The calculation of xx
in the code you just pasted is from an older version. It had an error that we had fixed sometime back. The file on GitHub has the correct calculation. It should be:
xx = ( ff- kk- 1.0_dp ) / 2.0_dp + sqrt( 1.0_fp + ff*ff + kk*kk + 2.0_dp*ff + 2.0_dp*kk - 2.0_dp*ff*kk ) / 2.0_dp
I wonder if that's he reason for negative values of xx.
@cdholmes, here is printout from a few locations where xx and khet are zero:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 5.8539284960727474E-006
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 2.1074142585861891E-002
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -9.5786521776447531E-003
@@@ khet : -5.6615040719761695E-008
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 4.3832103667774089E-005
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 0.15779557320398671
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -6.0017737611452016E-002
@@@ khet : -2.7986737645513250E-006
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 8.6356085558678892E-007
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 3.1088190801124402E-003
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -9.1027302341539773E-004
@@@ khet : -7.8679234676646512E-010
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 3.9933645909351163E-005
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 0.14376112527366419
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -5.6063335358846111E-002
@@@ khet : -2.3717834750793083E-006
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 8.8885181978303368E-007
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 3.1998665512189212E-003
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -9.5542402800685711E-004
@@@ khet : -8.5004253702303078E-010
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 3.9933645909351163E-005
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 0.14376112527366419
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -5.6063335358846111E-002
@@@ khet : -2.3717834750793083E-006
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 3.9044794089568130E-005
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 0.14056125872244526
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -5.5124085660023581E-002
@@@ khet : -2.2778743127078089E-006
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 3.9044794089568130E-005
@@@ fc : 1.2736767530441284E-003
@@@ 1-fc : 0.99872632324695587
@@@ kk : 0.14056125872244526
@@@ tauc : 3600.0000000000000
@@@ ff : 1.2753010743756928E-003
@@@ xx : -5.5124085660023581E-002
@@@ khet : -2.2778743127078089E-006
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 7.7444165929824418E-005
@@@ fc : 0.11744181811809540
@@@ 1-fc : 0.88255818188190460
@@@ kk : 0.27879899734736791
@@@ tauc : 3600.0000000000000
@@@ ff : 0.13306977435489950
@@@ xx : -2.2385298085750183E-002
@@@ khet : -1.7733067392980751E-006
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 4.3817520052903658E-005
@@@ fc : 0.10060337185859680
@@@ 1-fc : 0.89939662814140320
@@@ kk : 0.15774307219045317
@@@ tauc : 3600.0000000000000
@@@ ff : 0.11185651436840825
@@@ xx : -4.0754996429750445E-003
@@@ khet : -1.7930906134716327E-007
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 4.2738640622205602E-005
@@@ fc : 0.10060337185859680
@@@ 1-fc : 0.89939662814140320
@@@ kk : 0.15385910623994017
@@@ tauc : 3600.0000000000000
@@@ ff : 0.11185651436840825
@@@ xx : -2.7993019017954479E-003
@@@ khet : -1.1997420198567723E-007
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 4.2738640622205602E-005
@@@ fc : 0.10060337185859680
@@@ 1-fc : 0.89939662814140320
@@@ kk : 0.15385910623994017
@@@ tauc : 3600.0000000000000
@@@ ff : 0.11185651436840825
@@@ xx : -2.7993019017954479E-003
@@@ khet : -1.1997420198567723E-007
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@ kI : 8.3063899387568819E-006
@@@ fc : 1.3083992525935173E-002
@@@ 1-fc : 0.98691600747406483
@@@ kk : 2.9903003779524776E-002
@@@ tauc : 3600.0000000000000
@@@ ff : 1.3257452941129854E-002
@@@ xx : -7.7188966513230817E-003
@@@ khet : -6.4614921383144082E-008
I have a larger log file (75 MB) if you'd liek to see the whole thing.
@viral211 thanks. I was sure that I had fixed that in the new code but I guess it fell thru the cracks. Sorry for the bother.
It looks like the bug in the xx definition was causing the negative values. I checked a few examples from the log excerpt that Bob sent. Can you confirm that the negatives are gone when xx is defined correctly?
I was able to run a 1-day fullchem_benchmark run and I didn't get the integration error. With the negatives the run bombed out before 00:30. So I think we are OK now.
I ran an internal 1-month benchmark to evaluate the impact of these changes (tagged as 13.3.0-alpha.4
). The benchmark plots and tables may be viewed at http://ftp.as.harvard.edu/gcgrid/geos-chem/validation/InternalBenchmarks/GCC_13.3.0-alpha.4/. The OH metrics are included below for quick review:
###############################################################################
### OH Metrics
### Ref = GCC_13.3.0-alpha.3; Dev = GCC_13.3.0-alpha.4
###############################################################################
------------------------------------------------------------
Global mass-weighted OH concentration [10^5 molec cm^-3]
------------------------------------------------------------
Ref : 11.05722637197
Dev : 11.79245633477
Abs diff : 0.73522996280
% diff : 6.649316
------------------------------------------------------------
CH3CCl3 (aka MCF) lifetime w/r/t tropospheric OH [years]
------------------------------------------------------------
Ref : 5.634232
Dev : 5.246796
Abs diff : -0.387436
% diff : -6.876466
------------------------------------------------------------
CH4 lifetime w/r/t tropospheric OH [years]
------------------------------------------------------------
Ref : 9.487550
Dev : 8.841421
Abs diff : -0.646129
% diff : -6.810283
Changes are considerably large, as suspected by Viral in the original post above. I will close out this issue for now since this will officially be benchmarked in 13.3.0, but @viral211, @cdholmes, or others, please feel free to comment here on any of the changes to model output.
Thanks @msulprizio. I was expecting a general increase in ozone because NOx loss on clouds is now much slower. The decrease in ozone over the southern ocean was unexpected. It is caused by the increase in Br, but I am not sure what drives it. I spoke with Xuan Wang and we suspect that it is because the loss of Br through the HOBr+S(IV) reaction in clouds is now slower (along with other HOBr cloud reactions) and more HOBr is available to react with HBr/Br-/Cl- on aerosols and release radicals. Xuan is looking into this further.
There is a typo in CloudHet in gckpp_Hetrates.F90. The line https://github.com/geoschem/geos-chem/blob/ba58da4a6e447e47714fe5e7da488d15432a6fdc/KPP/fullchem/gckpp_HetRates.F90#L1282 should be:
ff = min( ff, 1.0e30_dp )
I have run it by Chris Holmes @cdholmes. I think this would change the benchmark output significantly.