NCAR / DART

Data Assimilation Research Testbed
https://dart.ucar.edu/
Apache License 2.0

CLM SIF forward operator can lead to NaN values during assimilation testing #664

Open braczka opened 2 months ago

braczka commented 2 months ago

Describe the bug

  1. List the steps someone needs to take to reproduce the bug.
    The Western US test case, based on Raczka et al. (2021), can produce NaN values of SIF during CLM-DART assimilation. Testing is being performed by @XueliHuo. The conditions that produce a NaN are very rare (1 instance in 2 years of simulation), and SIF is a diagnostic variable, but a NaN can still interfere with inflation and other components of the assimilation run. The SIF forward operator is external to DART and is included in SourceMods, based on a CTSM pull request. See DART issue #338 for more details on how the SIF forward operator was originally generated.

  2. What was the expected outcome? The model-generated SIF was expected to be real-valued (no NaN).

  3. What actually happened?
    A single grid cell produced a NaN value for SIF. This was traced back to the shaded SIF calculation, and further to a slightly negative (non-physical) degree of light saturation, which should always remain between 0 and 1.

Which model(s) are you working with?

CESM2.2

Version of DART

Which version of DART are you using? 11.0.1, which is well after the latest CLM-SIF changes.

Have you modified the DART code?

No -- but testing is being done with the external forward operator code, in which the degree of light saturation is forced to stay between 0 and 1.

Build information

Please describe:

  1. The machine you are running on (e.g. windows laptop, NSF NCAR supercomputer Derecho).
    CHPC at University of Utah (kingspeak/notchpeak)
  2. The compiler you are using (e.g. gnu, intel).
    intel

XueliHuo commented 2 months ago

Hi Brett, thanks for opening this issue. This is a very good summary. More details on the issue and the bug fix follow.

Issue: The 80-member SIF data assimilation ran from 2003-01-01 to 2010-12-31, an 8-year model simulation. The NaN SIF value occurred in the gridcell indexed (10,10) in ensemble member 41, and the pft within this gridcell that had the NaN SIF value was indexed 456.

Diagnosis: I added print statements to PhotosynthesisMod.F90 in SourceMods, where SIF is calculated. At timestep 99244, the values of x (degree of light saturation) in the subroutine fluorescence are negative, ranging from -8.53E-005 to -3.27E-005. Because log(x) is undefined for negative x, x_alpha becomes NaN, which in turn makes fs (fluorescence yield) NaN. These are the related equations used in the subroutine fluorescence to calculate fs:

    x = 1._r8 - ps / po0
    x_alpha = exp(log(x) * 2.83_r8)
    Kn = 2.48_r8 * (1.0_r8 + 0.114_r8) * x_alpha / (0.114_r8 + x_alpha)
    fm = Kf / (Kf + Kd + Kn)
    fs = fm * (1._r8 - ps)
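
Written as math, the same chain looks as follows. This is only a transcription of the code lines above, assuming the numeric constants combine as products; the symbols simply mirror the variable names.

```latex
\begin{aligned}
x        &= 1 - \frac{ps}{po0}, \\
x_\alpha &= \exp\!\left(2.83 \,\ln x\right) = x^{2.83}, \\
K_n      &= \frac{2.48\,(1 + 0.114)\,x_\alpha}{0.114 + x_\alpha}, \\
f_m      &= \frac{K_f}{K_f + K_d + K_n}, \\
f_s      &= f_m\,(1 - ps).
\end{aligned}
```

For the reported range of x (-8.53E-005 to -3.27E-005), ps/po0 lies between 1.0000327 and 1.0000853, so ps exceeds po0 by less than 0.01 percent. An overshoot that small is already enough to push x outside the domain of the logarithm.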

The values of ps/po0 that lead to the negative x at timestep 99244 can be checked in the attached log file: lnd_0041.log.523078.240404-170404.gz. The print statements use GetGlobalIndex, a CLM function, to print the values for the pft indexed 456.

Bug fix: Add a guard in the subroutine fluorescence to prevent x_alpha from becoming NaN:

    x = 1._r8 - ps / po0
    if (x <= 0._r8) then
       x_alpha = 0._r8
    else
       x_alpha = exp(log(x) * 2.83_r8)
    endif

This prevents the NaN SIF from occurring.
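
For anyone who wants to try the guard outside CLM, here is a minimal standalone sketch. It is not the code in PhotosynthesisMod.F90: the module name `sif_guard_sketch`, the function `guarded_x_alpha`, the local definition of `r8`, and the values of `ps` and `po0` are all hypothetical and chosen only for illustration.

```fortran
module sif_guard_sketch
  implicit none
  private
  public :: r8, guarded_x_alpha

  ! Double precision kind, standing in for CLM's r8 (an assumption of this sketch).
  integer, parameter :: r8 = selected_real_kind(12)

contains

  ! Returns x**2.83 for x > 0 and 0 otherwise, where x = 1 - ps/po0.
  pure function guarded_x_alpha(ps, po0) result(x_alpha)
    real(r8), intent(in) :: ps, po0
    real(r8) :: x_alpha
    real(r8) :: x

    x = 1._r8 - ps / po0          ! degree of light saturation
    if (x <= 0._r8) then
       x_alpha = 0._r8            ! avoid log() of a non-positive number
    else
       x_alpha = exp(log(x) * 2.83_r8)
    end if
  end function guarded_x_alpha

end module sif_guard_sketch

program demo_sif_guard
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
  use sif_guard_sketch, only: r8, guarded_x_alpha
  implicit none
  real(r8) :: po0, ps_ok, ps_bad, xa_ok, xa_bad, kn_bad

  ! Hypothetical inputs, not taken from the attached log.
  po0    = 0.8_r8
  ps_ok  = 0.4_r8        ! gives x = 0.5
  ps_bad = 0.80004_r8    ! gives x of about -5e-5, similar in size to the reported values

  xa_ok  = guarded_x_alpha(ps_ok, po0)
  xa_bad = guarded_x_alpha(ps_bad, po0)

  ! Same Kn expression as in the issue, evaluated for the guarded case.
  kn_bad = 2.48_r8 * (1.0_r8 + 0.114_r8) * xa_bad / (0.114_r8 + xa_bad)

  print *, 'x_alpha (x > 0):', xa_ok
  print *, 'x_alpha (x < 0):', xa_bad, ' Kn:', kn_bad
  print *, 'any NaN?', ieee_is_nan(xa_ok) .or. ieee_is_nan(xa_bad) .or. ieee_is_nan(kn_bad)
end program demo_sif_guard
```

With the guard, a negative x gives x_alpha = 0, so Kn = 0 and fm reduces to Kf / (Kf + Kd). In other words, a slightly negative light saturation is treated the same as x = 0 and nothing downstream sees a NaN.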

braczka commented 2 months ago

Additional information on this bug: although the NaN SIF value occurred while running CLM-DART, DART is unlikely to cause the bug (negative light saturation), though it can affect the timing of when it occurs. Open loop CLM-SIF simulations for the Western US domain also show rare occasions of negative SIF values. @XueliHuo will confirm whether the CLM open loop also runs without NaNs when the light saturation fix mentioned above is implemented. The bug is likely confined to the SIF forward operator, which is fully contained in the cesm2_2 SourceMods.

The cesm2_2_0 SourceMods, which contain the SIF forward operator, need to incorporate these changes once more testing is done.