ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

Modified NEON surface datasets have errors #1429

Closed wwieder closed 2 years ago

wwieder commented 3 years ago

Brief summary of bug

It looks like a few sites that use the modified NEON surface dataset end up with errors in the soil sand, clay, and ORGANIC profiles.

General bug information

A quick look at some sites suggests this is an issue for HEAL, PUUM, TOOL, and maybe others. We don't even have met data for most of these sites, so it's likely not a big issue at this point, but it's something to be aware of as we start running more sites.

@negin513 confirmed this is not an issue with how we're handling zbedrock, but results from issues in the raw data from NEON.

Does this bug cause significantly incorrect results in the model's science? YES

@ekluzek I don't want to delay bringing in #1375 and #1411, so let's bring in these PRs and @negin513 can address this later.

Configurations affected: NEON tower simulations

negin513 commented 3 years ago

Will, thanks for bringing this to my attention. The missing data in the updated surface dataset are not related to zbedrock and are not caused by any bugs in the code. They come from missing data in the provided NEON surface dataset.

As we discussed earlier, you opened an issue about this on the NEON repo.


I am pasting part of the email I sent earlier here for future reference:

I've included all the updated information (PCT_CLAY, PCT_SAND, ORGANIC, and zbedrock) for each site in one plot.

The plots for all NEON sites are in the following shared folder: https://drive.google.com/drive/folders/1oUXdncv-2ilwpY-39L-HRkqm3gzJB2uP?usp=sharing

These plots reveal some issues with data from NEON sites:

For example, for the TOOL site the plots show that the values of PCT_CLAY, PCT_SAND, and ORGANIC do not exist for the first layers after updating. (Thanks to Will for bringing this to my attention.) At first glance, we thought these values were incorrect because of a possible bug in the zbedrock code; however, upon further investigation, we found that the code is fine and the problem lies in the data provided by NEON.

For example, in the NEON data for the TOOL site, values of clayTotal, sandTotal, and bulkDensExclCoarseFrag (needed for calculating ORGANIC) are missing for some of the layers.

negin513 commented 3 years ago

@wwieder: Since this does not seem to be caused by any bugs related to the code, should we close this issue and continue the discussion in the corresponding NEON issue?

wwieder commented 3 years ago

@negin513 let's leave this open to discuss how we want to handle this on the NCAR side. I modified the title of this issue to reflect what's wrong.

It seems like there are several things going on here:

  1. zbedrock errors, see #1446
  2. Modified calculations for ORGANIC, using the new field from NEON estimatedOC. The new calculation should be
  3. Missing values,
    • For missing surface soil texture data (mainly from organic soil horizons where NEON didn't collect soil texture data), I suggest we just extend the uppermost observations to the surface layers where data are missing.
    • There are other sites with missing data that occur mid-profile. For these we can either pick the data above / below that's not missing, or interpolate in between. This will likely require a more careful look.
    • Alternatively, I wonder what others think about @ekluzek's suggestion to fill in missing data from NEON with the default values from the orig. surface dataset
  4. Organic horizons, especially in some forest sites this gives really large ORGANIC surface soil layers (e.g. HARV or ORNL). Do we leave this feature, or replace O horizon data with A horizon data for consistency? I suggest we leave these data and evaluate impacts on moisture and temperature dynamics with the two surface datasets.
  5. Crazy values, especially at depth (e.g. ONAQ). These will take a case-by-case evaluation to identify potential issues with raw data (e.g. bulk density) in the data NEON are providing, but it's likely important to QC the NEON data.
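
A minimal sketch of the gap-filling options in item 3, for illustration only. It assumes each soil property is a 1-D profile ordered from the surface downward, with missing layers stored as NaN. The function names `fill_profile` and `fill_from_default` are hypothetical and are not part of CTSM or the NEON tools.

```python
import numpy as np


def fill_profile(values, depths=None):
    """Fill NaN gaps in a surface-to-depth soil profile.

    Gaps above the uppermost observation take that observation's value,
    gaps between observations are linearly interpolated, and gaps below
    the deepest observation take the deepest value.
    """
    filled = np.asarray(values, dtype=float).copy()
    # Assumption: interpolate against layer index unless depths are given.
    x = np.arange(filled.size) if depths is None else np.asarray(depths, dtype=float)
    valid = ~np.isnan(filled)
    if not valid.any():
        return filled  # nothing observed, nothing to extend
    # np.interp holds the first/last valid value constant outside the
    # observed range, which gives the "extend to the surface" behavior.
    filled[~valid] = np.interp(x[~valid], x[valid], filled[valid])
    return filled


def fill_from_default(values, default):
    """Alternative: replace missing layers with the default dataset's values."""
    values = np.asarray(values, dtype=float)
    default = np.asarray(default, dtype=float)
    return np.where(np.isnan(values), default, values)


pct_clay = [np.nan, np.nan, 10.0, np.nan, 20.0, np.nan]
print(fill_profile(pct_clay))
```

Interpolating against actual layer depths (the optional `depths` argument) is probably preferable to layer index when layers have uneven thickness.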

wwieder commented 3 years ago

This sheet has a summary of good, bad, and suspect data

dlawrenncar commented 3 years ago

Note that the O-horizon is 'applied' through organic matter content. Sand/silt/clay texture is generally always set for all layers, but will not be used if organic matter is at the maximum value.
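
The behavior described above can be pictured as a weighted blend of mineral and organic soil properties. The sketch below is an illustration of that idea, not CTSM's code: the linear weighting, the 130 kg m-3 maximum organic density, and the function names are all assumptions made for the example.

```python
ORGANIC_MAX = 130.0  # kg m-3; assumed maximum organic matter density


def organic_fraction(organic_density, organic_max=ORGANIC_MAX):
    """Fraction of the layer treated as organic, clamped to [0, 1]."""
    return min(max(organic_density / organic_max, 0.0), 1.0)


def blend_property(mineral_value, organic_value, organic_density):
    """Blend a texture-derived (mineral) property with its organic counterpart."""
    om_frac = organic_fraction(organic_density)
    return (1.0 - om_frac) * mineral_value + om_frac * organic_value


# At the maximum organic density the mineral (sand/clay-derived) value
# drops out entirely, which is why texture is "not used" in that case.
print(blend_property(mineral_value=0.4, organic_value=0.9, organic_density=130.0))
```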

On Fri, Jul 16, 2021 at 6:57 AM will wieder @.***> wrote:

@negin513 let's leave this open to discuss how we want to handle this on the NCAR side.

It seems like there are three issues here:

  1. Missing values, I wonder what others think about @ekluzek's suggestion to fill in missing data from NEON with the default values from the orig. surface dataset
  2. Organic horizons, especially in some forest sites this gives really large ORGANIC surface soil layers (e.g. HARV or ORNL). Do we leave this feature, or replace O horizon data with A horizon data for consistency? I suggest we leave these data and evaluate impacts on moisture and temperature dynamics with the two surface datasets.
  3. Crazy values, especially at depth (e.g. ONAQ). These will take a case-by-case evaluation to identify potential issues with raw data (e.g. bulk density) in the data NEON are providing, but it's likely important to QC the NEON data.

wwieder commented 3 years ago

Thanks for clarifying, @dlawrenncar. It doesn't look like the HWSD and NCSCD data used for ORGANIC typically included highly organic 'litter' layers in the model. Do you agree with suggestion 2 above, that we leave in the high organic values from NEON observations and evaluate simulated temperature and moisture profiles?

dlawrenncar commented 3 years ago

Yes. I agree with that idea to compare the two. When the organic matter being referred to is actually a litter layer, CTSM doesn't really have a mechanism to account for this presumably seasonally varying situation.

wwieder commented 3 years ago

More work is needed here to fill gaps in the data and correct the ORGANIC calculation.

wwieder commented 3 years ago

The notes on @jedwards4b's latest sheet with the status of sites still point to a surface data issue with KONA:

@negin513 and @danicalombardozzi any idea why this run may be failing?

wwieder commented 3 years ago

The other surface data issue is with STER, which prints the following error: "surfrd_veg_all ERROR: sum of wt_nat_patch not 1.0 at nl=1 sum is: 0". Is this supposed to be an ag site? If so, what do we need to do to correct this error?
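
A check along these lines could catch this kind of problem before a run. This is a hedged sketch only: it assumes the natural-vegetation PFT percentages have been read into an array with PFTs along the first axis and are expected to total 100 for every grid cell. The function name `bad_weight_sums` and the tolerance are hypothetical.

```python
import numpy as np


def bad_weight_sums(pct, tol=1e-6):
    """Return a boolean mask of grid cells whose PFT weights do not sum to 1.

    pct: array of percentages with shape (n_pft, n_cell).
    """
    weights = np.asarray(pct, dtype=float) / 100.0
    sums = weights.sum(axis=0)
    return np.abs(sums - 1.0) > tol


# Two PFTs, two cells: the first cell sums to 100 %, the second to 0 %.
pct_nat_pft = np.array([[100.0, 0.0],
                        [0.0, 0.0]])
print(bad_weight_sums(pct_nat_pft))
```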

wwieder commented 3 years ago

@negin513 I'll add this note here

wwieder commented 3 years ago

ekluzek commented 2 years ago

@negin513 and @wwieder I think all of these are taken care of by @negin513's contributions in ctsm5.1.dev067. Could you double-check and close this if so?

wwieder commented 2 years ago

Yes, it looks like the issues above were resolved by @negin513 in #1539.