Clean inventory initialization files

JessicaNeedham commented 1 year ago

This PR address issue #1062 and removes unnecessary columns from the inventory initialisation files. As discussed in #1062 these extra columns are confusing to new users. As part of work on the NGEET FATES tutorial I want to simplify running with inventory data.

Specifically, I removed water, fsc (fast soil carbon), stsc (structural soil carbon), stsl (structural soil lignin), ssc (slow soil carbon), psc (passive soil carbon), msn (mineralized soil nitrogen), and fsn (fast soil nitrogen) from the patch file. From the cohort file I removed avgRG (average radial growth), balive (live biomass per individual of the cohort), bdead (the dead biomass per indiv of the cohort), cohort (unused string) and height.

I was a bit unsure of height - the code currently doesn’t use it and instead calculates height from DBH. It would be possible to update it to handle a case where the file only had height, and use the height to calculate DBH. But I’m not sure how many sites would have only height - maybe remote sensing data? I’ve removed it for now but could add it back in if people think it might be useful.

As suggested by @adrifoster I made a python script that takes an ED file (pss or css) and creates a new file in the new FATES format. The new file will have _fates appended to the file name.

Collaborators:

@adrifoster identified the issue and suggested a path forward. Discussed with @glemieux.

Expectation of Answer Changes:

This should be bfb in runs with inventory initialisation both on and off as the only information removed from the inventory files was not being used anyway.

Checklist

If this is your first time contributing, please read the CONTRIBUTING document.

All checklist items must be checked to enable merging this pull request:

Contributor

[X] The in-code documentation has been updated with descriptive comments
[X] The documentation has been assessed to determine if updates are necessary

Integrator

[x] FATES PASS/FAIL regression tests were run
[x] Evaluation of test results for answer changes was performed and results provided

Documentation

Technical Note update: no updates needed
User's Guide update: https://github.com/NGEET/fates-users-guide/pull/48

Test Results:

Test suite not run but code compiles and runs with inventory init on Perlmutter.

CTSM (or) E3SM (specify which) test hash-tag: b8be65d

CTSM (or) E3SM (specify which) baseline hash-tag:

FATES baseline hash-tag: a3048a6

Test Output:

JessicaNeedham commented 1 year ago

After discussion at the software meeting today I have added height back into the cohort file. It isn't currently used but might be useful in future cases where people use lidar to initialise a run.

JoshuaRady commented 1 year ago

@JessicaNeedham, I agrees with the decision to retain the height despite the potential for confusion (which is real). An additional reason to keep it is that small trees are sometimes measured by height.

JessicaNeedham commented 1 year ago

Thanks @rgknox for adding in functionality to initialise with a cohort file that has only height but no dbh measurements. Now either height or dbh needs to be negative in the cohort file, and the positive one will be used to calculate the other.

It works fine in my testing except in a specific case where I set dbh negative and all the height values to be identical. In that case the age, and the size structure of all the patches were equal, and somehow the sum of all patch areas was decreased during patch sorting (somewhere here) triggering fails in update_hlm_dynamics.

If people want to initialise a plantation with all trees the same size and all patches the same age, they could have one big patch. But I think FATES should handle a case where someone uses subplot info for patches and tree info for cohorts. I wonder about putting a check here, so that the original order in which patches are read in is used if patches are identical. The patches are fused right after the cohorts are fused, so it shouldn’t matter. Does that sound reasonable @rgknox and @mpaiao ?

JessicaNeedham commented 1 year ago

I ran some 10 year tests using BCI inventory initialisation data - both dbh, and then calculating height from dbh (outside of FATES) and using height to initialise. Also a 10 year run with all cohorts starting at 5 m hight. Results are here. They mostly look as expected except that the two runs using BCI data (from height and dbh) are slightly different. Maybe we wouldn't expect them to be bfb?

Screenshot 2023-11-02 at 16 53 57

JoshuaRady commented 1 year ago

I disagree with the requirement that only DBH or height be used and that DBH or height be negative as the way to select this behavior.

Inventory initialization should handle real inventories, which may contain DBH and/or heights for different trees. It is common for large trees to be missing heights and for small trees to be missing DBHs. The better behavior would be to use a preferred measure (probably DBH by default) and use the other when it is missing. A setting parameter could be provided to choose the primary measure. Otherwise one missing DBH/height in an entire dataset prevents its accurate import.

rgknox commented 1 year ago

@JoshuaRady , I believe we have it setup in the way you desire. Any plant can be described by dbh or height, it does not need to be consistent throughout the dataset. As long as, for each plant or cohort, one and only one of either dbh or height is positive. But for each plant or cohort in the dataeset, it must be one of the two, not both. My motivation for promoting a negative to specify a not-applicable instead of a separate field, is that we may want to use this dataset convention for large grids on HPC machines, so there would potentially be lots of reading and writing involved, and thus cut down on any absolutely unnecessary fields.

JoshuaRady commented 1 year ago

@rgknox, thanks for the clarification. That seems like a reasonable compromise. I was only considering the use of inventory initialization for single points but if it is to be used on larger scales that makes sense. The cost of adding the needed logic when converting inventory data into the FATES import format will be small so I can get on board with this.

glemieux commented 1 year ago

Regression testing on cheyenne is underway.

glemieux commented 1 year ago

All expected regression tests are returning B4B against fates-sci.1.68.3_api.31.0.0-ctsm5.1.dev153.

Results: /glade/u/home/glemieux/scratch/ctsm-tests/tests_pr1104-fates

NGEET / fates