Open annethomas opened 7 years ago
@mdietze Can you clarify the difference between the goal/outcome of the changes we've been talking about and @dlebauer 's suggestion of populating the standard names field with the table here from 2014-2016: https://docs.google.com/spreadsheets/d/1oEiDasdTslsm0VXFPWAUKS82BYbzXLyv2CYiicXMEys/edit#gid=0 ? Are we just wanting to do a more thorough overhaul of names?
standard_name is about reconciling BETY names with names from other standards, most often CF. My requests have all been about reconciling BETY names with PEcAn output names, and are primarily focused on how load_data works. I strongly oppose using the standard name field to solve both problems -- right now we've got genuine cases of 3-way conflicts and standard name can only resolve one of these. I've been saying very consistently for over a year that in load_data the precedence is PEcAn output standard > BETY variable name > CF. @bcow and @mccabete created a look-up table to resolve conflicts between PEcAn output & BETY, and as we go we need to resolve PEcAn / BETY conflicts in names & units (and the sooner we do this the better) so we can eventually deprecate the look-up table. variables$standard_name resolves conflicts between BETY & CF.
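One possible reading of that precedence, sketched in Python rather than in PEcAn's own code: each tier is checked in order and the first tier that recognizes the name wins. The function `resolve_name` and all four tables are hypothetical, and the table entries are placeholders (only the soilN / mass_fraction_of_nitrogen_in_soil pairing comes from this thread); this is not how load_data is actually implemented.

```python
# Hypothetical tables for illustration only; entries are placeholders,
# not the real PEcAn, BETY, or CF vocabularies.
PECAN_STANDARD = {"TotSoilCarb", "NPP"}          # PEcAn output standard names
BETY_TO_PECAN = {"SOC": "TotSoilCarb"}           # stand-in for the look-up table
BETY_VARIABLES = {"SOC", "soilN"}                # BETY variable names
CF_NAMES = {"mass_fraction_of_nitrogen_in_soil"} # CF standard names


def resolve_name(name):
    """Return (resolved_name, source) using PEcAn > BETY > CF precedence."""
    if name in PECAN_STANDARD:
        return name, "pecan"
    if name in BETY_TO_PECAN:
        # The look-up table maps a conflicting BETY name to its PEcAn name.
        return BETY_TO_PECAN[name], "pecan"
    if name in BETY_VARIABLES:
        return name, "bety"
    if name in CF_NAMES:
        return name, "cf"
    raise KeyError(f"unknown variable name: {name}")


if __name__ == "__main__":
    for candidate in ("SOC", "soilN", "mass_fraction_of_nitrogen_in_soil"):
        print(candidate, "->", resolve_name(candidate))
```

Keeping the look-up table as its own tier makes it easy to delete once the PEcAn / BETY conflicts are resolved at the source.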
Ok thanks. @mdietze @dlebauer @mccabete Here's my understanding for the context of this issue, see if it makes sense: So the table I started in the link at the top has a) variables that don't exist at all in either bety or pecan (a lot of them arising from the hierarchical pools framework @mccabete is putting together #1442 ), like overall litter_carbon_content , and b) bety variables that we want to add to the pecan standard (currently for the sake of standard IC inputs) but question the names/units, like LeafLitter and Microbial Biomass C. And as long as we're adding variables to the pecan output standard we thought we might change the names to CF style. Right? But then there's the broader issue of conflicts between existing output standard names and bety, which I know less about. Also @dlebauer it looks like the BETYdb to CF table and the one I've made (link in description) are mostly complementary; the only overlapping variables are soilN/soilP and SOC. The rest of the ones in my table as it stands are finer carbon pools and fluxes like litter and woody debris.
@annethomas I think that summary makes sense. Here's some feedback on some specific variables:
[surface/subsurface]_litter_carbon_content vs. [leaf/fine_root/fine_wood]_litter_carbon_content: there's a bit of ambiguity here -- one set of variables is defined by origin and the other by location, and currently the proposed hierarchy doesn't disambiguate these, which would leave most users pretty confused. For me I could see leaf litter and fine woody debris as nested within surface litter in the hierarchy, but they wouldn't necessarily sum to 100% (e.g. there's also reproductive litter). The next question would be whether subsurface and fine root litter are synonyms, or whether there are other subsurface litter components we're missing.
[fast/slow/structural]_soil_pool_carbon_content: Unlike @mccabete, I personally don't find the proposed addition of structural to be a conflict with CF, but rather just an extension. None of these are 'real' but they're pretty standard in CENTURY-style models so they'll be helpful output pool names as long as such models continue to be used. As we diversify the models considered we may need to acknowledge alternative hierarchies for SOC.
soil[N/P] vs soil_[nitrogen/phosphorus]_concentration: As proposed, the latter should be content, not concentration, same as with the soil C variables, and I personally don't see these as equivalent because of the units difference (per kg vs per m2, requires knowing bulk density and depth to move from one to the other). I think we can add the new variables without changing the old ones or conflicting with @dlebauer's proposed mass_fraction_of_nitrogen_in_soil (which is equivalent to soilN)
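To make the units point concrete, here is a minimal Python sketch of the concentration-to-content conversion, assuming a single soil layer with uniform bulk density. The function name and the example numbers are illustrative assumptions, not values from BETY or any dataset.

```python
def mass_fraction_to_content(mass_fraction, bulk_density, depth):
    """Convert a soil mass fraction to an areal content.

    mass_fraction: kg of element per kg of soil (kg/kg)
    bulk_density:  kg of soil per m3 (assumed uniform over the layer)
    depth:         layer thickness in m
    Returns kg of element per m2 of ground.
    """
    if bulk_density <= 0 or depth <= 0:
        raise ValueError("bulk_density and depth must be positive")
    return mass_fraction * bulk_density * depth


if __name__ == "__main__":
    # Illustrative numbers only: 0.002 kg N/kg soil, 1300 kg/m3, 0.3 m layer.
    print(mass_fraction_to_content(0.002, 1300.0, 0.3))
```

Without bulk density and depth the conversion cannot be done, which is why the two kinds of variable carry different information.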
@mdietze Thanks for the feedback.
litter: Is it useful to have specific pools for things like reproductive litter (would models use this) or is there a way to lump things so we could safely say it sums to 100%, like other_litter? I can see how fine root and subsurface litter could be synonymous, although I was wondering how you would distinguish subsurface/root litter from soil/living roots in the field. Especially since e.g. FIA and NEON don't collect that, just defining litter as surface materials. Do models use it though? @mccabete http://data.neonscience.org/api/v0/documents/NEON_litterfall_userGuide_vA
nitrogen/phosphorus: To clarify, we want to add soil_[nitrogen/phosphorus]_content in kg/m2 as new variables and leave soilN/soilP as is with @dlebauer 's proposed standard names?
litter: I don't think 'reproductive_litter' and 'other_litter' are mutually exclusive. Some models do have explicit seed production and dispersal, and that sort of data is not uncommon (e.g. this is exactly what Hannah's doing right now in our lab), but it should be noted that reproductive litter wouldn't be just seeds, but would also include flowers, cones, pollen, etc. This bit is implicit in some models (e.g. in ED2 there's a large fraction of undifferentiated reproductive biomass that goes straight to litter without distinguishing type, and then the remainder determines the density of new seedlings that year).
living/dead fine roots: while not always done, many people are capable of making this distinction observationally, especially when working with minirhizotron images. But perhaps our hierarchy of pools should acknowledge that not every dataset will distinguish living fine roots and dead fine root litter? FWIW, NEON was originally going to measure this, but it got descoped due to budget cuts.
N/P: yes, we should add new content (pool size) variables as these are complementary to, rather than redundant with, our current concentration variables.
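The litter bookkeeping discussed above (named components nested in surface litter that need not sum to 100%) could be sketched as follows in Python. The helper `other_litter` and the component names and values are hypothetical; the idea is only that a residual pool lets the named pools plus the remainder always add up to the total.

```python
def other_litter(surface_total, components, tol=1e-9):
    """Residual surface litter not covered by the named components.

    surface_total: total surface litter carbon content (e.g. kg C m-2)
    components:    dict of named sub-pools in the same units
    """
    residual = surface_total - sum(components.values())
    if residual < -tol:
        # Named pools exceeding the total signals inconsistent inputs.
        raise ValueError("named litter components exceed the surface total")
    return max(residual, 0.0)


if __name__ == "__main__":
    # Illustrative values only.
    pools = {"leaf": 0.60, "fine_wood": 0.25, "reproductive": 0.10}
    print(other_litter(1.0, pools))
```

A dataset that does not separate reproductive litter would simply omit that key, and its share would fall into the residual.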
@mdietze Questions:
@mdietze Tcan doesn't seem to be used in any model2netcdf or anywhere in the code (I checked some tables but feel like I need some psql refreshers to dig deeper). I can't find any easily accessible info about ED2's VegT except that it's AVG_VEG_TEMP in the model. Which one should we go with?
I'm fairly sure that ED2's AVG_VEG_TEMP is the leaf temperature and thus Tcan and VegT are equivalent
This issue is stale because it has been open 365 days with no activity.
There are several issues already discussing this goal, but hopefully this will serve as a touchpoint for moving forward on establishing a more comprehensive output standard for variable names/units. Linking to issues #1487 #1442 #1415 @mdietze @mccabete @tonygardella
Description
We would like to align input/output variable names more closely throughout PEcAn. The MsTMIP table is the beginning of this but there are a lot of variables not included in this standard, and the relevant variables in BETY aren't always consistent in names and units. The proposal is to have a unified table in code and documentation, probably by renaming the mstmip_var.csv table to something more general, and to add useful variables using CF naming standards. This may involve renaming some BETY variables if possible, or else using their standard name field.
Here is a link to a table of additional variables proposed so far, to review if you'd like to give input on the decision: https://docs.google.com/spreadsheets/d/1ETasC8Nc0zGBjzo-wAY_Tyhd-IqSLKlvq854ErVypic/edit?usp=sharing
Notes
1) Many of the new variables reflect the hierarchy of carbon pools/fluxes being worked on by @mccabete.
2) In the table so far, I've focused on input variables for pool-based models and some variables immediately related (like the litter fluxes) but this is not yet exhaustive. @mdietze Let me know if you'd like Tess and/or me to work on fleshing out the whole hierarchy in this table right away.
3) For litter pools and fluxes, I populated with existing CF names and then divided into the subpools (leaf, root etc) and added corresponding pools/fluxes. I'm not sure how many of them we actually will need to define.
4) Layer-based soil variables: In MsTMIP, TotSoilCarb has no layer-specific version so we want to add that (as an alternative to using CarbPools). Interestingly, SoilMoist only has a layer-specific version (unless SoilWet serves as the total, though it seems like a different measurement).
5) See the Google Sheet comments by each variable for more specific questions or conflicts.
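As a small illustration of the layer-based point in note 4, a layer-specific soil carbon variable could be reduced back to a TotSoilCarb-style total by summing over layers. This Python sketch assumes each layer value is already an areal content (e.g. kg C m-2 within that layer); the function name and the numbers are hypothetical.

```python
def total_soil_carbon(layer_contents):
    """Sum per-layer soil carbon contents (same areal units) into a total."""
    if any(value < 0 for value in layer_contents):
        raise ValueError("layer contents must be non-negative")
    return sum(layer_contents)


if __name__ == "__main__":
    # Illustrative three-layer profile, kg C m-2 per layer.
    print(total_soil_carbon([1.2, 0.8, 0.5]))
```

If the layer values were concentrations rather than contents, they would first need weighting by layer thickness and bulk density before summing.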