fill in table of records of ForC variables relevant to, and sent to, EFDB

teixeirak commented 2 years ago

@ValentineHerr (and @mawilliams99 ),

I'd like to have this table (or similar) in the paper:

I've created a template here. Can we create a script to fill this in? (Perhaps better to include in the ForC repo so it's automatically updated?)

We'll probably want to delete the rows with zero records.

On second thought, I think I'd also like to rearrange the order of the variables.

ValentineHerr commented 2 years ago

I can work on this, unless you want to do it, @mawilliams99? Let me know.

@teixeirak, can you confirm/fill in the following:

n in ForC --> count what is in MEASUREMENTS (ignoring _C and _OM in variable name)
n independent --> count in ForC_simplified
n reviewed --> where to look?
n sent to EFDB --> I need to think of the best way to look at this.
n posted to EFDB --> where to look?
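The "n in ForC" rule above (count MEASUREMENTS records per variable, ignoring the _C and _OM endings) could be sketched roughly as follows. This is only an illustration: the sample variable names are made up, and the helper `base_variable` is hypothetical rather than anything in the ForC code.

```python
import re
from collections import Counter

# Made-up sample of the variable-name column from MEASUREMENTS (illustration only).
measurement_variables = [
    "biomass_ag_C", "biomass_ag_OM", "biomass_ag_C",
    "NPP_C", "deadwood_OM",
]

def base_variable(name):
    """Drop a trailing _C or _OM so both forms of a variable are counted together."""
    return re.sub(r"_(C|OM)$", "", name)

# One count per base variable name.
n_in_forc = Counter(base_variable(v) for v in measurement_variables)
```

The same function could be applied to ForC_simplified to get "n independent", assuming its variable names follow the same suffix convention.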

teixeirak commented 2 years ago

That's all correct.

For n reviewed, I think we can use the EFDB.ready field in citations, and then count everything flagged there as reviewed. @mawilliams99 , do you agree that it's fair to say everything that's been reviewed is flagged in EFDB.ready? If we feel this category is tough to get at accurately, we can drop it.
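A rough sketch of that "n reviewed" count, under stated assumptions: records link to citations through a `citation.ID` column, and any non-empty `EFDB.ready` value means "reviewed". Neither assumption is confirmed in this thread, and the IDs and rows below are placeholders.

```python
from collections import Counter

# Placeholder citations table; treating a non-empty EFDB.ready as "flagged" is an assumption.
citations = [
    {"citation.ID": "cit_A", "EFDB.ready": "1"},
    {"citation.ID": "cit_B", "EFDB.ready": ""},
]

# Placeholder records, e.g. rows of ForC_simplified.
records = [
    {"citation.ID": "cit_A", "variable": "biomass_ag"},
    {"citation.ID": "cit_A", "variable": "NPP"},
    {"citation.ID": "cit_B", "variable": "biomass_ag"},
]

# Citations flagged in EFDB.ready.
reviewed_ids = {c["citation.ID"] for c in citations if c["EFDB.ready"].strip()}

# Records per variable that come from a flagged citation.
n_reviewed = Counter(
    r["variable"] for r in records if r["citation.ID"] in reviewed_ids
)
```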

For n posted to EFDB, the easiest is probably to check with Valentyna as to the status of everything we've sent. Again, this category is not essential if it seems tricky to get at.

teixeirak commented 2 years ago

@ValentineHerr , I've updated the order of variables in this table.

I also deleted the n posted to EFDB column, as we agreed.

I do think we should probably delete rows with zero records, but they're all there for now.

teixeirak commented 2 years ago

@ValentineHerr , I realized there are some variables sent to IPCC that aren't yet in this table. I'll need to review and add those.

teixeirak commented 2 years ago

Okay, here are a few that aren't listed and how they should be handled:

ANPP_woody_stem and ANPP_woody_branch: group with ANPP_woody
ANPP_litterfall_1 and ANPP_litterfall_2: group as ANPP_litterfall
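The grouping listed above could be applied to per-variable counts along these lines. The mapping is taken from this comment; the function name `grouped_counts` and the example numbers are made up for illustration.

```python
from collections import Counter

# Grouping rules from the comment above; variables not listed keep their own name.
GROUPS = {
    "ANPP_woody_stem": "ANPP_woody",
    "ANPP_woody_branch": "ANPP_woody",
    "ANPP_litterfall_1": "ANPP_litterfall",
    "ANPP_litterfall_2": "ANPP_litterfall",
}

def grouped_counts(counts):
    """Sum record counts after mapping each variable to its group."""
    out = Counter()
    for variable, n in counts.items():
        out[GROUPS.get(variable, variable)] += n
    return out

# Made-up counts, for illustration only.
example = grouped_counts({
    "ANPP_woody_stem": 4,
    "ANPP_woody_branch": 2,
    "ANPP_woody": 1,
    "ANPP_litterfall_1": 5,
    "GPP": 3,
})
```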

I think that's it(?), but please let me know if I missed any variable with data.

ValentineHerr commented 2 years ago

for n reviewed, am I counting in ForC_simplified or MEASUREMENTS?

teixeirak commented 2 years ago

Good question! ForC_simplified, I think. Except that I just realized that we need to clarify something with Valentyna: we haven't sent all the records from some publications because they were deemed duplicates of records from other publications. So I'm not sure if we should be sending those (in which case the count would come from MEASUREMENTS, but might require some adjustment to the code).

teixeirak commented 2 years ago

but go with ForC_simplified for now

ValentineHerr commented 2 years ago

ok.

FYI, if I look at stocks that are "provide.to.IPCC" in the variable mapping document, the following 3 do not appear in your table:

total.ecosystem_2
biomass_ag_understory
soil

(there are other variables that are "provide.to.IPCC" and do not appear in your C_variables.csv document, but they are not stocks)

teixeirak commented 2 years ago

Soil = SOM / SOC.

The other two are variables that I don't like. (Total.ecosystem is rare, and I don't understand why IPCC wants that but not NEE. And understory biomass isn't very meaningful.) Have we sent any records of those? If so, I'll add them to the table.

ValentineHerr commented 2 years ago

ok, I'll prepare another file that lets us look up the variable name in ForC directly but pull a nicer-looking name for the paper. It will also handle the grouping and ordering of the variables, so C_variables.csv can be generated automatically and follow the order in that other file.
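One way such a lookup file could drive the table, as a sketch only: the column names (`forc_variable`, `paper_name`, `group`, `order`), the display names, and the functions `build_table` and `to_csv` are all assumptions for illustration, not the actual file layout.

```python
import csv
import io

# Hypothetical lookup file contents; column names and display names are assumptions.
LOOKUP_CSV = """forc_variable,paper_name,group,order
biomass_ag,Aboveground biomass,stocks,2
GPP,Gross primary productivity,fluxes,1
"""

def build_table(lookup_text, counts):
    """Return table rows in lookup order, using the paper-friendly names."""
    lookup = sorted(
        csv.DictReader(io.StringIO(lookup_text)),
        key=lambda row: int(row["order"]),
    )
    return [
        {
            "Variable": row["paper_name"],
            "Group": row["group"],
            "n in ForC": counts.get(row["forc_variable"], 0),
        }
        for row in lookup
    ]

def to_csv(rows):
    """Serialize rows to CSV text with no blank spacer lines."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Made-up counts, for illustration only.
rows = build_table(LOOKUP_CSV, {"biomass_ag": 10, "GPP": 4})
csv_text = to_csv(rows)
```

Keeping the order and display names in the lookup file means the table can be regenerated without editing the script when variables are rearranged.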

Also, in your RMD file, @teixeirak, what are the pieces of code in these lines that are specific to the rules of the journal? I would like to be able to save the csv file without the empty lines so it is easier to edit/fix, and then work on the formatting in RMD. Right now I have something that looks like this (which I know is not good yet):

teixeirak commented 2 years ago

I'm not sure what formatting would work for the journal. The original template didn't use RMD, and I contacted the template author for help with formatting more complex tables. He said a minimally formatted table should be okay. I'm not sure if packing rows (inserting headers over rows) will even knit. (This template is helpful but sometimes hard to get it to do what I want. That's why all table columns are currently equal width. :-) )

teixeirak commented 2 years ago

Got it; thanks!

One little thing-- could you please add the sum of all variables in the bottom row (total)?
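A minimal sketch of the total row, combined with the earlier idea of dropping rows that have zero records. The column names follow the list discussed earlier in the thread; the function `finalize` and the sample rows are made up for illustration.

```python
# Count columns as discussed earlier in the thread.
COUNT_COLUMNS = ["n in ForC", "n independent", "n reviewed", "n sent to EFDB"]

def finalize(rows, drop_empty=True):
    """Optionally drop all-zero rows, then append a 'total' row of column sums."""
    if drop_empty:
        rows = [r for r in rows if any(r[c] for c in COUNT_COLUMNS)]
    total = {"Variable": "total"}
    for column in COUNT_COLUMNS:
        total[column] = sum(r[column] for r in rows)
    return rows + [total]

# Made-up rows, for illustration only.
table = finalize([
    {"Variable": "GPP", "n in ForC": 5, "n independent": 4,
     "n reviewed": 2, "n sent to EFDB": 1},
    {"Variable": "NEE", "n in ForC": 0, "n independent": 0,
     "n reviewed": 0, "n sent to EFDB": 0},
    {"Variable": "biomass_ag", "n in ForC": 7, "n independent": 6,
     "n reviewed": 3, "n sent to EFDB": 3},
])
```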

ValentineHerr commented 2 years ago

oops sorry, I thought I did. I'll add that soon

forc-db / IPCC-EFDB-integration

fill in table of records of ForC variables relevant to, and sent to, EFDB #35