teixeirak closed this issue 2 years ago
OK, what do you think is best: looking at MEASUREMENTS or ForC_simplified (and its column suspected.duplicate)?
If we use the MEASUREMENTS table and there are potential duplicates that don't get resolved during review, the corresponding records won't make it through (same with ForC_simplified and suspected.duplicate).
(I'll do MEASUREMENTS for now)
Should be fine, although we may need to add a simple mechanism to bypass the duplicate flagging for studies that we've carefully reviewed.
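A minimal sketch of the bypass idea above, assuming a hand-maintained set of reviewed citation IDs. The set name, the example IDs, and the record keys `citation.ID` and `suspected.duplicate` are illustrative assumptions, not the actual ForC schema or scripts.

```python
# Hypothetical allow-list of studies whose duplicate flags were checked by hand.
REVIEWED_CITATIONS = {"Example_2001_abcd"}


def passes_duplicate_check(record, reviewed=REVIEWED_CITATIONS):
    """Return True if the record should be kept.

    A record is dropped only when it is flagged as a suspected duplicate
    AND its study has not been manually reviewed.
    """
    if record["citation.ID"] in reviewed:
        return True  # bypass: flags on reviewed studies are ignored
    return not record["suspected.duplicate"]


# Made-up records, for illustration only.
records = [
    {"citation.ID": "Example_2001_abcd", "suspected.duplicate": True},
    {"citation.ID": "Other_2010_wxyz", "suspected.duplicate": True},
    {"citation.ID": "Other_2010_wxyz", "suspected.duplicate": False},
]
kept = [r for r in records if passes_duplicate_check(r)]
```

Keeping the allow-list separate from the data would let the flags stay untouched while reviewed studies still pass through.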
Ok I pushed this file. Let me know if that works and if it is a good place for it.
I only penalized potential site duplicates, not potential measurement duplicates...
But I manually checked the top 10 citations and none have conflicts except Powers_2012_csaa, which has 2 records that are replicates, and Wang_2013_eofa, for which all records are replicates. But I think replicates are not as bad as duplicates and are "resolved" by taking the average... Which, now that I am saying that, is probably not good for IPCC's review process...
Thanks! That's a good start.
A score of -3549... YIKES! (That one may be fully replicated between original ForC and SRDB.)
One small modification: could you please list the variables, rather than just giving the count? Some will get higher priority than others.
done (ignoring if in C or OM units).
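One way this could look, as a rough sketch: collapse variable names that differ only by unit before listing them. It assumes the unit is encoded as a trailing `_C` or `_OM` on the variable name; the function names and example values are hypothetical.

```python
def base_variable(name):
    """Strip a trailing unit suffix so C and OM versions collapse to one name."""
    for suffix in ("_C", "_OM"):  # assumed suffix convention
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def list_variables(variable_names):
    """Return the sorted, de-duplicated base variable names as one string."""
    return "; ".join(sorted({base_variable(v) for v in variable_names}))


# Made-up input, for illustration only.
example = ["biomass_ag_C", "biomass_ag_OM", "NPP_C"]
summary = list_variables(example)
```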
Thanks! Let's leave it at this for now, but I'll keep this issue open because there's a good chance we'll want to adjust later.
We want to refine this system. Here are my notes on what I'd like to weight towards:
ForestGEO
Studies included in GROA
By region: tropics
By variable: major stocks and increments
I need to come back and provide details.
@ValentineHerr, here's the list above, edited (new criteria in bold):
provide.to.IPCC = 1.
EFDB.ready = 0.
(... review priority score)
@ValentineHerr, just a reminder to update the prioritization based on this when you have a chance. Not urgent.
@teixeirak, FYI, I don't see any delta.biomass_root, delta.deadwood, or delta.O.horizon in our data.
correct, but we want them prioritized when available.
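A hedged sketch of how variables could be prioritized even before they appear in the data: a lookup table awards points whenever a listed variable shows up and contributes nothing otherwise. The point values and the default are placeholders; the thread does not specify actual weights.

```python
# Hypothetical point values; the actual weights are not given in this thread.
PRIORITY_POINTS = {
    "delta.biomass_root": 5,
    "delta.deadwood": 5,
    "delta.O.horizon": 5,
}
DEFAULT_POINTS = 1  # assumed baseline for any other variable


def variable_score(variables):
    """Sum points over the distinct variables a study reports."""
    return sum(PRIORITY_POINTS.get(v, DEFAULT_POINTS) for v in set(variables))


# Made-up studies, for illustration only.
score_without = variable_score(["biomass_ag", "NPP"])
score_with = variable_score(["biomass_ag", "delta.deadwood"])
```

Because the table is keyed by name, adding or reweighting a variable later is a one-line change.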
@ValentineHerr,
Let's create a systematic method for identifying studies (citationID) to be reviewed for contribution to IPCC.
It would be helpful to generate a spreadsheet with the following fields: citationID, n potential records, variables represented, sites represented, review priority score, ready to rerun and send (to be filled manually). We'll probably want to tweak this list.
Requirements (all sites listed in the spreadsheet should meet this):
provide.to.IPCC = 1.
Prioritize based on: (assign points, put in review priority score)
This is just a rough start. Please modify as you see fit, and leave flexibility to change as we go.
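As a rough illustration of the spreadsheet described above, the sketch below groups records by citation and writes one row per study. The record keys (`citation.ID`, `site`, `variable`, `potential_site_duplicate`) and the scoring rule (record count minus a penalty per potential site duplicate) are assumptions for the example, not the scoring actually used.

```python
import csv
import io
from collections import defaultdict

FIELDS = ["citationID", "n potential records", "variables represented",
          "sites represented", "review priority score", "ready to rerun and send"]


def summarize(records, site_duplicate_penalty=1):
    """Build one summary row per citation from records with provide.to.IPCC == 1."""
    groups = defaultdict(list)
    for r in records:
        if r["provide.to.IPCC"] == 1:  # requirement from the list above
            groups[r["citation.ID"]].append(r)
    rows = []
    for cid, recs in groups.items():
        n_dup = sum(1 for r in recs if r["potential_site_duplicate"])
        rows.append({
            "citationID": cid,
            "n potential records": len(recs),
            "variables represented": "; ".join(sorted({r["variable"] for r in recs})),
            "sites represented": "; ".join(sorted({r["site"] for r in recs})),
            # Placeholder score: more records is better, duplicates are penalized.
            "review priority score": len(recs) - site_duplicate_penalty * n_dup,
            "ready to rerun and send": "",  # to be filled manually
        })
    rows.sort(key=lambda row: row["review priority score"], reverse=True)
    return rows


# Made-up records, for illustration only.
records = [
    {"citation.ID": "A_2001", "site": "S1", "variable": "biomass_ag",
     "provide.to.IPCC": 1, "potential_site_duplicate": False},
    {"citation.ID": "A_2001", "site": "S2", "variable": "NPP",
     "provide.to.IPCC": 1, "potential_site_duplicate": False},
    {"citation.ID": "B_2005", "site": "S3", "variable": "biomass_ag",
     "provide.to.IPCC": 1, "potential_site_duplicate": True},
    {"citation.ID": "C_2010", "site": "S4", "variable": "NPP",
     "provide.to.IPCC": 0, "potential_site_duplicate": False},
]
rows = summarize(records)

buffer = io.StringIO()  # swap for open("some_file.csv", "w", newline="") to write a file
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Keeping the penalty as a parameter and the score in one expression leaves room to change the weighting as the criteria evolve.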