lter / lterwg-som

Soil Organic Matter Synthesis working group
https://lter.github.io/som-website/

Aligning data #60

Closed wwieder closed 5 years ago

wwieder commented 5 years ago

This follows up on closed #8.

I have renewed terror about how best to handle datasets that need to be aligned.
CDR and HRV are two huge sites with lots of manipulations and datasets where this will need to happen (KNZ is similar, but to a lesser extent).

@srearl, you wrote a really slick script to handle this for NutNet (zipped up in that directory). Would similar pre-alignment of the raw data be preferable to trying to align it within the tarball?

piersond commented 5 years ago

How much will the summary tables for "variable n" by site help users with this? Or is this a larger problem for user workflow?

I still picture that we should only combine data rows if the data pertain to the exact same sample location and time, or if one of the variables is static. I'm uncertain how much of this we can accomplish, but I'm willing to try. Perhaps spending some time digging into the alignment intricacies of HRV is where to start?
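
As a rough illustration of that rule, here is a minimal Python sketch using only the standard library. The field names (`site`, `plot`, `date`, `soc`, `ph`, `clay`), the values, and the `align` helper are all hypothetical and are not taken from the SOM database or the NutNet script. Time-varying variables are joined only when site, plot, and date all match; a static variable is joined on site and plot alone.

```python
# Hypothetical records; field names and values are illustrative only.
soil_c = [
    {"site": "HRV", "plot": "A1", "date": "2015-06-01", "soc": 3.2},
    {"site": "HRV", "plot": "A2", "date": "2015-06-01", "soc": 2.7},
]
soil_ph = [
    {"site": "HRV", "plot": "A1", "date": "2015-06-01", "ph": 4.9},
    {"site": "HRV", "plot": "A2", "date": "2016-06-01", "ph": 5.1},  # different date
]
# Static variable: measured once per plot, so it carries no date.
texture = [
    {"site": "HRV", "plot": "A1", "clay": 12.0},
    {"site": "HRV", "plot": "A2", "clay": 15.5},
]


def align(left, right, keys):
    """Inner-join two lists of dicts on the given key fields."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    merged = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            merged.append({**match, **row})
    return merged


# Time-varying variables: require the same location AND the same time.
dynamic = align(soil_c, soil_ph, keys=("site", "plot", "date"))
# Static variable: matching on location alone is enough.
aligned = align(dynamic, texture, keys=("site", "plot"))
print(aligned)
```

Only plot A1 survives here, because the A2 measurements were taken on different dates. Whether such rows should be dropped or kept with missing values is a choice the sketch does not settle.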

wwieder commented 5 years ago

This may be facilitated by aggregating data up to site level.

wwieder commented 5 years ago

Can we develop a generic align-and-summarize function that's extensible? Maybe we provide an example of how this can be done and leave responsibility for the actual alignment with data users.
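
One possible shape for such a function, sketched in Python with only the standard library. The name `align_and_summarize`, its parameters, and the example fields and values are assumptions made for illustration, not an existing function in this repository. The caller chooses the aggregation level (`group_keys`) and supplies a summary function per variable, so the decisions stay with the data user.

```python
from collections import defaultdict
from statistics import mean, median


def align_and_summarize(rows, group_keys, summaries):
    """Group rows by `group_keys`, then summarize each requested variable.

    rows       : list of dicts, one per observation
    group_keys : tuple of field names defining the aggregation level
    summaries  : {variable: function(list_of_values) -> value}, user-supplied
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in group_keys)].append(row)

    out = []
    for key, members in groups.items():
        record = dict(zip(group_keys, key))
        for var, func in summaries.items():
            # Skip missing values; report how many observations were used.
            values = [m[var] for m in members if m.get(var) is not None]
            record[var] = func(values) if values else None
            record[var + "_n"] = len(values)
        out.append(record)
    return out


# Hypothetical observations; values are illustrative only.
rows = [
    {"site": "CDR", "plot": "1", "soc": 1.0, "ph": 5.5},
    {"site": "CDR", "plot": "2", "soc": 2.0, "ph": None},
    {"site": "KNZ", "plot": "1", "soc": 4.0, "ph": 6.5},
]
site_level = align_and_summarize(rows, ("site",), {"soc": mean, "ph": median})
print(site_level)
```

Passing `("site", "plot")` instead of `("site",)` would summarize at plot level, and any callable that accepts a list of values can replace `mean` or `median`.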

srearl commented 5 years ago

Closed pending input during the October meeting.