We went over the current synoptic site data workflow (PowerPoint attached), looking at how data flows between levels (Raw, L0, L0_normalize, L1a).
Minor note: include original units in L0_normalize
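To make this concrete, a minimal sketch in Python of carrying the original values and units through L0_normalize alongside the converted ones; the column names, units, and conversion factors are illustrative assumptions, not the project's actual schema:

```python
import pandas as pd

# Hypothetical conversion table: original unit -> (standard unit, converter)
CONVERSIONS = {
    "psi":  ("kPa",  lambda x: x * 6.89476),
    "degF": ("degC", lambda x: (x - 32.0) * 5.0 / 9.0),
}

def normalize_units(l0: pd.DataFrame) -> pd.DataFrame:
    """Convert values to standard units, keeping the originals alongside."""
    out = l0.copy()
    out["value_orig"] = out["value"]   # preserve the unconverted value...
    out["units_orig"] = out["units"]   # ...and its original units
    for unit, (std_unit, convert) in CONVERSIONS.items():
        mask = out["units"] == unit
        out.loc[mask, "value"] = out.loc[mask, "value"].apply(convert)
        out.loc[mask, "units"] = std_unit
    return out

# Example: normalize_units(pd.DataFrame({"value": [14.7], "units": ["psi"]}))
# returns the value in kPa, with value_orig/units_orig still holding 14.7 psi.
```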
Talked about the idea of L1a using ‘output templates’ to generate its outputs (sketch below)
Concern from Roy: risk of a sin of omission; he wants to be able to QAQC all outputs in one place. Ben: we will also pass everything through in a single table (#36)
Do we like this approach enough to merge it and make it the default way forward? (#30)
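A minimal sketch of the ‘output templates’ idea, under the assumption that a template is simply an ordered list of expected columns; all names here are hypothetical. The single pass-through table (#36) rides along as one more output:

```python
import pandas as pd

# Hypothetical templates: output name -> ordered list of expected columns
TEMPLATES = {
    "sensor_a": ["timestamp", "site", "plot", "value", "units"],
    "sensor_b": ["timestamp", "site", "tree_id", "value", "units"],
}

def apply_template(data: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Shape data to a template: keep listed columns, add missing ones as NA."""
    out = pd.DataFrame(index=data.index)
    for col in cols:
        out[col] = data[col] if col in data.columns else pd.NA
    return out

def generate_outputs(data: pd.DataFrame) -> dict:
    """One table per template, plus everything in a single table (per #36)."""
    outputs = {name: apply_template(data, cols)
               for name, cols in TEMPLATES.items()}
    outputs["all"] = data.copy()  # single pass-through table for one-stop QAQC
    return outputs
```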
A lot of concern about what explicit guarantees are made for each data level, in particular L1a, which currently has ONLY out-of-bounds checking and nothing else (see the sketch below)
This is a complicated, tricky problem, and there are philosophical differences
Tension between doing more QAQC and adding more features versus making standardized data, even if imperfect and limited, available to users sooner
Opportunity cost of data not being available!
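For reference, the kind of out-of-bounds check L1a currently performs; the variable names and bounds below are assumptions for illustration:

```python
import pandas as pd

# Hypothetical plausibility bounds per variable
BOUNDS = {"temp_c": (-40.0, 60.0), "vwc": (0.0, 1.0)}

def flag_out_of_bounds(df: pd.DataFrame) -> pd.DataFrame:
    """Flag, rather than drop, values outside their plausibility bounds."""
    out = df.copy()
    out["oob_flag"] = False
    for var, (lo, hi) in BOUNDS.items():
        mask = out["variable"] == var
        out.loc[mask, "oob_flag"] = ~out.loc[mask, "value"].between(lo, hi)
    return out

# df = pd.DataFrame({"variable": ["vwc", "temp_c"], "value": [1.4, 21.0]})
# flag_out_of_bounds(df)["oob_flag"]  -> [True, False]
```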
How do we handle versioning?
Ben showed one approach based on hashes; Roy uses dates, which are more intelligible for users (sketch below)
Do this?
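The two versioning schemes discussed, sketched side by side; the file name and format details are made up for illustration:

```python
import hashlib
from datetime import date

def hash_version(path, length=8):
    """Content-based version ID: identical bytes give an identical ID."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()[:length]

def date_version(release_date=None):
    """Date-based version string: immediately intelligible to users."""
    return (release_date or date.today()).strftime("v%Y%m%d")

# One option would be to combine the two, e.g. "v20241105-3f9a1c2e":
# version = f"{date_version()}-{hash_version('some_output.csv')}"
```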
We could assign a globally unique row identifier to everything, letting QAQC be done later (example below)
Do this?
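A minimal sketch of the row-identifier idea, stamping every row at ingest so later QAQC results can be joined back to specific observations; the column name row_id is an assumption:

```python
import uuid
import pandas as pd

def add_row_ids(df: pd.DataFrame) -> pd.DataFrame:
    """Stamp every row with a globally unique, permanent identifier."""
    out = df.copy()
    out["row_id"] = [uuid.uuid4().hex for _ in range(len(out))]
    return out

# Downstream QAQC results could then live in a separate table keyed on
# row_id (e.g. columns row_id / test / pass) and be joined back later.
```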
Need to resolve how we handle metadata. Complicated, but hopefully also straightforward (#31)
Idea of a paper down the line; there’s also the ESS-DIVE community workshop in November