Open mankoff opened 3 years ago
Also, how should we handle the metadata collection? A combination of manual entries and more automated extractions through e.g. metadata extractors? Automation is nice, but we need to stick to the chosen convention.
I'd love to automate this, but we as a community are not providing the right metadata to do so. I think we'll need to do it manually. Given that we'll start with 5 or 25 datasets, that's easy to do. Even if we grow to a few hundred, manual entry is manageable at the dataset level. It would be nice if third parties provided enough file-level metadata that we could fetch, for example, specific velocity maps based on date or time, or Landsat scenes based on cloud cover %. But for now, just dataset-level stuff.
What do we want to track?
Maybe keywords from a controlled vocabulary, for example, "velocity" "biology" "ice" "ocean" "atmosphere" "temperature" etc.?
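If we go with a controlled vocabulary, manual entries could be checked with something as small as the sketch below. The vocabulary is just the example words from this comment, and `invalid_keywords` is a hypothetical helper name, not an existing tool.

```python
# Hypothetical controlled vocabulary: only the example keywords from this thread.
CONTROLLED_VOCABULARY = frozenset(
    {"velocity", "biology", "ice", "ocean", "atmosphere", "temperature"}
)


def invalid_keywords(keywords):
    """Return the keywords that are not in the controlled vocabulary.

    Matching ignores case and surrounding whitespace; input order is kept.
    """
    return [
        kw for kw in keywords
        if kw.strip().lower() not in CONTROLLED_VOCABULARY
    ]


# "glacier" is not in the vocabulary, so it is the only one reported.
print(invalid_keywords(["velocity", "Ice", "glacier"]))
```

Reporting the rejected keywords rather than silently dropping them keeps the manual workflow honest: whoever adds a dataset sees what needs fixing or what should be proposed as a new vocabulary term.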
Why not JSON? Or any other human-readable object notation. We could have nested objects like for the ROI: the name would be the standard field to add, then one could nest another object into it for the polygon.
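To make the nesting idea concrete, here is a rough sketch of what one dataset-level record might look like. All field names, the dataset name, and the coordinates are made up for illustration, and the GeoJSON-style polygon layout is my assumption, not something we have agreed on.

```python
import json

# Hypothetical dataset-level record; every name and value here is illustrative.
record = {
    "name": "example-velocity-dataset",
    "keywords": ["velocity", "ice"],
    "roi": {
        # The ROI name is the standard field...
        "name": "example-region",
        # ...and the polygon is an optional nested object (GeoJSON-style layout assumed).
        "polygon": {
            "type": "Polygon",
            "coordinates": [
                [
                    [-50.0, 68.0],
                    [-48.0, 68.0],
                    [-48.0, 70.0],
                    [-50.0, 70.0],
                    [-50.0, 68.0],  # ring closed by repeating the first point
                ]
            ],
        },
    },
}

# Serialize to human-readable JSON, then parse it back to check the round trip.
text = json.dumps(record, indent=2)
restored = json.loads(text)
print(text)
```

A record like this stays readable in a plain text editor, and a dataset without a polygon can simply omit the nested `polygon` object.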
DataLad supports metadata: http://docs.datalad.org/en/stable/metadata.html
Which format should we use? Pros and cons here...
For @jmlea16 and all.