Closed andrecastro0o closed 2 weeks ago
Here are some ideas, courtesy of ChatGPT (edited by me)
These datasets are collected directly at the location of interest. Examples include:
These datasets are collected from a distance, typically using satellites or ground-based remote sensors:
Datasets that include spatial information about the Earth's surface, sub-surface and atmosphere. Examples include:
Datasets that provide information about past climates using proxy data or historical records. Examples include:
Thank you @mschleiss and @chatgpt :laughing: This looks like quite a complete list, at a level of granularity that hits a sweet spot: detailed enough but not overly complex. I will let you know when it is implemented in the catalog!
By the way, I am not a big fan of the term "data provenance" and would suggest alternative names such as "data type" or "source".
I have included the top 7 types as possible values of a Dataset type.
If desired, the sub-types can be added later, but I think this is enough for now.
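For anyone consuming the catalog programmatically, here is a rough sketch of how such a fixed list of dataset types could be modelled as a controlled vocabulary. The labels are hypothetical, inferred from the category descriptions in this thread, and only four of the seven types are shown; `DatasetType` and `parse_dataset_type` are illustrative names, not part of the catalog.

```python
from enum import Enum


class DatasetType(Enum):
    # Hypothetical labels inferred from the category descriptions in this thread.
    # The catalog's actual values may differ, and only four of the seven are listed.
    IN_SITU = "In-situ observations"
    REMOTE_SENSING = "Remote sensing"
    GEOSPATIAL = "Geospatial"
    PALEOCLIMATE = "Paleoclimate and historical"


def parse_dataset_type(label: str) -> DatasetType:
    """Map a free-text label to a DatasetType, ignoring case and surrounding spaces."""
    wanted = label.strip().lower()
    for member in DatasetType:
        if member.value.lower() == wanted:
            return member
    allowed = ", ".join(m.value for m in DatasetType)
    raise ValueError(f"Unknown dataset type {label!r}; expected one of: {allowed}")
```

Keeping the list in one place like this means sub-types could later be added as extra members without changing how labels are validated.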
@mschleiss you mentioned yesterday some terms that could be used to describe generally the provenance or type of a dataset, for instance
I wanted to check with you (@mschleiss) and @fjansson whether you have more or other suggestions for terms that can describe a dataset's provenance. Especially on the model side of things, which I am very unfamiliar with, datasets might hold not only model output data but perhaps also "model training data" or similar.
Your input will help here.
And BTW, this list does not have to be exhaustive or final. We can add to it and change it as we go.
(@fjansson, for a bit more context: the catalog is still an early prototype, and I am still working on the schema and form interfaces, but here is an example dataset page to give you an idea: https://ruisdael-catalog.citg.tudelft.nl/index.php?title=Micro_Rain_Radar_(Metek)_at_Westmaas )