marco-bolo / dataset-catalogue

The index for MBO datasets
Creative Commons Zero v1.0 Universal

Full descriptions of variables #16

Open kmexter opened 11 months ago

kmexter commented 11 months ago

PLB sent via email a link to a dataset description JSON for EOVs: https://book.oceaninfohub.org/thematics/variables/index.html. This could also apply to measured variables that are not EOVs, for example the raw datasets leading to the EOVs.

Raising this issue because we need to decide how much of the metadata in that json we want to collect from MBO. I can see

Clearly there is a lot of useful metadata here, but we do need to work together to decide what to do for MBO. I would say that perhaps it is not necessary to decide this right now, before sending out the instructions to data providers in MBO to start describing their datasets. See issue #15, where we say that perhaps we can gather free-text info for now and tackle the standardisation later.
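To make the "free text now, standardise later" option concrete, here is a minimal sketch of what one variable description could look like. It uses schema.org `Dataset` / `PropertyValue` terms (`variableMeasured`, `measurementTechnique`, `propertyID`, `unitText`); it is not the OIH template itself, and the helper `describe_variable`, the dataset name and the example values are hypothetical.

```python
import json


def describe_variable(name, description, technique=None, property_id=None, unit=None):
    """Build a schema.org PropertyValue dict for one measured variable.

    Only name and description are required. The other fields are optional,
    so a provider can start with free text and add vocabulary terms later.
    """
    variable = {
        "@type": "PropertyValue",
        "name": name,
        "description": description,
    }
    optional = {
        "measurementTechnique": technique,  # free text, or a link to an SOP
        "propertyID": property_id,          # vocabulary term, once one is chosen
        "unitText": unit,
    }
    # Leave out anything the provider has not supplied.
    variable.update({key: value for key, value in optional.items() if value is not None})
    return variable


# Hypothetical dataset record with a single, free-text-only variable.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example MBO dataset (hypothetical)",
    "variableMeasured": [
        describe_variable(
            "Sea water temperature",
            "Free-text description supplied by the data provider.",
            technique="Free-text method description or link to an SOP",
        )
    ],
}

print(json.dumps(dataset, indent=2))
```

With this shape, standardising later only means filling in `propertyID` (and possibly `unitText`) for variables that already have a free-text description.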

pieterprovoost commented 11 months ago
kmexter commented 11 months ago

EOV specs sheet - how does that differ from an SOP, which you would put in "measurementTechnique", no?

Linking to sampling events would be good, but given that there have been dozens and dozens of these for the VLIZ data being described in that Google Sheet, I am not sure the VLIZ people would be happy to add this, mainly because of the effort involved. But we can make this optional?

pieterprovoost commented 11 months ago

Regarding links to samples, this overlaps with the provenance part, right?

kmexter commented 11 months ago

Yes, it does. Perhaps this is a part of the dataset description that we should tackle in Feb, during and after the GA?

kmexter commented 3 months ago

We have started working on this - adding provenance to our OIH metadata template - but most of the work will come later in 2024 and early 2025. Only when we have more info from the WPs on the parameters they are collecting can we usefully look for vocab terms for them (if not already provided). I have big doubts about how much prov info will be provided for the source datasets being used by the WPs, but we ought to give them templates in which to record what they are doing with these source data to calculate indicators.