Full descriptions of variables

kmexter commented 11 months ago

PLB emailed a link to a dataset description JSON for EOVs: https://book.oceaninfohub.org/thematics/variables/index.html. The same structure could also apply to measured variables that are not EOVs, for example the raw datasets from which the EOVs are derived.

Raising this issue because we need to decide how much of the metadata in that JSON we want to collect from MBO. I can see:

- variable URL ("propertyID") -> from which one can normally get the name and description as well
- variable value and unit ("value" and "unitCode") -> do we want to collect these as well?
- "measurementTechnique" -> ideally a link to an SOP. My question here is how to deal with people who say that the SOP is not available online (this is more likely to be the case for raw datasets).
- "publishingPrinciples" -> it is not entirely clear to me what this section describes.
- "about" -> ditto.

Clearly there is a lot of useful metadata here, but we need to decide together what to do for MBO. Perhaps we do not need to decide this right now, before sending out the instructions to the MBO data providers to start describing their datasets: see issue #15, where we say that we can perhaps gather free-text info for now and tackle the standardisation later.

pieterprovoost commented 11 months ago

Values are not to be included here.
publishingPrinciples links to the EOV spec sheet.
about links to sampling events. See also here where the dataset links to the cruises: https://github.com/marco-bolo/dataset-catalogue/blob/main/datasets/biogoship/biogoship.json#L60.
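To make the three points above concrete, here is a minimal Python sketch of a variable description that carries no values, points "publishingPrinciples" at a spec sheet and lists sampling events under "about". The function name `describe_variable`, the flat layout and all `example.org` URLs are assumptions for illustration only; where each key actually sits (on the variable entry or on the dataset) should be checked against the OIH example and biogoship.json.

```python
import json

def describe_variable(name, property_id, technique=None,
                      spec_sheet_url=None, event_urls=()):
    """Build one variable description (hypothetical helper).

    Deliberately has no "value" or "unitCode": values are not included.
    """
    entry = {"name": name, "propertyID": property_id}
    if technique:
        # A link to an SOP, or free text if the SOP is not online (assumption).
        entry["measurementTechnique"] = technique
    if spec_sheet_url:
        # Link to the EOV spec sheet.
        entry["publishingPrinciples"] = spec_sheet_url
    if event_urls:
        # Links to sampling events such as cruises.
        entry["about"] = [{"@id": url} for url in event_urls]
    return entry

# Placeholder values only; none of these URLs are real.
example = describe_variable(
    name="Example variable",
    property_id="https://example.org/vocab/example-variable",
    technique="SOP not available online; method described in free text",
    spec_sheet_url="https://example.org/eov-spec-sheet",
    event_urls=["https://example.org/events/cruise-1"],
)
print(json.dumps(example, indent=2))
```

Optional keys are simply left out when nothing is supplied, so a provider who only knows the vocabulary term can still produce a valid minimal entry.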

kmexter commented 11 months ago

EOV spec sheet: how does that differ from an SOP, which you would put in "measurementTechnique"?

Linking to sampling events would be good, but there have been dozens and dozens of these for the VLIZ data being described in that Google Sheet, so I am not sure the VLIZ people would be happy to add them, mainly because of the effort involved. Could we make this optional?
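As a rough sketch of what "optional" could mean in practice, the check below only treats a small set of keys as required and merely reports the rest as missing. Which keys are required is an assumption made here for illustration (nothing has been decided), and `check_entry` is a hypothetical helper, not part of any existing tooling.

```python
# Assumed split between required and optional keys; not an agreed MBO rule.
REQUIRED = ("name", "propertyID")
OPTIONAL = ("measurementTechnique", "publishingPrinciples", "about")

def check_entry(entry):
    """Report which required and optional keys are absent or empty."""
    return {
        "missing_required": [k for k in REQUIRED if not entry.get(k)],
        "missing_optional": [k for k in OPTIONAL if not entry.get(k)],
    }

# An entry without sampling-event links still passes the required check.
report = check_entry({
    "name": "Example variable",
    "propertyID": "https://example.org/vocab/example-variable",
})
print(report)
```

An entry would only be rejected when `missing_required` is non-empty; the optional list could be used to nudge providers later without blocking them now.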

pieterprovoost commented 11 months ago

Regarding links to samples, this overlaps with the provenance part, right?

kmexter commented 11 months ago

Yes, it does. Perhaps this is a part of the dataset description that we should tackle in Feb, during and after the GA?

kmexter commented 3 months ago

We have started working on this - adding provenance to our OIH metadata template - but most of the work will come in late 2024 and early 2025. Only when we have more info from the WPs on the parameters they are collecting can we usefully look for vocab terms for them (if not already provided). I have big doubts about how much prov info will be provided for the source datasets being used by the WPs, but we ought to give them templates in which they can keep the metadata they need to record what they are doing with these source data to calculate indicators.

marco-bolo / dataset-catalogue

Full descriptions of variables #16