Open isanti opened 2 years ago
In addition, for the biological material and its digital "derivatives", we will need provenance information that follows the EMBRC "provenance model" we are building in WP6. This will cover the metadata needed for each spreadsheet from a single station/sampling event, for the digital files (e.g. the sequences, the ARMS images), and for the biobanked material (especially if the stations don't do this properly!). Since Laurian and I will not have time to put this model together until Oct/Nov, this part of the provenance will have to wait until then. What we can perhaps do before then is decide how we will store these metadata. Ideally not as CSV files (data.csv and metadata.csv), because that is just too clunky for the amount of digital data that will need to be managed. We will need to create a template that can (ideally) be filled in automatically, which L and K can do as part of our EMBRC prov model work.
This can be made into an action that can be applied to a GitHub repo; it doesn't matter whether the repo is an RO-Crate or not. @marc-portier thoughts?
prov-o link: https://www.w3.org/TR/prov-o/
Still on my to-do list for the end of Nov.
As far as I can remember from the common provenance model (CPM) of EOSC Life WP6, the recommendation is that the following needs to be provided for each file as a digital object:
This provenance information can be packaged in a prov RO-Crate that we can create, but also written in PROV-O following the CPM of WP6 (my notes about this can be found on Confluence: https://confluence.vliz.be/display/VMDCOS/2022-07-08+Vienna+ISO+pt+3+meeting and https://confluence.vliz.be/display/VMDCOS/Reading+on+provenance+in+marine+biology, plus 2 papers that I am not allowed to share digitally but which I have printed out and on my desk :-})
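To make the "written in PROV-O" idea a bit more concrete, here is a rough sketch of what an automatically filled per-file record could look like as JSON-LD. It is not the CPM or the EMBRC model: the function name `build_prov_record`, the `ex:` namespace, the `ex:sha256` property and the choice of fields are all placeholders I made up for illustration. Only the `prov:` terms come from PROV-O itself.

```python
import hashlib
import json
from datetime import datetime, timezone

PROV_NS = "http://www.w3.org/ns/prov#"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"
EX_NS = "https://example.org/"  # placeholder namespace, an assumption


def build_prov_record(file_name, content, activity_id, agent_id, generated_at):
    """Return a JSON-LD dict describing one file as a prov:Entity.

    The selection of fields is illustrative only; it does not
    reproduce the CPM or EMBRC recommendations.
    """
    checksum = hashlib.sha256(content).hexdigest()
    return {
        "@context": {"prov": PROV_NS, "xsd": XSD_NS, "ex": EX_NS},
        "@graph": [
            {
                "@id": f"ex:file/{file_name}",
                "@type": "prov:Entity",
                "prov:wasGeneratedBy": {"@id": activity_id},
                "prov:wasAttributedTo": {"@id": agent_id},
                "prov:generatedAtTime": {
                    "@type": "xsd:dateTime",
                    "@value": generated_at.isoformat(),
                },
                "ex:sha256": checksum,  # hypothetical property
            },
            {
                "@id": activity_id,
                "@type": "prov:Activity",
                "prov:wasAssociatedWith": {"@id": agent_id},
            },
            {"@id": agent_id, "@type": "prov:Agent"},
        ],
    }


if __name__ == "__main__":
    record = build_prov_record(
        file_name="data.csv",
        content=b"station,date\n",  # stand-in for the real file bytes
        activity_id="ex:activity/sampling-event-1",  # made-up identifier
        agent_id="ex:agent/station-1",  # made-up identifier
        generated_at=datetime.now(timezone.utc),
    )
    print(json.dumps(record, indent=2))
```

Something like this could run per file from a script or an action and write the result next to the data, so nobody has to maintain a separate metadata.csv by hand. Which fields are actually required would have to come from the model once it exists.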