nfdi4plants / arc-to-rocrate


How to represent ARC datamap in RO-Crate #10

Open HLWeil opened 5 months ago

HLWeil commented 5 months ago

Data entities of the datamap file should be of type schema.org/PropertyValue. For an explanation, see https://github.com/ESIPFed/science-on-schema.org/blob/master/guides/Dataset.md#variables.

For the general description of a data entity (which can also be a fragment of a file), we use the MediaObject type. Since the datamap adds further description to exactly this kind of data entity, we suggest representing the entities of the datamap as both PropertyValue and MediaObject.

This way, fragment selectors (data localization) would be standardized between references from the process and references from the datamap. The entities could also be interpreted according to Science-On-Schema.Org.
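
To make the dual typing concrete, here is a minimal Python sketch of what such an entity could look like as a JSON-LD node. The file path, the `col=2` selector (in the style of RFC 7111 CSV fragment identifiers), the helper name `make_data_entity`, and the choice of `name` and `unitText` are all assumptions for illustration; how datamap fields should actually map onto PropertyValue is still an open question in this issue.

```python
import json


def make_data_entity(file_path, selector, name, unit_text=None):
    """Build a JSON-LD node typed as both MediaObject and PropertyValue.

    Hypothetical helper: the property selection is illustrative only and
    does not reflect an agreed datamap profile.
    """
    entity = {
        # The fragment selector is appended to the file path, so the process
        # and the datamap can point at the same identifier.
        "@id": f"{file_path}#{selector}",
        "@type": ["MediaObject", "PropertyValue"],
        "name": name,
    }
    if unit_text is not None:
        entity["unitText"] = unit_text  # schema.org PropertyValue property
    return entity


# Hypothetical file and column, not taken from a real ARC.
entity = make_data_entity(
    "assays/measurement/dataset/table.csv",
    "col=2",
    "temperature",
    unit_text="degree Celsius",
)
print(json.dumps(entity, indent=2))
```

Because the `@id` carries the selector, a process output and a datamap row that describe the same column would resolve to one and the same node.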

For visual reference, consider: image

Now two questions arise:

  1. How to depict datamap fields in PropertyValue?

  2. How to reference PropertyValue from datasets (Assay/Study)?

    • Science-On-Schema.Org recommends using the variableMeasured property. This would require the data entities to be linked directly from the dataset, without making use of the hasPart property.
    • RO-Crate recommends using hasPart to reference data files. In this case the dataset would reference the files, and the files themselves could reference the data entities (fragments), again using hasPart (-> CreativeWork). This would make linking down from folders to files to fragments very streamlined. However, it might cause problems in cases such as BioImaging, where we might have fragments of folders rather than files.
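
The two linking options can be sketched side by side as small JSON-LD graphs in Python. All identifiers below are hypothetical, and `linked_ids` is an illustrative helper, not part of any RO-Crate library; the sketch only shows how a consumer would reach the fragment in each case.

```python
DATASET_ID = "assays/measurement/"                    # hypothetical assay folder
FILE_ID = "assays/measurement/dataset/table.csv"      # hypothetical data file
FRAGMENT_ID = FILE_ID + "#col=2"                      # hypothetical fragment

fragment = {
    "@id": FRAGMENT_ID,
    "@type": ["MediaObject", "PropertyValue"],
    "name": "temperature",
}

# Option 1: the dataset links straight to the fragment via variableMeasured.
option_variable_measured = [
    {"@id": DATASET_ID, "@type": "Dataset",
     "variableMeasured": [{"@id": FRAGMENT_ID}]},
    fragment,
]

# Option 2: dataset -> file -> fragment, each step via hasPart.
option_has_part = [
    {"@id": DATASET_ID, "@type": "Dataset", "hasPart": [{"@id": FILE_ID}]},
    {"@id": FILE_ID, "@type": "File", "hasPart": [{"@id": FRAGMENT_ID}]},
    fragment,
]


def linked_ids(graph, start_id, prop):
    """Collect the ids reachable from start_id by repeatedly following prop."""
    by_id = {node["@id"]: node for node in graph}
    seen, stack = [], [start_id]
    while stack:
        node = by_id.get(stack.pop(), {})
        for ref in node.get(prop, []):
            if ref["@id"] not in seen:
                seen.append(ref["@id"])
                stack.append(ref["@id"])
    return seen


print(linked_ids(option_variable_measured, DATASET_ID, "variableMeasured"))
print(linked_ids(option_has_part, DATASET_ID, "hasPart"))
```

In the first graph the fragment is one hop from the dataset; in the second it is reached through the file, which is what makes the folder-to-file-to-fragment chain uniform but leaves no obvious parent for a fragment of a folder.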

floWetzels commented 5 months ago

The current plan is to use MediaObject for data fragments and annotate them through the variableMeasured property in the assay/study Dataset object. Specifically, we plan the following:

kMutagene commented 1 month ago

@floWetzels @HLWeil

Are all the questions from this issue solved? There is a datamap profile. Is that profile ready to be implemented or are there still things that need consideration?

If that profile is ready to be implemented, that would be great as that would unblock https://github.com/nfdi4plants/ARCtrl/issues/432