nfdi4plants / isa-ro-crate-profile

Integration of Data Fragment Selectors #21

Open floWetzels opened 5 months ago

floWetzels commented 5 months ago

Data Fragment Selectors for data outputs of processes can be represented in the RO-Crate by adding the data format and the data selector format to the JSON object that represents the output.

The output is a schema.org/MediaObject whose @id is the URL of a file or directory, with the data selector included as the URL fragment. Information on how to interpret the data selector (given in the DataSelectorFormat column as a URL) is added through the usageInfo property. Information on the data format (given in the DataFormat column as a URL) is added through the encodingFormat property.
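The mapping described above can be sketched in a few lines of Python. This is only an illustration: the helper name `media_object_for_output` and its parameters are hypothetical and not part of the profile or of any existing library. The property names (`encodingFormat`, `usageInfo`) are the ones proposed in this issue.

```python
def media_object_for_output(output: str, data_format: str, selector_format: str) -> dict:
    """Build the MediaObject entity for one data output (hypothetical helper).

    output          -- value of the Output [Data] column, including the fragment selector
    data_format     -- value of the Data Format column
    selector_format -- value of the Data Selector Format column
    """
    return {
        "@id": output,                  # file or directory URL plus the data selector
        "@type": "MediaObject",
        "encodingFormat": data_format,  # how the data itself is encoded
        "usageInfo": selector_format,   # how to interpret the selector in @id
    }


entity = media_object_for_output(
    "result.csv#col=1",
    "text/csv",
    "https://datatracker.ietf.org/doc/html/rfc7111",
)
```

Keeping the selector inside the @id means two selections from the same file become two distinct entities, each with its own format information.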

Example

For example, consider the following process table:

| Input [Sample Name] | Output [Data] | Data Format | Data Selector Format |
|---|---|---|---|
| input1 | result.csv#col=1 | text/csv | https://datatracker.ietf.org/doc/html/rfc7111 |
| input2 | result.csv#col=2 | text/csv | https://datatracker.ietf.org/doc/html/rfc7111 |

The corresponding JSON-LD objects should look like this:

{
  "@id": "#some_process_id",
  "@type": "LabProcess",
  "object": ["#Sample_input1","#Sample_input2"],
  "result": ["result.csv#col=1","result.csv#col=2"]
},
{
  "@id": "result.csv#col=1",
  "@type": "MediaObject",
  "encodingFormat": "text/csv",
  "usageInfo": "https://datatracker.ietf.org/doc/html/rfc7111"
},
{
  "@id": "result.csv#col=2",
  "@type": "MediaObject",
  "encodingFormat": "text/csv",
  "usageInfo": "https://datatracker.ietf.org/doc/html/rfc7111"
},
{
  "@id": "#Sample_input1",
  "@type": "Sample"
},
{
  "@id": "#Sample_input2",
  "@type": "Sample"
}
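
As a rough sketch of how the example could be generated and checked, the Python snippet below builds the same entities from the table rows and verifies that every reference in `object` and `result` resolves to an entity in the graph. It assumes that each MediaObject's @id is exactly the string listed in the process's `result`. The function names `build_graph` and `unresolved_references` are hypothetical, and the `#Sample_` prefix simply follows the example above.

```python
from urllib.parse import urldefrag

RFC7111 = "https://datatracker.ietf.org/doc/html/rfc7111"

# Rows of the process table: (sample name, output, data format, data selector format)
rows = [
    ("input1", "result.csv#col=1", "text/csv", RFC7111),
    ("input2", "result.csv#col=2", "text/csv", RFC7111),
]


def build_graph(process_id: str, rows: list) -> list:
    """Build the process, output, and sample entities for one table (hypothetical helper)."""
    process = {"@id": process_id, "@type": "LabProcess", "object": [], "result": []}
    graph = [process]
    for sample, output, data_format, selector_format in rows:
        sample_id = f"#Sample_{sample}"
        process["object"].append(sample_id)
        process["result"].append(output)
        graph.append({
            "@id": output,
            "@type": "MediaObject",
            "encodingFormat": data_format,
            "usageInfo": selector_format,
        })
        graph.append({"@id": sample_id, "@type": "Sample"})
    return graph


def unresolved_references(graph: list) -> list:
    """Return referenced ids that have no entity; assumes object/result hold lists of id strings."""
    known = {entity["@id"] for entity in graph}
    missing = []
    for entity in graph:
        for key in ("object", "result"):
            for ref in entity.get(key, []):
                if ref not in known:
                    missing.append(ref)
    return missing


graph = build_graph("#some_process_id", rows)
dangling = unresolved_references(graph)

# Split an output id into the file path and the selector fragment.
file_path, selector = urldefrag("result.csv#col=1")
```

A check like `unresolved_references` catches mismatches such as an entity @id with an extra leading `#` that no longer matches the id used in `result`.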

HLWeil commented 1 month ago

@floWetzels