Open HLWeil opened 1 year ago
See also https://www.w3.org/TR/annotation-model/#selectors on how fragment selectors are different for different media types. You need to indicate the type of `pointer`, either as a prefix or `pointertype`. The media type of `filename` will then also be essential (equivalent to `encodingFormat` in RO-Crate for IANA Media type) so the client can know how to resolve the pointer.
I agree with Stain! We need a pointerType and encodingFormat
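To make the point in the comments above concrete, here is a minimal Python sketch of why a client needs both a pointer type and a media type before it can resolve a pointer. The registry keys (`"json-pointer"`, `"csv-fragment"`) and the function names are hypothetical and not part of ISA or RO-Crate. The JSON branch follows RFC 6901 (JSON Pointer); the CSV branch is only loosely modelled on the `cell=<row>,<column>` selector of RFC 7111 and assumes 1-based indices.

```python
import csv
import io
import json


def resolve_json_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer such as '/samples/0/value'."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in the order RFC 6901 requires: '~1' first, then '~0'.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def resolve_csv_cell(text, pointer):
    """Resolve a 'cell=<row>,<column>' selector (assumed 1-based)."""
    if not pointer.startswith("cell="):
        raise ValueError(f"unsupported CSV selector: {pointer!r}")
    row, col = (int(part) for part in pointer[len("cell="):].split(","))
    rows = list(csv.reader(io.StringIO(text)))
    return rows[row - 1][col - 1]


# Hypothetical registry: (encodingFormat, pointerType) -> resolver function.
RESOLVERS = {
    ("application/json", "json-pointer"):
        lambda text, pointer: resolve_json_pointer(json.loads(text), pointer),
    ("text/csv", "csv-fragment"): resolve_csv_cell,
}


def resolve(text, encoding_format, pointer_type, pointer):
    """Pick a resolver from the media type and pointer type, then apply it."""
    try:
        resolver = RESOLVERS[(encoding_format, pointer_type)]
    except KeyError:
        raise ValueError(
            f"cannot resolve {pointer_type!r} pointers in {encoding_format!r}"
        ) from None
    return resolver(text, pointer)
```

The same pointer string is meaningless without the other two pieces of information: `resolve(text, "text/csv", "csv-fragment", "cell=2,2")` and a JSON Pointer lookup take entirely different code paths, and an unknown combination fails explicitly instead of guessing.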
Preface
Hey, here are our proposed adjustments to the datamodel and documentation for enabling ISA to thoroughly describe data objects. For reference, here is the discussion about this topic: https://github.com/ISA-tools/isa-api/discussions/484
General Goals
Our goal here is to improve the description of `data` using the ISA model. Currently, the description given in the ISA model points only to the file, not inside the file. This is not sufficient if the file format is not well understood, or when the actual data object resulting from a measurement or computation is not a full file but rather a value or value set in a file.
So we wanted to enhance the data object with two things:
- `Pointer`, pointing to a specific location in the file
- `Dataset` description, which gives context to the data objects stored in a data file

Changes made
Datamodel
We came up with the following data model:

| Property | Datatype | Description |
| -- | -- | -- |
| File name | String | A file name or full path referencing a data file produced by the related process that MAY be packaged with, or is accessible via, the ISA reference implementation content. |
| Pointer | String | A pointer referencing a location inside the data file. This SHOULD always be specified when the data of interest is not the complete file, but a specific part of it. |
| Generated By | String | A file name, full path or identifier referencing the tool with which this data object was generated. |
| Explication | Ontology Annotation | An ontology annotation qualifying what the data describes. |
| Unit | Ontology Annotation | The unit qualifying the value stored in the data object. |
| Object Type | Ontology Annotation | Specifies the format in which the value in the data object will be stored. |

ISA Json
This results in the following JSON schema:
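For illustration only, here is a minimal Python sketch of how a data object carrying the fields from the data model table might be represented and serialized to JSON. The key names (`file_name`, `pointer`, `term`, and so on), the simplified `OntologyAnnotation` stand-in, and the example values are all assumptions; they are not taken from the proposed schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class OntologyAnnotation:
    # Simplified stand-in for an ISA ontology annotation (assumed field names).
    term: str
    term_accession: Optional[str] = None


@dataclass
class DataObject:
    # One attribute per row of the data model table; only the file name is required.
    file_name: str
    pointer: Optional[str] = None
    generated_by: Optional[str] = None
    explication: Optional[OntologyAnnotation] = None
    unit: Optional[OntologyAnnotation] = None
    object_type: Optional[OntologyAnnotation] = None

    def to_json(self) -> str:
        """Serialize to JSON, omitting fields that were not set."""
        def strip_none(value):
            if isinstance(value, dict):
                return {k: strip_none(v) for k, v in value.items() if v is not None}
            return value

        return json.dumps(strip_none(asdict(self)), indent=2)


# Hypothetical example: a single value inside a CSV file, measured in milligrams.
obj = DataObject(
    file_name="measurements.csv",
    pointer="cell=2,2",
    unit=OntologyAnnotation(term="milligram"),
)
print(obj.to_json())
```

Leaving `pointer` unset corresponds to the existing behaviour, where the data object is the complete file.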
ISA Tab
To integrate these model extensions into the ISA Tab Format, we propose two adjustments:
- To enable processes to point into data files, we propose adding a new column, `Data Pointer`, to the `Assay file`. This column should be used to qualify the `Data File` column when the data object resulting from the process is not the full data file, but instead a value or value set in the data file.
- Additionally, to give context about the values in the data file, we propose adding a new file to the ISA Tab family, namely the `Dataset` file, which carries all the other data fields that we added in the `Data Model`.

Aux
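As a rough sketch of how the proposed `Data Pointer` column could sit next to `Data File` in an assay table, the following Python snippet reads a small tab-separated excerpt and pairs each file with its optional pointer. The table contents, the `cell=<row>,<column>` pointer syntax, and the function name `data_references` are hypothetical; only the two column names come from the proposal above.

```python
import csv
import io

# Hypothetical assay table excerpt (tab-separated, as in ISA Tab).
ASSAY_TSV = (
    "Sample Name\tData File\tData Pointer\n"
    "sample1\tresults.csv\tcell=2,2\n"
    "sample2\tresults.csv\tcell=3,2\n"
    "sample3\traw_image.png\t\n"
)


def data_references(assay_text):
    """Yield (sample, file, pointer); pointer is None when the whole file is meant."""
    reader = csv.DictReader(io.StringIO(assay_text), delimiter="\t")
    for row in reader:
        # An empty Data Pointer cell means the data object is the complete file.
        pointer = row.get("Data Pointer") or None
        yield row["Sample Name"], row["Data File"], pointer


refs = list(data_references(ASSAY_TSV))
for sample, file_name, pointer in refs:
    target = f"{file_name}#{pointer}" if pointer else file_name
    print(f"{sample}: {target}")
```

Here two samples share one result file and are told apart only by the pointer, which is the case the current file-level reference cannot express.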
Open Questions