Open cmungall opened 2 years ago
The Bioschemas Dataset profile is defined over the existing schema.org Dataset type which itself is drawn from DCAT (version 2).
There are of course multiple ways you could model this, and the choice would be up to the deployer of the markup: either as one Dataset with multiple parts which are themselves Datasets, or as a collection of Datasets. In both cases, there would also be a DataCatalog, which is the web site that makes the Datasets available.
:dc a DataCatalog ;
dataset :x1, :x2, ...
or
:dc a DataCatalog ;
dataset :x .
:x hasPart :x1, :x2, ...
I think that both of these are compatible with the proposed profile and it comes down to the markup developer's personal choice.
It seems that modeling this as one Dataset with multiple distributions would be discouraged though? Even if the cardinality of distribution is >1 (#575), it seems the intent is for distribution to model an alternate serialization of the same data, rather than different parts of the dataset?
In my examples, I didn't get to the distributions. These would be added to the Dataset using the distribution property, which should allow many values since the same data could be available in different RDF serialisations, as CSV, or in a multitude of other formats.
To keep things semi-concrete, the markup would become
:dc a DataCatalog ;
dataset :x1, :x2, ...
:x1 a Dataset ;
distribution :x1csv, ...
or
:dc a DataCatalog ;
dataset :x .
:x hasPart :x1, :x2, ...
:x1 a Dataset ;
distribution :x1csv, ...
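For anyone who wants to experiment with the two layouts, here is a minimal Python sketch that builds both as schema.org JSON-LD. The `example.org` identifiers, the `make_dataset` helper and the choice of media types are assumptions made for illustration; only the `DataCatalog`, `Dataset`, `DataDownload`, `dataset`, `hasPart`, `distribution` and `encodingFormat` terms come from schema.org.

```python
import json

SCHEMA_CONTEXT = "https://schema.org/"


def make_dataset(identifier, media_types):
    """Build a Dataset node with one DataDownload per serialisation.

    `media_types` maps a hypothetical fragment name (e.g. "csv") to a
    media type; each entry is an alternate serialisation of the same data.
    """
    return {
        "@type": "Dataset",
        "@id": identifier,
        "distribution": [
            {
                "@type": "DataDownload",
                "@id": f"{identifier}#{fragment}",
                "encodingFormat": media_type,
            }
            for fragment, media_type in media_types.items()
        ],
    }


# Two illustrative Datasets (:x1 and :x2 in the sketches above).
parts = [
    make_dataset("https://example.org/x1", {"csv": "text/csv", "ttl": "text/turtle"}),
    make_dataset("https://example.org/x2", {"csv": "text/csv"}),
]

# Option 1: the catalog lists every Dataset directly.
flat_catalog = {
    "@context": SCHEMA_CONTEXT,
    "@type": "DataCatalog",
    "@id": "https://example.org/dc",
    "dataset": parts,
}

# Option 2: the catalog lists one container Dataset whose parts are Datasets.
nested_catalog = {
    "@context": SCHEMA_CONTEXT,
    "@type": "DataCatalog",
    "@id": "https://example.org/dc",
    "dataset": [
        {
            "@type": "Dataset",
            "@id": "https://example.org/x",
            "hasPart": parts,
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(nested_catalog, indent=2))
```

In both options the distributions stay attached to the individual Datasets, so switching between the flat and nested layouts only changes how the catalog reaches them.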
Let's say I have a directory full of files, perhaps the results of different genome annotation analyses all on the same sample.
With Frictionless, this might be represented as one DataPackage with multiple DataResources.
DCAT has a series of examples of loosely structured datasets, e.g. example 57, which is analogous:
https://www.w3.org/TR/vocab-dcat-3/#ex-elaborated-bag
Here there is one "container" Dataset and multiple individual Datasets, each with its own serialization.
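To make the directory-of-files case concrete, here is a rough Python sketch of how it could be expressed with schema.org terms: one container Dataset with one part per file. This is only one possible reading, not something the profile prescribes. The `container_dataset` function, the `MEDIA_TYPES` table, the file names and the `example.org` URL are all hypothetical, and treating each file as its own Dataset is an assumption.

```python
from pathlib import PurePosixPath

# Hypothetical suffix-to-media-type table; extend it for your own formats.
MEDIA_TYPES = {
    ".csv": "text/csv",
    ".tsv": "text/tab-separated-values",
    ".json": "application/json",
}


def container_dataset(base_url, name, filenames):
    """Describe a directory of files as one Dataset with a part per file.

    Assumption: every file is a separate Dataset with a single
    distribution, rather than an alternate serialisation of another file.
    """
    parts = []
    for filename in sorted(filenames):
        path = PurePosixPath(filename)
        download = {
            "@type": "DataDownload",
            "contentUrl": f"{base_url}/{path.name}",
        }
        media_type = MEDIA_TYPES.get(path.suffix.lower())
        if media_type is not None:
            # Only state a format when the suffix is in the table above.
            download["encodingFormat"] = media_type
        parts.append(
            {
                "@type": "Dataset",
                "@id": f"{base_url}/{path.stem}",
                "name": path.stem,
                "distribution": [download],
            }
        )
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "@id": base_url,
        "name": name,
        "hasPart": parts,
    }


example = container_dataset(
    "https://example.org/sample1",
    "Annotation results for sample 1",
    ["genes.tsv", "domains.csv", "summary.json"],
)
```

Files that share a stem but differ in suffix would collide on `@id` here; under the reading that distribution models alternate serialisations, those would more naturally become several distributions of one part.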
Is Bioschemas intended to be isomorphic to DCAT3? Should we use the same structure and link to the same documentation?
hasPart is in the profile but it has a very generic description:
Schema: Indicates an item or CreativeWork that is part of this item, or CreativeWork (in some sense). Inverse property: isPartOf
Or perhaps the container should be a catalog?