biocaddie / WG3-MetadataSpecifications

WG3 Metadata Specification

"Set" information isn't in the current metadata model #15

Closed RuilingLiu closed 8 years ago

RuilingLiu commented 8 years ago

Below is an example from Dryad repository:

"record": {
  "header": {
    "identifier": {"$": "oai:datadryad.org:10255/dryad.149"},
    "setSpec": {"$": "hdl_10255dryad.148"},
    "repository": {"$": "Dryad Data Repository"},
    "setName": {"_$": "BIRDD"}
  }
}
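
For illustration, the set fields in the record above could be pulled out roughly as follows. This is a minimal sketch, not part of any bioCADDIE tooling: it assumes the fragment is wrapped in outer braces so that it parses as JSON, and the names `text_value` and `extract_set_info` are hypothetical. Because the sample uses both `"$"` and `"_$"` as text keys, the helper simply takes whichever single value is present.

```python
import json

# The Dryad header from the example, wrapped in braces (an assumption)
# so that it is a complete JSON document.
RAW = '''
{
  "record": {
    "header": {
      "identifier": {"$": "oai:datadryad.org:10255/dryad.149"},
      "setSpec": {"$": "hdl_10255dryad.148"},
      "repository": {"$": "Dryad Data Repository"},
      "setName": {"_$": "BIRDD"}
    }
  }
}
'''


def text_value(node):
    """Return the single text value of a node such as {"$": "..."}."""
    return next(iter(node.values()))


def extract_set_info(doc):
    """Collect the dataset identifier and its set membership fields."""
    header = doc["record"]["header"]
    return {
        "dataset_id": text_value(header["identifier"]),
        "set_spec": text_value(header["setSpec"]),
        "set_name": text_value(header["setName"]),
    }


info = extract_set_info(json.loads(RAW))
print(info)
```

The `set_spec` and `set_name` values are the pieces that currently have no place in the metadata model.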

BIRDD (Beagle Investigations Return with Darwinian Data) is a collection of data relating to Galapagos finches. It spans multiple publications from multiple researchers, but all data has been converted into standardized formats for easy comparison.

Link to the dataset: http://datadryad.org/handle/10255/dryad.149
Link to the set: http://datadryad.org/handle/10255/dryad.148

This set information isn't in the current metadata model.

aegururaj commented 8 years ago

This seems similar to the Series concept in GEO.

agbeltran commented 8 years ago

We are going to support sets or collections of Datasets by adding a 'hasPart' relationship to the Dataset entity. A collection is then any Dataset that declares the 'hasPart' relationship. This is an economical way to support collections, since the composite Dataset keeps all the usual descriptive elements (creation date, creators, title, description, etc.). While we are not adding a field for aggregation criteria, this could be addressed in the description field. Similarly, for simplicity we are currently not adding 'curation status' or a similar field (we didn't have use cases for it), but it could be added if needed (e.g. to distinguish curated from non-curated series in GEO). Let us know if this approach would cover the use cases you had in mind.
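
As a rough illustration of the 'hasPart' approach, a collection such as BIRDD could be modelled as an ordinary Dataset whose parts are other Datasets. This is only a sketch under assumptions: the class and field names (`Dataset`, `has_part`, `is_collection`) are hypothetical and are not taken from the specification or the linked commit, and the member dataset's title is a placeholder because the thread does not give one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Hypothetical stand-in for the Dataset entity in the model."""
    identifier: str
    title: str
    description: str = ""
    # 'hasPart' relationship: the Datasets that make up this one.
    has_part: List["Dataset"] = field(default_factory=list)

    def is_collection(self) -> bool:
        # A collection is any Dataset that declares at least one part.
        return bool(self.has_part)


member = Dataset(
    identifier="http://datadryad.org/handle/10255/dryad.149",
    title="(member dataset title, not given in the thread)",
)

birdd = Dataset(
    identifier="http://datadryad.org/handle/10255/dryad.148",
    title="BIRDD",
    # Aggregation criteria go in the description, as suggested above.
    description="Data relating to Galapagos finches, converted into "
                "standardized formats for easy comparison.",
    has_part=[member],
)

print(birdd.is_collection(), member.is_collection())
```

With this shape, the collection carries its own title, description, and other descriptive elements without any extra "set" field on the member datasets.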

agbeltran commented 8 years ago

Closed via https://github.com/biocaddie/WG3-MetadataSpecifications/commit/f56201416b02deb64823626e9d284743d6956a66