Open ljgarcia opened 1 year ago
I would go for "about": its range is Thing, which makes it possible to use Bioschemas types such as Taxon as well as DefinedTerms coming, for instance, from EDAM.
RO-Crate also uses "about" (in use) and "keywords" like that: https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#subjects--keywords
@ljgarcia Thanks for opening the discussion.
In our case the repository provides heterogeneous datasets focused on plant research data but without a specific data domain, because the aim was to provide a generic platform for sharing datasets that are too large for, or outside the scope of, existing databases. We have genomic data, phenotypic images, metabolomics datasets, microscopy pictures, software and so on. That is why the general specification is "dataset", but of course all of them are related to plants and can therefore be described with a "taxon".
I think I would prefer the solution of adding the taxon content in the "about" section, because it looks clearer and the "keywords" section is already used for the general dataset description.
Here is an example:
<script type="application/ld+json">{
  "@context":"http://schema.org/",
  "@type":"Dataset",
  "http://purl.org/dc/terms/conformsTo":"https://bioschemas.org/profiles/Dataset/1.0-RELEASE",
  "@id":"https://doi.ipk-gatersleben.de/DOI/b2f47dfb-47ff-4114-89ae-bad8dcc515a1/7eb2707b-d447-425c-be7a-fe3f1fae67cb/2",
  "keywords":"barley, Hordeum vulgare, genome sequence assembly, long read sequencing, gene annotation, transposable elements",
  "about": {
    "@type":"Taxon",
    "@id":"http://purl.bioontology.org/ontology/NCBITAXON/4513",
    "http://purl.org/dc/terms/conformsTo":"https://bioschemas.org/profiles/Taxon/0.6-RELEASE",
    "url":"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=info&id=4513",
    "taxonRank":"species",
    "parentTaxon":"Hordeum",
    "http://rs.tdwg.org/dwc/terms/vernacularName":"barley"
  },
  #The rest of the properties describing this dataset
}
</script>
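For anyone generating this markup programmatically, here is a minimal Python sketch of the same structure. The helper names `taxon_entity` and `dataset_jsonld` are made up for illustration (they are not part of any Bioschemas tooling), and only the standard-library `json` module is used. The property names and IRIs are copied from the example above.

```python
import json

def taxon_entity(ncbi_id, rank, parent, vernacular):
    """Hypothetical helper: build a Taxon node shaped like the example above."""
    return {
        "@type": "Taxon",
        "@id": f"http://purl.bioontology.org/ontology/NCBITAXON/{ncbi_id}",
        "http://purl.org/dc/terms/conformsTo": "https://bioschemas.org/profiles/Taxon/0.6-RELEASE",
        "taxonRank": rank,
        "parentTaxon": parent,
        "http://rs.tdwg.org/dwc/terms/vernacularName": vernacular,
    }

def dataset_jsonld(dataset_id, keywords, about):
    """Hypothetical helper: build a Dataset node whose subject goes in "about"."""
    return {
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "http://purl.org/dc/terms/conformsTo": "https://bioschemas.org/profiles/Dataset/1.0-RELEASE",
        "@id": dataset_id,
        # "keywords" keeps the free-text description; the taxon goes in "about".
        "keywords": ", ".join(keywords),
        "about": about,
    }

barley = taxon_entity(4513, "species", "Hordeum", "barley")
doc = dataset_jsonld(
    "https://doi.ipk-gatersleben.de/DOI/b2f47dfb-47ff-4114-89ae-bad8dcc515a1/7eb2707b-d447-425c-be7a-fe3f1fae67cb/2",
    ["barley", "Hordeum vulgare", "genome sequence assembly"],
    barley,
)

# Wrap the serialized JSON-LD in the script tag used for embedding in a page.
markup = '<script type="application/ld+json">' + json.dumps(doc, indent=2) + "</script>"
print(markup)
```

Building the node as a plain dictionary and serializing it with `json.dumps` guarantees syntactically valid JSON, which is easy to get wrong when the snippet is written by hand.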
Hi, I also totally agree with the "about" option; the example given by @arendd shows convincingly that it is appropriate.
@gtsueng this discussion is also useful for the "topic" and "organism" elements needed in the synthetic datasets. We could use "about" to describe the topic/subject of the Dataset (including the organism), see also https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/terms/subject.
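As a rough illustration of how topic and organism could share "about", the Python sketch below puts a Taxon and a DefinedTerm in the same list and then separates them again by "@type". This is only one possible reading of the proposal: the `example.org` IRIs are placeholders, not real EDAM identifiers, and the helper names `defined_term` and `by_type` are invented for this sketch.

```python
import json

def defined_term(term_id, name, term_set):
    """Hypothetical helper: a schema.org DefinedTerm node for a topic."""
    return {
        "@type": "DefinedTerm",
        "@id": term_id,
        "name": name,
        "inDefinedTermSet": term_set,
    }

# "about" has range Thing, so an organism and a topic can sit side by side.
about = [
    {
        "@type": "Taxon",
        "@id": "http://purl.bioontology.org/ontology/NCBITAXON/4513",
        "name": "Hordeum vulgare",
    },
    # Placeholder IRIs: substitute real ontology terms (e.g. from EDAM).
    defined_term(
        "https://example.org/terms/genome-assembly",
        "genome sequence assembly",
        "https://example.org/terms",
    ),
]

def by_type(entities, wanted):
    """Return the entities in "about" whose @type matches."""
    return [e for e in entities if e.get("@type") == wanted]

organisms = [e["@id"] for e in by_type(about, "Taxon")]
topics = [e["name"] for e in by_type(about, "DefinedTerm")]

print(json.dumps({"about": about}, indent=2))
```

A consumer can recover the "organism" and "topic" views from a single "about" property this way, without needing two separate properties on the Dataset.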
Some options that were mentioned on emails and community calls:
Which way would the community want to go? Please add your thoughts, pros and cons to help us find a community-based approach.