Creator Name: John Davidson
Creator Affiliation: Contractor, Department of the Interior/OCIO/CDO
Requirement(s)
DOI EDI requirement: Need for a taxonomic reference that is dereferenceable back to a classification system
Metadata for dataset resources must identify or cite the vocabularies used for “theme”, “genre”, and “category” fields (or, more generally, any property of type dcterms:subject with range skos:Concept, owl:Class, or similar) that reference a term in a controlled vocabulary.
The set of such "subjects" used to categorize the dataset is organized in one or more controlled vocabularies (e.g., skos:ConceptScheme, skos:Collection, owl:Ontology, or similar) describing all the subjects (themes, genres, categories) and their relations. [DCAT-3 VOCAB]
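As a rough illustration of this requirement, the sketch below models a dataset record as a plain Python dictionary in a JSON-LD-like shape and lists the vocabularies it cites. The dataset, concept, and vocabulary IRIs are hypothetical placeholders, and pairing dcat:theme with skos:inScheme is only one plausible way to express the citation, not an encoding prescribed by the profile.

```python
# Hypothetical dataset record; every IRI below is a placeholder, not a real identifier.
dataset = {
    "@id": "https://example.org/dataset/bird-survey",
    "dcterms:title": "Example bird survey",
    "dcat:theme": [
        {
            "@id": "https://example.org/vocab/taxa/aves",
            "@type": "skos:Concept",
            "skos:prefLabel": "Aves",
            # The concept points back to the controlled vocabulary that defines it.
            "skos:inScheme": "https://example.org/vocab/taxa",
        }
    ],
}


def cited_vocabularies(record):
    """Return the sorted list of concept schemes cited by the record's themes."""
    schemes = set()
    for concept in record.get("dcat:theme", []):
        scheme = concept.get("skos:inScheme")
        if scheme:
            schemes.add(scheme)
    return sorted(schemes)
```

Calling `cited_vocabularies(dataset)` on the record above yields the single placeholder vocabulary IRI, which is the kind of dereferenceable reference the requirement asks for.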
Problem Statement
Keywords used to "tag" or otherwise label a dataset are not formally defined as semantically "grounded" concepts and are therefore ambiguous and not machine-understandable.
DCAT requires the use of semantically grounded concepts (values of properties with range skos:Concept, owl:Class, or similar) that are formally defined by a "controlled vocabulary" resource that is itself formally defined, online, open, and machine-understandable.
Controlled vocabularies must be explicitly referenced, identified, or cited as the source of the grounded concepts used to classify dataset resources (e.g., using properties such as theme, genre, and category).
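The difference between a free-text keyword and a grounded concept can be sketched as a simple check. The rule used here (a subject is "grounded" if it is a node with an HTTP(S) IRI and a skos:inScheme value) is an assumption made for illustration, and the record and its IRIs are hypothetical.

```python
def is_grounded(subject):
    """Assumed rule: a grounded subject is a node with an HTTP(S) IRI and a cited scheme."""
    return (
        isinstance(subject, dict)
        and str(subject.get("@id", "")).startswith(("http://", "https://"))
        and bool(subject.get("skos:inScheme"))
    )


def ungrounded_subjects(record, properties=("dcat:theme", "dcterms:subject")):
    """Return (property, value) pairs whose value is not a grounded concept."""
    problems = []
    for prop in properties:
        for subject in record.get(prop, []):
            if not is_grounded(subject):
                problems.append((prop, subject))
    return problems


# Hypothetical record mixing one grounded concept with one free-text tag.
record = {
    "@id": "https://example.org/dataset/bird-survey",
    "dcat:theme": [
        {
            "@id": "https://example.org/vocab/taxa/aves",
            "skos:inScheme": "https://example.org/vocab/taxa",
        },
        "birds",  # plain keyword: ambiguous and tied to no vocabulary
    ],
}
```

Here `ungrounded_subjects(record)` flags only the plain string "birds", since nothing ties it to a defined concept or a cited vocabulary.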
Target Audience / Stakeholders
User 1: Data consumer/analyst
User 2: Data steward/curator
Use Cases
Use Case 1: As a data consumer/analyst, I want to find all datasets that are categorized by terms from a specific and trusted controlled vocabulary so that I can be unambiguously assured of the specific purpose and meaning of the datasets.
Use Case 2: As a data analyst, I want to know which domain-specific controlled vocabularies were used to categorize the dataset resource, and to know that they are authoritative or generally accepted by a community of interest and are themselves FAIR (findable, accessible, interoperable, reusable).
Use Case 3: As a data analyst, I want to know that a dataset references controlled vocabularies that are authoritative or generally accepted by a community of interest, and are themselves FAIR (findable, accessible, interoperable, reusable), as part of my assessment of the dataset's fitness for use (as a trusted resource).
Use Case 4: As a data steward, I want to ensure that all the dataset resources for which I am responsible have been correctly and consistently categorized using specific controlled vocabularies, in accordance with law, policy, government regulation, and/or local communities of practice.
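Use Cases 1 and 4 could be supported by lookups along the following lines. This is a minimal sketch over an in-memory list of records: the catalog contents, the vocabulary IRIs, and the function names are all hypothetical, and a real catalog would more likely be queried through its own search or SPARQL interface.

```python
# Hypothetical catalog: every IRI is a placeholder chosen for illustration.
TAXA = "https://example.org/vocab/taxa"
TOPICS = "https://example.org/vocab/topics"

catalog = [
    {
        "@id": "https://example.org/dataset/bird-survey",
        "dcat:theme": [{"@id": TAXA + "/aves", "skos:inScheme": TAXA}],
    },
    {
        "@id": "https://example.org/dataset/stream-gauges",
        "dcat:theme": [{"@id": TOPICS + "/hydrology", "skos:inScheme": TOPICS}],
    },
]


def schemes_of(record):
    """Collect the concept schemes cited by a record's theme concepts."""
    return {
        c["skos:inScheme"]
        for c in record.get("dcat:theme", [])
        if c.get("skos:inScheme")
    }


def datasets_using(records, scheme_iri):
    """Use Case 1: IRIs of datasets categorized with terms from one vocabulary."""
    return [r["@id"] for r in records if scheme_iri in schemes_of(r)]


def outside_approved(records, approved):
    """Use Case 4: map each dataset IRI to cited schemes not on the approved list."""
    report = {}
    for r in records:
        extra = schemes_of(r) - set(approved)
        if extra:
            report[r["@id"]] = sorted(extra)
    return report
```

A consumer would call `datasets_using(catalog, TAXA)` to find datasets classified against the trusted vocabulary, while a steward could run `outside_approved(catalog, [TAXA])` to list datasets that cite vocabularies outside the approved set.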
Existing Approaches - Optional
Additional context, comments, or links - Optional
A requirement of the DOI's "Application Profile of DCAT-US 1.1" metadata specification.