Background
In UCO, many types are identified by a string as opposed to a thing, i.e., an IRI-backed node in a graph. The advantage of the former is that a new type-by-string is easy to create when the particular type is missing from the ontology. The disadvantages are:
strings are not first-class citizens in the ontology world, as opposed to IRI-backed things;
consequently, additional code is needed to check whether the applied string equals the predefined type-by-string;
the approach is error-prone, because a typo in the string goes unnoticed.
The advantages of the latter, i.e., type-by-IRI, are the opposite of the disadvantages of type-by-string. At the same time, type-by-IRI has a disadvantage of its own: extending the available types with a new one requires knowledge of RDF(S), OWL, and/or the data model or ontology that defines the other individuals being supplemented.
The purpose of this issue is to lay the foundation that is necessary to gain data/experience with users adding a type-by-IRI in UCO.
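The trade-off described above can be illustrated with a minimal sketch. The namespace http://example.org/mime/ and the two known types are placeholders chosen for illustration only; they are not UCO identifiers, and the function names are hypothetical.

```python
# Hypothetical namespace and vocabulary; neither is defined by UCO.
MIME_NS = "http://example.org/mime/"
KNOWN_TYPE_IRIS = {
    MIME_NS + "application/zip",
    MIME_NS + "image/gif",
}


def accept_type_by_string(value: str) -> str:
    """Type-by-string: every string is accepted, typos included."""
    return value


def resolve_type_by_iri(value: str) -> str:
    """Type-by-IRI: the value must resolve to a node in the vocabulary."""
    iri = MIME_NS + value
    if iri not in KNOWN_TYPE_IRIS:
        raise KeyError(f"{iri} is not in the vocabulary")
    return iri


# A typo passes silently as a string but is rejected as an IRI.
print(accept_type_by_string("aplication/zip"))
try:
    resolve_type_by_iri("aplication/zip")
except KeyError as err:
    print("rejected:", err)
```

The second function also shows the cost of type-by-IRI: a type that is missing from the vocabulary cannot be used until someone adds it.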
Objective / Purpose
The purpose of this issue is to lay the foundation that is necessary to gain data/experience with issues that users might run into when adding a type-by-IRI in UCO.
Requirements
Requirement 1
UCO shall have access to a SKOS-vocabulary that specifies an individual for every media type (MIME type) defined by the IANA Media Types registry, in order to use these individuals to specify the type of a medium registered in UCO.
Requirement 2
The resulting taxonomy shall align with the standard two-tier scheme as defined by the IANA Media Type Registry:
Main tier: Media (Content) Types, such as application, image, etc.
Secondary tier: Media Subtype, such as application/zip, image/gif, etc.
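As a rough sketch of how the two tiers could map onto SKOS, the snippet below splits a media type string into its two tiers and emits Turtle in which the subtype concept points to its top-level concept via skos:broader. The namespace http://example.org/mime/ and both function names are assumptions made for this example; the actual namespace and modelling choices are not fixed by this proposal.

```python
# Hypothetical namespace for the vocabulary; the real one is not decided here.
MIME_NS = "http://example.org/mime/"


def split_media_type(media_type: str) -> tuple[str, str]:
    """Split 'application/zip' into its top-level type and subtype."""
    top, sep, sub = media_type.partition("/")
    if not sep or not top or not sub:
        raise ValueError(f"not a type/subtype pair: {media_type!r}")
    return top, sub


def to_skos_turtle(media_type: str) -> str:
    """Emit Turtle for one subtype concept and its broader top-level concept."""
    top, _ = split_media_type(media_type)
    return "\n".join([
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{MIME_NS}{top}> a skos:Concept ;",
        f'    skos:prefLabel "{top}" .',
        "",
        f"<{MIME_NS}{media_type}> a skos:Concept ;",
        f'    skos:prefLabel "{media_type}" ;',
        f"    skos:broader <{MIME_NS}{top}> .",
    ])


print(to_skos_turtle("application/zip"))
```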
Requirement 3
The SKOS-vocabulary shall be serialised in Turtle.
Requirement 4
Loosely-coupled: Any modification to the SKOS-vocabulary shall not imply a change to the UCO-ontology.
Requirement 5
Manageability: Any modification to the IANA Media Type Registry shall effect an update to the SKOS-vocabulary, preferably mechanically.
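One way such a mechanical update could start is by comparing a snapshot of the registry with the current vocabulary. The sketch below is only an illustration: the function name, the snapshot contents, and the "add"/"retire" labels are all made up for this example, and how the registry is actually retrieved is left open.

```python
def plan_update(registry: set[str], vocabulary: set[str]) -> dict[str, list[str]]:
    """Compare a registry snapshot with the vocabulary and list the changes."""
    return {
        "add": sorted(registry - vocabulary),
        "retire": sorted(vocabulary - registry),
    }


# Illustrative snapshots only; real input would come from the IANA registry.
registry_snapshot = {"application/zip", "image/gif", "image/png"}
vocabulary_snapshot = {"application/zip", "image/gif", "image/example-old"}

print(plan_update(registry_snapshot, vocabulary_snapshot))
```

Entries that originate from the UCO or CASE community (see Requirement 7) are absent from the registry by definition, so they would have to be filtered out of the vocabulary set before the comparison, or they would be listed for retirement.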
Requirement 6
Continuity & maintainability: Any modification to the SKOS-vocabulary shall result in a new version.
Requirement 7
Provenance: Any Media (Content) Type or Subtype added to the SKOS-vocabulary that originates from the UCO or CASE community, shall be categorised as such. This implies that for any Media (Content) Type / Media Subtype pair that exists, its provenance is maintained.
Risk / Benefit analysis
Benefits
The benefit of the stated objective is that data about, and experience from, users adding a type-by-IRI to the vocabulary become available. It then becomes possible to investigate how to improve user acceptance and minimise the technical knowledge required for adding a new type in this way.
The benefits of having a vocabulary of IANA Media Types available are:
Using these concepts (individuals) as unequivocal types in UCO;
Becoming interoperable with Dublin Core users, particularly those that employ http://purl.org/dc/terms/MediaType in their graph design.
Risks
Except in relation to the semi-openness of the vocabulary, the submitter is unaware of risks associated with this change.
Consequences
The intention of this CR is that the type-by-string design will be replaced by a type-by-IRI design. The foreseen consequences are (not necessarily exhaustively) as follows:
A potential impact on the design of UCO: observable:mimeType would become an owl:ObjectProperty.
A potential breaking change (i.e., not backwards compatible) between the current version and the version implementing this CR.
Competencies demonstrated
Competency 1
CQ: What Media Types are specified in the ontology?
Result: All Media Types that are specified by IANA Media Type Registry.
Competency 2
CQ: What Media Types carry <substring> (string as ordered characters) somewhere in their name, abbreviation or description? No constraints apply to the number of characters used in the substring; the search is agnostic to diacritical characters, i.e., an a in the substring finds ā, ă, ä and similar characters.
Result: Including their meta-data, only Media Types and Subtypes from the IANA Media Type Registry are returned, the name, abbreviation or description of which contain the substring.
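The diacritic-agnostic matching asked for in this competency question could be approximated as sketched below, using Unicode decomposition from the Python standard library. The record layout (name, abbreviation, description keys), the sample records, and the additional case folding are assumptions of this sketch; the competency question itself does not mention case.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks and fold case, so 'ä' compares equal to 'a'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()  # case folding is an added assumption


def search(concepts: list[dict[str, str]], substring: str) -> list[dict[str, str]]:
    """Return concepts whose name, abbreviation or description contains substring."""
    needle = fold(substring)
    fields = ("name", "abbreviation", "description")
    return [c for c in concepts if any(needle in fold(c.get(f, "")) for f in fields)]


# Made-up records for illustration; the descriptions are not registry data.
concepts = [
    {"name": "application/zip", "description": "ZIP archive"},
    {"name": "image/gif", "description": "Graphics Interchange Format"},
    {"name": "text/example", "description": "Beispiel für Ärzte"},
]

print([c["name"] for c in search(concepts, "zip")])
print([c["name"] for c in search(concepts, "arzte")])
```

Folding both the search term and the stored text keeps the comparison symmetric, so a substring typed with diacritics also finds entries stored without them.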
Competency 3
CQ: Which Media (Content) Types carry <substring>, e.g., zip?
Competency 4
As a security service provider, I want to reference application/tar, and I don't care whether it is an IANA media type or not. I've always said application/tar, it's been coded like that in my product for a decade, and my customers know I mean 'tape archive' when I say that.
CQ: Similar to Competencies 1, 2 and 3, however, now also allowing for Media (Content) Types and Subtypes that originate from the UCO and CASE community / tool providers.
Result: Similar to Competencies 1, 2 and 3, however, now limited to Media (Content) Types and Subtypes that originate from the UCO and CASE community / tool providers.
Competency 5
CQ: Which [Media (Content) Types | Media Subtypes] belong to [uco-something:IANAMediaType | uco-something:NonIANAMediaType]?
Result: A list of individuals of type [Media (Content) Types | Media Subtypes] that, according to specification, belong to [uco-something:IANAMediaType | uco-something:NonIANAMediaType]
Competency 6
CQ: Does <specific media (sub)type> belong to uco-something:IANAMediaType or uco-something:NonIANAMediaType?
Result: Either uco-something:IANAMediaType or uco-something:NonIANAMediaType, according to specification.
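Competencies 5 and 6 could be answered along the lines of the sketch below. It decides membership by looking the media type up in a registry snapshot, which is an assumption of this example; in the vocabulary itself the answer would follow from the class each individual is declared to belong to. The prefix uco-something is the placeholder used in the questions above, and the snapshot is illustrative and does not claim to reflect the real registry.

```python
# Class names as written in the competency questions; "uco-something" is the
# placeholder prefix used there, not a real namespace.
IANA_CLASS = "uco-something:IANAMediaType"
NON_IANA_CLASS = "uco-something:NonIANAMediaType"


def classify(media_type: str, iana_registry: set[str]) -> str:
    """Competency 6: name the class one media type or subtype belongs to."""
    return IANA_CLASS if media_type in iana_registry else NON_IANA_CLASS


def members(media_types: list[str], iana_registry: set[str], wanted: str) -> list[str]:
    """Competency 5: list the media types that belong to one class."""
    return [m for m in media_types if classify(m, iana_registry) == wanted]


# Illustrative snapshot; a real one would come from the IANA registry.
iana_registry = {"application/zip", "image/gif"}
all_types = ["application/zip", "application/tar", "image/gif"]

print(classify("application/tar", iana_registry))
print(members(all_types, iana_registry, IANA_CLASS))
```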
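Solution suggestion
The taxonomy converts the IANA Media Types registry into SKOS under a UCO namespace, following a mostly two-tier skos:ConceptScheme:
The top-level concepts are the so-called Media (Content) Types, e.g., application, image, etc.
The second tier of concepts consists of the so-called Media Subtypes in each registry, such as application/zip, image/gif, etc.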
Note that some extension media types not part of IANA are defined for various reasons, and may or may not be submitted in the future for standardization to IANA. These extensions follow the non-registration practice of RFC 6838, Section 3.4, and all include the string /x-uco-.
This repository's primary product is a monolithic ontology and taxonomy file, serialized in Turtle, mime.ttl.
(This repository is undergoing NIST review for release. If you are interested in providing early feedback, please contact @ajnelson-nist.)
UCO could subclass dcterms:MediaType with a new class uco-types:IANAMediaType, and a sibling uco-types:NonIANAMediaType in order to support Requirement 7 and Competency 4.
Coordination