JohannesLipp closed this issue 4 years ago
Did some research.
The Information Model defines ids:MediaType as a subclass of dcterms:MediaType, and ids:IANAMediaType as well as ids:CustomMediaType as subclasses of ids:MediaType. The latter two have instances: IANA media type instances, or custom types which are not part of the IANA registry.
The list of instances is short and incomplete, as Johannes hinted here.
DCAT 2 also recommends using IANA media types. See distribution_media_type and distribution_format.
IANA maintains a list of media types (see here), grouped into 10 categories, with 1500+ media types in total. About 1400 belong to the application category. The list includes many well-known media types (JSON, Turtle, XML, MP4, ...) as well as lesser-known types.
The IANA list is updated regularly. Therefore, we would have to update our media type instances regularly too.
In my opinion, that's not a good idea. Things like languages (ISO 639) do not get many updates, but IANA types surely do. I'd suggest a top-down approach here: instead of specifying all media types ourselves and maintaining a list of 1500+ instances, we should leave the filling in of media types to the services. We can also keep the handful of IANA media type instances we currently have as examples.
Thank you for the investigation, especially on IANA media types. I fully agree that we should not maintain a list of all IANA media types, since we would have to follow its frequent updates.
The following example you provided is a convenient way to directly use IANA media types (following the IANA media types list) in an IDS context:
{
  "@context": "https://w3id.org/idsa/contexts/2.1.0/context.jsonld",
  "@type": "Representation",
  "mediaType": {
    "@type": "ids:IANAMediaType",
    "@id": "ftp://this_is_some_code"
  },
  "@id": "https://connector.fit.fraunhofer.de"
}
@HaydarAk reopening this for a further question. Is it good practice to use IANA media types that are not in the list of ids:IANAMediaType instances, like so?
my_namespace:APPLICATION_EXCEL
    a ids:IANAMediaType ;
    rdfs:label "application/vnd.ms-excel" ;
    rdfs:comment "This Media Type/OID is used to identify Microsoft Excel generically"@en ;
    rdfs:isDefinedBy <https://www.iana.org/assignments/media-types/application/vnd.ms-excel> ;
    ids:filenameExtension "xlsx" ;
    .
From a syntactic point of view, it should work.
One thing I noticed with this is the difference between IANA types and what we write in RDF/TTL or JSON-LD. We can discuss this in a larger group. Using your example above, the problem occurs when we take a look at:
my_namespace:APPLICATION_EXCEL
    a ids:IANAMediaType ;
    rdfs:label "application/vnd.ms-excel" ;
The rdfs:label is equal to what IANA lists for Excel documents. But the actual RDF subject, my_namespace:APPLICATION_EXCEL in the first line, is not equal to the IANA identifier.
Therefore, it is difficult to match the actual media types.
We should either switch to a simpler approach in our modeling or think about how we ensure that IANA types are equal. We do not do any concrete type checking, which I don't see as part of the Information Model. But we should (at least) define what the preferred way of modeling this should be.
My suggestion:
a) Use IRIs:
<https://www.iana.org/assignments/media-types/application/xml>
    a ids:IANAMediaType ;
b) Use a datatype property with range xsd:string, where we can write things like "application/xml".
I prefer a).
Option a) seems fair to me. This would also improve my example above, because the Excel IANA media type is no longer defined in each custom namespace.
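As a minimal sketch of why option a) helps with matching: if every party derives the identifier from the IANA name, two descriptions of the same type end up with the same IRI, regardless of namespace. The function name `iana_media_type_iri` is hypothetical, the base IRI is the one used in the examples in this thread, and the sketch does not check whether the type is actually registered with IANA.

```python
# Base IRI as used in the examples of this thread (assumption: kept as https).
IANA_BASE = "https://www.iana.org/assignments/media-types/"


def iana_media_type_iri(media_type: str) -> str:
    """Build the IANA IRI for a media type name such as 'application/xml'.

    Hypothetical helper: it only checks the 'type/subtype' shape and does
    not verify that the type exists in the IANA registry.
    """
    top_level, _, subtype = media_type.strip().partition("/")
    if not top_level or not subtype:
        raise ValueError(f"not a 'type/subtype' media type name: {media_type!r}")
    return f"{IANA_BASE}{top_level}/{subtype}"


# Two parties describing Excel independently arrive at the same identifier,
# so matching becomes a plain IRI comparison.
iri_a = iana_media_type_iri("application/vnd.ms-excel")
iri_b = iana_media_type_iri(" application/vnd.ms-excel ")
print(iri_a == iri_b)
```

With a custom-namespace identifier such as my_namespace:APPLICATION_EXCEL, the same comparison would only work by inspecting rdfs:label.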
What steps are to be executed to solve this?
Our agreed solution is to follow these DCAT-2 examples and use
    dcat:mediaType <http://www.iana.org/assignments/media-types/text/csv> ;
via ids:mediaType, respectively.
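A rough sketch of what the agreed approach could look like on the JSON-LD side, reusing the context URL and the shape of the Representation example earlier in this thread. The function name and the representation `@id` below are hypothetical placeholders, and whether `@type` should still be stated on the media type node is an assumption carried over from that earlier example.

```python
import json


def representation_with_media_type(representation_id: str, media_type_iri: str) -> dict:
    """Return a JSON-LD Representation that points to an IANA media type IRI.

    Hypothetical helper mirroring the Representation example from this thread.
    """
    return {
        "@context": "https://w3id.org/idsa/contexts/2.1.0/context.jsonld",
        "@type": "Representation",
        "@id": representation_id,
        "mediaType": {
            "@type": "ids:IANAMediaType",
            "@id": media_type_iri,
        },
    }


doc = representation_with_media_type(
    "https://example.org/representations/1",  # hypothetical identifier
    "http://www.iana.org/assignments/media-types/text/csv",  # IRI from the DCAT-2 example
)
print(json.dumps(doc, indent=2))
```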
In MediaType.ttl we currently list media types such as TEXT_PLAIN, TEXT_XML, APPLICATION_MSWORD, and so on. Please extend the supported media types to include Excel etc., according to a standard or guideline, thanks!