Open jze opened 1 year ago
Maybe a newbie question: why is it not required to state explicitly that the resource the URI refers to is an instance of the dct:MediaType class?
if mimetype:
    mimetype_ref = URIRef(mimetype)
    g.add((mimetype_ref, RDF.type, DCT.MediaType))
    g.add((distribution, DCAT.mediaType, mimetype_ref))
I would like to add another comment concerning the same issue with the dcat:mediaType value. According to the DCAT-AP spec, both dct:format and dcat:mediaType are dct:MediaType.
In this sense, if you use the full IANA URI, for example https://www.iana.org/assignments/media-types/application/ld+json, or the URI from the data.europa.eu vocabulary, as suggested by the European Data Portal Metadata Quality Assessment Methodology, CKAN does not show the preview.
Below are 2 examples of what I mean. It is not just the preview that is affected but also the way the Dataset is later serialized.
Format as JSON_LD and mediaType as the short IANA definition application/ld+json. The preview is shown. In this case the properties of the Dataset are serialized as:
"dct:format": "JSON_LD"
"dcat:mediaType": "application/ld+json"
Format as http://publications.europa.eu/resource/authority/file-type/JSON_LD and mediaType as https://www.iana.org/assignments/media-types/application/ld+json. The preview is not shown. In this case the properties of the Dataset, serialized as JSON-LD, are:
"dct:format": {
    "@id": "http://publications.europa.eu/resource/authority/file-type/JSON_LD"
},
"dcat:mediaType": {
    "@id": "https://www.iana.org/assignments/media-types/application/ld+json"
}
As you can see, the first one is not fully compliant with DCAT-AP, but CKAN behaves as expected. The second is the other way round: compliant, but CKAN does not work as expected.
In this sense, I don't know whether it would be sensible to modify the dcat extension, mainly in the profiles definition, to check whether the values of format and mediaType are URI references or plain values. If they are URIs, we leave them untouched. If they aren't, the logical thing would be to search for a URI that "resembles" the value, or to directly prepend the IANA or Europa Vocabulary domains and paths to get the full URI.
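A minimal sketch of the "prepend" variant, to make the idea concrete. It is not code from the dcat extension: the helper names are hypothetical, the two base URIs are simply taken from the examples above, and it uses only the standard library. It does not attempt the "resembles" lookup, and it does not check that the resulting URI actually exists in either vocabulary.

```python
from urllib.parse import urlparse

# Base URIs assumed from the examples in this comment.
IANA_BASE = "https://www.iana.org/assignments/media-types/"
EU_FILE_TYPE_BASE = "http://publications.europa.eu/resource/authority/file-type/"


def is_uri(value: str) -> bool:
    """Return True if the value looks like an absolute http(s) URI."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def to_media_type_uri(value: str) -> str:
    """Leave URIs untouched; prepend the IANA base to short media types."""
    value = value.strip()
    return value if is_uri(value) else IANA_BASE + value


def to_format_uri(value: str) -> str:
    """Leave URIs untouched; prepend the EU file-type base to short codes."""
    value = value.strip()
    return value if is_uri(value) else EU_FILE_TYPE_BASE + value
```

With this, `to_media_type_uri("application/ld+json")` and `to_format_uri("JSON_LD")` would produce the two full URIs from the second example, while values that are already URIs pass through unchanged.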
What do you think? Should I try to work that out?
Thanks for your help and comments.
The range of dcat:mediaType has been tightened from dct:MediaTypeOrExtent to dct:MediaType as part of the revision of DCAT: https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_media_type
Currently a URI or a literal is returned: https://github.com/ckan/ckanext-dcat/blob/master/ckanext/dcat/profiles.py#L1411
Only a URI should be used at this point.
Yes, but the value is serialized as a literal only if it isn't a valid URI. This avoids producing an invalid serialized graph. That's more or less necessary, because the Python library rdflib will also serialize URI references whose values are not valid URIs.
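A rough sketch of that fallback, using only the standard library. It is not the actual ckanext-dcat code: the helper names are hypothetical, the validity check is a simplification (an http(s) scheme plus none of the characters that may not appear in an N-Triples IRI), and the literal escaping handles only backslashes and double quotes.

```python
# Characters that may not appear inside <...> in N-Triples/Turtle.
INVALID_URI_CHARS = set('<>" {}|\\^`')


def looks_like_valid_uri(value: str) -> bool:
    """Rough check: http(s) scheme and no forbidden characters."""
    return value.startswith(("http://", "https://")) and not (
        INVALID_URI_CHARS & set(value)
    )


def object_term(value: str) -> str:
    """Render a value as an N-Triples object term: <uri> or "literal"."""
    if looks_like_valid_uri(value):
        return f"<{value}>"
    # Fall back to a plain literal so the serialized graph stays valid.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

Under these assumptions, a full IANA URI is emitted as a URI reference, while a short value such as application/ld+json, or a URI containing a space, falls back to a literal.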