Closed bertvannuffelen closed 9 months ago
tl;dr: Here, I must (traditionally) object to restricting the values of a license to a controlled vocabulary, unless it somehow addresses the case in Czechia (Slovakia is similar), where linking to a single license is simply not enough. Multiple aspects of the legal side of an open dataset distribution need to be addressed separately, and there is a difference between a distribution that is freely available and one that is actually protected by copyright but licensed under CC-BY 4.0 or similar. Our terms of use specification in RDF is a structured thing that looks like this:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix tos: <https://data.gov.cz/slovník/podmínky-užití/> .
@prefix distro: <https://data.gov.cz/zdroj/datové-sady/00006947/2de2b4d44488c64f972f6fb0b8805122/distribuce/3429fbfa4dc95ab280c26dfd25d7cf58> .
@prefix terms: <https://data.gov.cz/zdroj/datové-sady/00006947/2de2b4d44488c64f972f6fb0b8805122/distribuce/3429fbfa4dc95ab280c26dfd25d7cf58/podmínky-užití> .
distro: a dcat:Distribution ;
dcterms:license terms: .
terms: a tos:Specifikace ;
tos:autorské-dílo <https://data.gov.cz/podmínky-užití/neobsahuje-autorská-díla/> ;
tos:databáze-chráněná-zvláštními-právy <https://data.gov.cz/podmínky-užití/není-chráněna-zvláštním-právem-pořizovatele-databáze/> ;
tos:databáze-jako-autorské-dílo <https://data.gov.cz/podmínky-užití/není-autorskoprávně-chráněnou-databází/> ;
tos:osobní-údaje <https://data.gov.cz/podmínky-užití/obsahuje-osobní-údaje/> .
or
distro: a dcat:Distribution ;
dcterms:license terms: .
terms: a tos:Specifikace ;
tos:autor "Český úřad zeměměřický a katastrální"@cs , "Czech Office for Surveying, Mapping and Cadastre"@en ;
tos:autor-databáze "Český úřad zeměměřický a katastrální"@cs , "Czech Office for Surveying, Mapping and Cadastre"@en ;
tos:autorské-dílo <https://creativecommons.org/licenses/by/4.0/> ;
tos:databáze-chráněná-zvláštními-právy <https://www.cuzk.cz/Predpisy/Podminky-poskytovani-prostor-dat-a-sitovych-sluzeb/Podminky-poskytovani-prostorovych-dat-CUZK.aspx> ;
tos:databáze-jako-autorské-dílo <https://creativecommons.org/licenses/by/4.0/> ;
tos:osobní-údaje <https://data.gov.cz/podmínky-užití/neobsahuje-osobní-údaje/> .
The specification basically addresses 4 categories of "terms of use", each one separately, and the combination can be unique for each distribution. A full explanation is available in the soon-to-be-published deliverable of the STIRData project: STIRData-Legal.pdf.
And, legally speaking, when using data from Czechia, no matter where you are from, you need to understand the terms of use in order to use the data correctly. Unfortunately, this is governed by national legislation and cannot always be "technically simplified" into one CC link.
The following diagram shows what should be considered when a dataset is being published.
@jakubklimek I think we should separate here some concerns.
The proposal consists of several steps:
An assessment:
About separation of concerns:
For me it is really fine to provide a dct:rights statement which would be linked to a permissive clause that also occurs in CC-BY 4.0.
@bertvannuffelen I agree with your assessment. I actually misread the proposal and thought that it proposed limiting the values of dcterms:license to a codelist. That is not the case: it restricts the values of the rdfs:seeAlso triples relating the license to well-known licenses.
Still, in the Czech case, a generic rdfs:seeAlso/owl:sameAs relationship is not specific enough, as we use different properties covering different aspects of Czech copyright law. So a simple seeAlso relation to, e.g., CC-BY 4.0 is insufficient and confusing, as it is unclear which copyright category it should relate to. We have 3 categories, plus information about whether the dataset contains personal information; each needs to be addressed separately, and each can be addressed, e.g., by a different CC variant.
Practically, it would mean that we would not be able to provide such mapping for any of the Czech datasets, except those that are completely out of scope of the copyright law. Those could be viewed as similar to CC0 from the point of view of effects on the data consumer, even though not semantically equivalent.
And I am a bit worried that, e.g., data.europa.eu could build a quality metric based on whether a dataset license is explicitly related to a CC license, even though for some Czech datasets this simple mapping cannot be done for legal reasons.
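The ambiguity described above can be sketched in Turtle by reusing the terms: resource from the second example. The rdfs:seeAlso triple is hypothetical and written only to show what a generic mapping might look like; the remaining triples are copied from that example.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tos: <https://data.gov.cz/slovník/podmínky-užití/> .
@prefix terms: <https://data.gov.cz/zdroj/datové-sady/00006947/2de2b4d44488c64f972f6fb0b8805122/distribuce/3429fbfa4dc95ab280c26dfd25d7cf58/podmínky-užití> .

# Hypothetical generic mapping: it does not say which legal aspect
# (copyrighted work, database as a work, sui generis database right)
# the CC-BY 4.0 link is meant to cover.
terms: rdfs:seeAlso <https://creativecommons.org/licenses/by/4.0/> .

# Aspect-specific statements, copied from the second example above:
# two aspects point to CC-BY 4.0, one to a custom set of conditions.
terms: tos:autorské-dílo <https://creativecommons.org/licenses/by/4.0/> ;
    tos:databáze-jako-autorské-dílo <https://creativecommons.org/licenses/by/4.0/> ;
    tos:databáze-chráněná-zvláštními-právy <https://www.cuzk.cz/Predpisy/Podminky-poskytovani-prostor-dat-a-sitovych-sluzeb/Podminky-poskytovani-prostorovych-dat-CUZK.aspx> ;
    tos:osobní-údaje <https://data.gov.cz/podmínky-užití/neobsahuje-osobní-údaje/> .
```

Read alone, the first triple could suggest that the whole distribution is available under CC-BY 4.0, while the aspect-specific triples show that one of the rights is governed by different conditions.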
@jakubklimek, here we bump into the limits of DCAT-AP and legislative compliance.
With the proposal we try to get as far as possible with tools provided in DCAT-AP in supporting the legislative information requirements of the HVD.
But that does not mean that there are no other alternative ways to get to that case.
I suggest you check with CNECT and the Czech authorities responsible for HVD whether this information is in line with the HVD.
The HVD directive uses in the Annex the formulation: "under the conditions of the Creative Commons BY 4.0 licence or any equivalent or less restrictive open licence;" This statement does not exclude the Czech case, but the Czech case is harder to validate, as it would require assessing each individual right.
About your concern that data.europa.eu could install a quality metric which might give the wrong impression (of non-compliance): as far as I understand, DCAT-AP HVD will not be able to provide a statement of compliance. It provides only a commonly agreed way of reading DCAT-AP in the context of HVD, so that end-users of data.europa.eu can read the metadata provided by Italy and Poland in the context of the HVD in the same way. Some aspects, such as what counts as a less restrictive open licence, are beyond the capabilities of DCAT-AP. So far I have not heard of any quality metric to be installed.
@bertvannuffelen I see in the DCAT-AP HVD 2.2.0 diagram and the description that owl:sameAs on the dct:LicenseDocument is mandatory if the license document is not from the NAL.
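As a minimal sketch of the constraint as I read it, a license document that is not taken from the NAL would carry an owl:sameAs link like the one below. The example.org URI is a placeholder, and the choice of CC-BY 4.0 as the link target is an assumption made only for illustration.

```turtle
@prefix dct: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Hypothetical national licence document, not taken from the NAL.
<https://example.org/licence/national-open-licence>
    a dct:LicenseDocument ;
    # Assumed target: asserts full identity with CC-BY 4.0.
    owl:sameAs <https://creativecommons.org/licenses/by/4.0/> .
```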
Given the discussion above, I have 2 problems.
owl:sameAs seems too strong for this occasion - I would rather see something like a type property
1..1 multiplicity only in some cases. That is, in my opinion, another reason for making it only recommended, i.e. 0..1, not optional, to avoid confusion.
In the Candidate Release of DCAT-AP HVD, the encoding of the support for legal experts to assess the permissiveness compared to CC-BY 4.0 has been made more open-ended. The above remarks on owl:sameAs and additional descriptive information are present as a recommendation in the section on legal information.
The HVD requires quality licensing information, and in particular the use of a permissive open licence such as CC-BY 4.0.
proposal
For HVD, the licence information shall be given by the property dct:license with a URI value (persistent link). The URI should be dereferenceable and thus provide both a machine-readable (RDF) representation and a human-readable text.
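A minimal sketch of this pattern in Turtle, assuming placeholder example.org URIs for both the distribution and the licence document (neither comes from the proposal itself):

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .

# Hypothetical distribution that points to its licence by URI.
<https://example.org/dataset/1/distribution/csv>
    a dcat:Distribution ;
    dct:license <https://example.org/licence/national-open-licence> .

# Statements that dereferencing the licence URI might return as RDF;
# the same URI would also serve a human-readable page.
<https://example.org/licence/national-open-licence>
    a dct:LicenseDocument .
```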
To indicate relationship with CC-BY 4.0