netwerk-digitaal-erfgoed / requirements-datasets

Requirements for datasets
https://netwerk-digitaal-erfgoed.github.io/requirements-datasets/

Add clarification on license IRIs #73

Open bencomp opened 2 years ago

bencomp commented 2 years ago

The Dataset Registry contains over 400 datasets without a license, even though a license is required. (The linked SPARQL query groups all typed subjects by `?type`, `?rights`, `?license` and `?simpleRights`, with `dc:rights|dct:rights`, `schema:license|dct:license` and `edm:rights|dct:rightsStatement` matched as optional properties.) Other datasets have varying license IRIs that appear to refer to the same terms. For example, public domain 'licenses':

There is of course a difference between the CC0 waiver and the PD mark, but it would be good to have more guidance on which IRIs to use for each. I see that the SHACL propertyShape accepts literals, but this doesn't help users looking for datasets with specific licenses. The conceptual model also shows that the target of a license predicate is an IRI, so hopefully newly added licenses are always IRIs (not full license terms or acronyms).
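To illustrate the literal-versus-IRI point, here is a minimal Python sketch of the kind of check a validator could apply to a license value before accepting it. The function name `looks_like_license_iri` and the rule "absolute http(s) IRI without whitespace" are my own assumptions for illustration; they are not part of the NDE SHACL shapes.

```python
from urllib.parse import urlsplit


def looks_like_license_iri(value: str) -> bool:
    """Return True if value looks like an absolute http(s) IRI.

    Hypothetical helper: a rough stand-in for requiring an IRI node
    instead of a literal such as "CC0" or full license text.
    """
    value = value.strip()
    # Acronyms are not IRIs; free text usually contains whitespace.
    if not value or any(ch.isspace() for ch in value):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. broken IPv6 hosts.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

With this check, a value like `CC0` or `Creative Commons Zero` would be rejected, while `https://creativecommons.org/publicdomain/zero/1.0/` would pass. It says nothing about whether the IRI identifies a known license, only that it has the shape of one.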

coret commented 2 years ago

Thanks, good observation. Indeed, it doesn't help that the validation of dataset descriptions allows literals for the license. This should be fixed in both the specification and the validation.

The diversity of IRIs for specific licenses does call for more guidance. We need to find a good resource for this. The current "for example one of the Creative Commons licenses" line helps you find a license, but not directly an IRI.

EnnoMeijers commented 2 years ago

Would it be possible to reuse some of the work that is being done by the DALICC project? See https://github.com/dalicc

bencomp commented 2 years ago

@EnnoMeijers that looks very interesting, but I couldn't quickly find a list of CC license IRIs in their repositories. https://raw.githubusercontent.com/dalicc/dalicc/main/licensedata/licenselibrary/licenselibrary.ttl refers to yet other IRIs (e.g. <https://creativecommons.org/licenses/by-nc/2.0/au/legalcode>) for the CC licenses. They do link to licenses for specific jurisdictions, which I haven't seen much of yet in "our" datasets.

A list of suggested IRIs would be most helpful. For example, "If you want to waive rights using CC0 1.0, use ... as license IRI".
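As a rough sketch of what such guidance could look like in practice, the Python snippet below folds common variants of a Creative Commons IRI into one form. The canonical form chosen here (https, no `www.`, trailing slash, no `legalcode` or `deed` segment) and the function name `normalize_cc_iri` are assumptions for illustration, not an agreed NDE recommendation. The sample variants are made up to show the idea and are not taken from the Dataset Registry.

```python
from urllib.parse import urlsplit


def normalize_cc_iri(iri: str) -> str:
    """Fold variants of a creativecommons.org IRI into one assumed form.

    Non-CC IRIs are returned unchanged (apart from stripped whitespace).
    """
    iri = iri.strip()
    parts = urlsplit(iri)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "creativecommons.org":
        return iri
    segments = [s for s in parts.path.split("/") if s]
    # Drop trailing "legalcode", "deed", "deed.nl", "legalcode.nl", etc.
    while segments and segments[-1].split(".")[0] in ("legalcode", "deed"):
        segments.pop()
    if not segments:
        return iri
    return "https://creativecommons.org/" + "/".join(segments) + "/"


# Illustrative variants only; CC0 and the PD mark stay distinct.
variants = [
    "http://creativecommons.org/publicdomain/zero/1.0/",
    "https://creativecommons.org/publicdomain/zero/1.0/deed.nl",
    "https://creativecommons.org/publicdomain/zero/1.0/legalcode",
    "http://creativecommons.org/publicdomain/mark/1.0/",
]
canonical = sorted({normalize_cc_iri(v) for v in variants})
```

Here the three CC0 variants collapse into a single IRI while the PD mark keeps its own, so the difference between the waiver and the mark is preserved. Jurisdiction-specific paths such as `/licenses/by-nc/2.0/au/` are deliberately kept as they are, since they refer to different license texts.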