hvdsomp / signposting


Add "license" links to the FAIR Signposting Profile? #11

Closed hvdsomp closed 3 years ago

hvdsomp commented 3 years ago

Should license links be introduced? Optional? At Level 1? At Level 2? The desire to add this capability came up in discussions regarding the Next Generation Repositories work from the Confederation of Open Access Repositories.

Note that the approach currently taken would allow expressing different licenses for the scholarly object as a whole and for individual content resources. That approach is already taken for other typed links too, e.g. cite-as, describedby, author. For example, regarding cite-as links pertaining to content resources, the doc says:

Provide a cite-as link only if the content resource has a persistent identifier that is distinct from the persistent identifier of the scholarly object as a whole.

Meaning, if no cite-as link is provided, the content resource inherits its PID from the scholarly object as a whole. But if a cite-as link is provided, it has to be distinct from that of the scholarly object as a whole. The same approach could be applied to license links.
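The inheritance rule described here can be sketched from a client's point of view. This is only a minimal illustration, not part of the spec: the function name `effective_target` and the example.org URLs are hypothetical.

```python
from typing import Optional


def effective_target(object_target: Optional[str],
                     resource_target: Optional[str]) -> Optional[str]:
    """Resolve the link target that applies to a content resource.

    If the content resource carries its own link (e.g. cite-as or license),
    that link applies; otherwise the value is inherited from the scholarly
    object as a whole.
    """
    if resource_target is not None:
        return resource_target
    return object_target


# Hypothetical identifiers, for illustration only.
OBJECT_PID = "https://example.org/pid/object"
FILE_PID = "https://example.org/pid/object/file1"

inherited = effective_target(OBJECT_PID, None)
overridden = effective_target(OBJECT_PID, FILE_PID)
```

The same resolution would apply to license links if they follow the cite-as pattern.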

martinklein0815 commented 3 years ago

+1 to license, meets our use case at a large US gov laboratory, and I imagine others are in the same boat. +1 to license at both levels, for the sake of consistency and potentially different licenses for landing page and content resource.

kitchenprinzessin3880 commented 3 years ago

The use of a standard URI for a license should be encouraged (if it fits the scope of the dataset) so that machines can understand it. https://spdx.org/licenses/ lists common licenses.

martinklein0815 commented 3 years ago

The use of a standard URI for a license should be encouraged (if it fits the scope of the dataset) so that machines can understand it. https://spdx.org/licenses/ lists common licenses.

+1 Thx for the pointer!

SPDX's version, e.g., https://spdx.org/licenses/CC-BY-ND-4.0.html, includes a link to https://creativecommons.org/licenses/by-nd/4.0/legalcode with rel="rdfs:seeAlso", which is helpful for machines. Do you have an idea whether it's possible to also include the "cite-as" link header in the HTTP response? IMO, that would further help machines understand what is going on.

Could look like:

    $ curl -I https://spdx.org/licenses/CC-BY-ND-4.0.html
    HTTP/2 200
    content-type: text/html
    content-length: 54945
    date: Fri, 16 Oct 2020 16:14:33 GMT
    last-modified: Mon, 03 Aug 2020 22:09:08 GMT
    x-amz-version-id: null
    etag: "717a8b8f28d12aa0efeddd8fd3f3a938"
    server: AmazonS3
    Link: <https://creativecommons.org/licenses/by-nd/4.0/legalcode> ; rel="cite-as"
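A client could pick up such a cite-as link roughly as follows. This is a simplified sketch, assuming the Link header value uses the angle-bracket target syntax of RFC 8288 and has no commas or semicolons inside quoted parameter values; the function name `links_by_rel` is made up for illustration, and a production client would want a complete Link header parser.

```python
import re
from typing import Dict, List

# One link-value: <target> followed by zero or more ;param segments.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# One parameter: ;name=value or ;name="value".
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?')


def links_by_rel(header_value: str) -> Dict[str, List[str]]:
    """Group the link targets of a Link header value by relation type."""
    result: Dict[str, List[str]] = {}
    for target, params in LINK_RE.findall(header_value):
        for name, value in PARAM_RE.findall(params):
            if name.lower() == "rel":
                # A rel parameter may list several space-separated types.
                for rel in value.split():
                    result.setdefault(rel.lower(), []).append(target)
    return result


header = '<https://creativecommons.org/licenses/by-nd/4.0/legalcode> ; rel="cite-as"'
parsed = links_by_rel(header)
```

With a header like the one proposed here, `parsed["cite-as"]` would hold the Creative Commons URI, which is the one a machine should use in place of the SPDX page URI.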

hvdsomp commented 3 years ago

I didn’t know about the SPDX license registry. It’s very neat in that it brings together so many licenses. But it’s problematic that the licenses are assigned a URI with the SPDX base URL. The rdfs:seeAlso link to the actual license URIs is a nice touch, but its semantics are quite weak. As @martinklein0815 indicates, the cite-as link relation type would be much more appropriate, as it has the exact semantics that are needed here. See RFC 8574 for more info on the cite-as link relation type.

I wonder whether anyone knows the people that operate the SPDX registry?

kitchenprinzessin3880 commented 3 years ago

Here is the source GitHub repository of SPDX: https://github.com/spdx/license-list-data
JSON file: https://raw.github.com/spdx/license-list-data/master/json/licenses.json

@hvdsomp @martinklein0815 we should not use a URI that is not permanent. @mfenner, do you know of any practices for identifying a license with a stable/standard URI?

mfenner commented 3 years ago

I have started to normalize license URLs in DataCite metadata using SPDX and the rightsIdentifier property we introduced in DataCite Schema 4.1. What I do is use the SPDX identifier, e.g. cc-by-nd-4.0 (as lowercase, as that works better in URLs) and the canonical URL described in SPDX, so https://creativecommons.org/licenses/by-nd/4.0/legalcode not https://spdx.org/licenses/CC-BY-ND-4.0.html. I like SPDX as it provides a human-readable string as identifier, and it helps to normalize the license URL.
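The normalization described in this comment might look something like the sketch below. It is an assumption-laden illustration, not the actual DataCite code: the one-entry lookup table, the URL-matching heuristic (ignore scheme, case, trailing slash, and a trailing /legalcode), and the output key `rightsUri` are all hypothetical; only the lowercase SPDX identifier and the canonical URL come from the comment itself.

```python
import re
from typing import Dict, Optional

# Hypothetical excerpt: SPDX identifier -> canonical license URL.
SPDX_CANONICAL = {
    "CC-BY-ND-4.0": "https://creativecommons.org/licenses/by-nd/4.0/legalcode",
}


def _key(url: str) -> str:
    """Reduce a license URL to a comparison key (heuristic, see lead-in)."""
    url = re.sub(r"^https?://", "", url.strip().lower()).rstrip("/")
    if url.endswith("/legalcode"):
        url = url[: -len("/legalcode")]
    return url


_LOOKUP = {_key(url): (spdx_id, url) for spdx_id, url in SPDX_CANONICAL.items()}


def normalize_license(url: str) -> Optional[Dict[str, str]]:
    """Map a license URL variant to a lowercase SPDX id and canonical URL."""
    match = _LOOKUP.get(_key(url))
    if match is None:
        return None  # unknown license: leave the metadata untouched
    spdx_id, canonical = match
    return {"rightsIdentifier": spdx_id.lower(), "rightsUri": canonical}


normalized = normalize_license("http://creativecommons.org/licenses/by-nd/4.0/")
```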

kitchenprinzessin3880 commented 3 years ago

What I do is use the SPDX identifier, e.g. cc-by-nd-4.0 (as lowercase, as that works better in URLs) and the canonical URL described in SPDX, so https://creativecommons.org/licenses/by-nd/4.0/legalcode not https://spdx.org/licenses/CC-BY-ND-4.0.html

+1, *.html should be avoided. Regarding the canonical URL, I notice that in the SPDX JSON file one license can have more than one canonical URL (see 'seeAlso'). How can we find out the authoritative/preferred URL from the license provider?

hvdsomp commented 3 years ago

@kitchenprinzessin3880 It was not the intent to suggest using the SPDX license URIs. Quite to the contrary; clearly the original URIs for the licenses should be used. Having said that, SPDX provides a very nice registry of licenses that could be used for discovery purposes. Hence, combining the idea of using SPDX as a registry for discovery and the desire not to use SPDX URIs of licenses, our suggestion was to make it unambiguously clear in the SPDX descriptions that another license URI should be used. Currently rdfs:seeAlso is used to convey that message to machines. The semantics of rdfs:seeAlso are very loose though and definitely don't suggest "replace this URI by another". The cite-as relation type was introduced for that exact purpose. If used on an SPDX landing page for a license, it would explicitly indicate not to use the SPDX URI but the URI that is the target of the cite-as link, which should be the original license URI.

I notice, indeed, that there are sometimes multiple original license URIs listed under the same SPDX license URI ...

kitchenprinzessin3880 commented 3 years ago

'seeAlso' includes the original URI and also URIs aggregated by https://opensource.org/ (for a full list see https://opensource.org/licenses/alphabetical).
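One heuristic for narrowing down the 'seeAlso' candidates is to separate opensource.org entries from the rest. The sketch below is only that, a heuristic: it does not establish which URI the license provider considers authoritative, the function name `split_see_also` is hypothetical, and the sample entry is an illustrative record in the shape of licenses.json rather than a verified copy of it.

```python
from typing import List, Tuple
from urllib.parse import urlparse


def split_see_also(see_also: List[str]) -> Tuple[List[str], List[str]]:
    """Split seeAlso URLs into (other, opensource.org-hosted) lists."""
    other: List[str] = []
    aggregated: List[str] = []
    for url in see_also:
        host = urlparse(url).hostname or ""
        if host == "opensource.org" or host.endswith(".opensource.org"):
            aggregated.append(url)
        else:
            other.append(url)
    return other, aggregated


# Illustrative entry; check the actual SPDX file before relying on it.
entry = {
    "licenseId": "Apache-2.0",
    "seeAlso": [
        "https://www.apache.org/licenses/LICENSE-2.0",
        "https://opensource.org/licenses/Apache-2.0",
    ],
}

candidates, osi_urls = split_see_also(entry["seeAlso"])
```

If more than one candidate remains, the choice still needs human judgment or an explicit signal from the registry.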

zimeon commented 3 years ago

I agree with @martinklein0815's comment (https://github.com/hvdsomp/signposting/issues/11#issuecomment-709438958) that license is useful and that different content resources may, in some circumstances, have different licenses, so it is useful to have at both landing page and content resource levels. Also, Signposting-unaware clients may find a license on a particular content resource but not be able to find it via the landing page.

hvdsomp commented 3 years ago

License links are now described in the spec, both at the landing page (to convey license info for the scholarly object as a whole) and at the content resource (to convey license info pertaining to the content resource in case it is distinct from that of the scholarly object as a whole).
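On the publishing side, the outcome could be sketched as follows. This is an illustration under assumptions, not text from the spec: both function names are hypothetical, and the example.org license URI is made up.

```python
from typing import Optional


def landing_page_license_link(object_license: str) -> str:
    """Link header value conveying the license of the scholarly object."""
    return f'<{object_license}>; rel="license"'


def content_resource_license_link(object_license: str,
                                  resource_license: Optional[str]) -> Optional[str]:
    """Link header value for a content resource, or None if it inherits.

    A license link is emitted only when the content resource has a license
    distinct from that of the scholarly object as a whole.
    """
    if resource_license is None or resource_license == object_license:
        return None
    return f'<{resource_license}>; rel="license"'


OBJECT_LICENSE = "https://creativecommons.org/licenses/by-nd/4.0/legalcode"
RESOURCE_LICENSE = "https://example.org/licenses/custom"  # hypothetical

landing = landing_page_license_link(OBJECT_LICENSE)
same = content_resource_license_link(OBJECT_LICENSE, OBJECT_LICENSE)
distinct = content_resource_license_link(OBJECT_LICENSE, RESOURCE_LICENSE)
```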