SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

dct:accessRights codelist #159

Closed bertvannuffelen closed 2 years ago

bertvannuffelen commented 3 years ago

In section 4.4.3. the usage note refers to a future activity of creating a codelist for access rights values. This has been done: see https://op.europa.eu/en/web/eu-vocabularies/at-dataset/-/resource/dataset/access-right

proposed resolution: update the usage note.

aidig commented 3 years ago

Indeed, and the controlled vocabulary should be listed in Chapter 5 along with the other concept schemes. The property accessRights has the range RightsStatement, so which property is to be used on RightsStatement to refer to the codelist for access rights types? I'm guessing dct:type?

aidig commented 3 years ago

Furthermore, the codelist now includes the following four categories:

| Code | Label | Valid since | Definition |
| --- | --- | --- | --- |
| NON_PUBLIC | non-public | 01-01-2013 | Not publicly accessible for privacy, security or other reasons. Usage note: This category may include resources that contain sensitive or personal information. |
| PUBLIC | public | 01-01-2013 | Publicly accessible by everyone. Usage note: Permissible obstacles include registration and request for API keys, as long as anyone can request such registration and/or API keys. |
| RESTRICTED | restricted | 01-01-2013 | Only available under certain conditions. Usage note: This category may include resources that require payment, resources shared under non-disclosure agreements, resources for which the publisher or owner has not yet decided if they can be publicly released. |
| SENSITIVE | sensitive | 18-03-2020 | Sensitive non-classified (SNC) information, information whose unauthorised disclosure could cause damage to the Commission or other interested parties such as businesses, companies, intellectual property or personal data but which is not EU classified information. |

The fourth category, SENSITIVE, is new (added this year). It seems to be scoped to the Commission and to overlap somewhat with the other categories?

init-dcat-ap-de commented 3 years ago

Especially with the cardinality of 0..1, SENSITIVE is a problem. Data can be both restricted and sensitive. SENSITIVE looks more like a classification level.
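
To illustrate the concern, here is a minimal sketch. It assumes the concept URIs follow the EU Vocabularies authority-table pattern `.../resource/authority/access-right/<CODE>`, and `ex:dataset` is a hypothetical resource. A dataset that is both restricted and sensitive would need two values, which a maximum cardinality of 1 does not allow:

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .

# Hypothetical dataset that is both restricted and sensitive.
# Two dct:accessRights values exceed the 0..1 cardinality.
ex:dataset a dcat:Dataset ;
  dct:accessRights
    <http://publications.europa.eu/resource/authority/access-right/RESTRICTED> ,
    <http://publications.europa.eu/resource/authority/access-right/SENSITIVE> .
```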

andrea-perego commented 3 years ago

@aidig said:

[...] The property accessRights has the range RightsStatement, so which property is to be used on RightsStatement to refer to the codelist for access rights types? I'm guessing dct:type?

I think the range restriction (which has actually been relaxed in the new version of DCTERMS) should not prevent the use of the access rights URIs as objects of dct:accessRights, as is already done for the language, file type, and other NALs:

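@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .  # placeholder namespace
ex:dataset a dcat:Dataset ;
  dct:accessRights <http://publications.europa.eu/resource/authority/access-right/PUBLIC> ;
.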

aidig commented 3 years ago

Thanks for the info Andrea. It does indeed seem like a simpler way to go about it.

However, is this alternative method reflected in DCAT or DCAT-AP? In the human-readable DCAT and DCAT-AP specifications, the range of dct:accessRights is dct:RightsStatement (dcam:rangeIncludes is not used), and in the machine-readable SHACL implementation of DCAT-AP, a value that is not typed as dct:RightsStatement results in a violation:

  sh:property [
      sh:path dct:accessRights ;
      sh:class dct:RightsStatement ;
      sh:maxCount 1 ;
      sh:severity sh:Violation ;
  ] ;

Maybe I've missed something? Please advise.
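
The violation can be reproduced with a small data sketch. `sh:class` checks that each value is an instance of `dct:RightsStatement` in the data graph, so a bare concept URI fails, while the same URI passes once the data graph also types it. The explicit typing triple is an illustrative assumption, not something DCAT-AP prescribes. `ex:dataset` is hypothetical, and the concept URI is assumed to follow the EU Vocabularies authority-table pattern:

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .

ex:dataset a dcat:Dataset ;
  dct:accessRights <http://publications.europa.eu/resource/authority/access-right/PUBLIC> .

# Without the triple below, the sh:class constraint reports a violation,
# because nothing in the data graph says the value is a dct:RightsStatement.
<http://publications.europa.eu/resource/authority/access-right/PUBLIC> a dct:RightsStatement .
```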

bertvannuffelen commented 3 years ago

There are 3 issues here in the discussion:

issue a) The available values in the codelist. The codelist contains concepts that might be neither useful nor applicable in the context of DCAT-AP. Since this codelist is maintained independently of DCAT-AP, there is no guarantee that the list will contain only values applicable to DCAT-AP.

proposed resolution: Unless the whole codelist has become unusable for the property, ignore the existence of inappropriate values and leave it up to the implementations to restrict the possible values in their catalogues. Every other approach has the side effect that the specification must encode the exact list of allowed values, which is hard to maintain from the perspective of the specification.
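
As a sketch of how an implementation might restrict the values on its own side, a catalogue could add a local SHACL shape with `sh:in`. The shape name, the `ex:` namespace, and the choice of three accepted codes are hypothetical, and the concept URIs are assumed to follow the EU Vocabularies authority-table pattern:

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/shapes/> .

# Hypothetical catalogue-specific shape: accept only three of the four codes.
ex:AccessRightsValueShape a sh:NodeShape ;
  sh:targetClass dcat:Dataset ;
  sh:property [
    sh:path dct:accessRights ;
    sh:in (
      <http://publications.europa.eu/resource/authority/access-right/PUBLIC>
      <http://publications.europa.eu/resource/authority/access-right/RESTRICTED>
      <http://publications.europa.eu/resource/authority/access-right/NON_PUBLIC>
    ) ;
  ] .
```

Because the list lives in the catalogue's own shapes, the DCAT-AP specification does not have to track changes to the codelist.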

issue b) The range restriction as part of the SHACL specification. As long as this range restriction is present in the specification, the SHACL representation will have it too. As discussed during the webinar on SHACL, one way to mitigate the enforcement of that statement in a DCAT-AP catalogue is to separate these constraints from the others and put them in another file.

Whether or not this restriction should be validated depends on a complex discussion about where the restriction is enforced: at the logical level or at the materialized data level. The specification is an expression at the logical level; SHACL is a specification for the data level. There are usage situations where both coincide, but also situations where they disagree.

proposed resolution: Implement the outcome of the SHACL webinar and split the SHACL expressions into different files. See draft specification 2.1.0.
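
A rough sketch of what such a split could look like follows. The file names and shape names are hypothetical and do not reflect the actual layout of the 2.1.0 draft. Both parts are shown in one block for brevity:

```turtle
# --- file: cardinality-shapes.ttl (hypothetical name) ---
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/shapes/> .

ex:DatasetCardinalityShape a sh:NodeShape ;
  sh:targetClass dcat:Dataset ;
  sh:property [
    sh:path dct:accessRights ;
    sh:maxCount 1 ;
  ] .

# --- file: range-shapes.ttl (hypothetical name) ---
# Repeat the prefix declarations above when this is stored as its own file.
ex:DatasetRangeShape a sh:NodeShape ;
  sh:targetClass dcat:Dataset ;
  sh:property [
    sh:path dct:accessRights ;
    sh:class dct:RightsStatement ;
  ] .
```

A catalogue that does not want the range enforced on its materialized data would then validate against the first file only.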

issue c) The alignment with DCTERMS 2020. In DCTERMS 2020, the rdfs:range statements have been removed from several properties and replaced with dcam:rangeIncludes. This indicates that the semantically intended range is still dct:RightsStatement, but that a machine should not infer or enforce it merely by importing dcterms. It is then up to the implementation to make that inference or enforcement.

proposed resolution: In any case, the alignment with DCTERMS 2020 should be done, preferably at the W3C DCAT level first, before incorporating it in DCAT-AP.
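
The difference described under issue c) can be sketched in Turtle as follows. The earlier statement is shown as a comment for comparison. This is an illustration of the change as described above, not a copy of the DCTERMS source files:

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dcam: <http://purl.org/dc/dcam/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Earlier DCTERMS: an RDFS reasoner that imports this statement infers that
# every object of dct:accessRights is a dct:RightsStatement.
# dct:accessRights rdfs:range dct:RightsStatement .

# DCTERMS 2020: the intended range is documented, but no RDFS entailment
# follows from it.
dct:accessRights dcam:rangeIncludes dct:RightsStatement .
```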

Note that users might already be confused at this moment, because they will download the latest version without spotting this subtle distinction.

bertvannuffelen commented 2 years ago

On the alignment with DCTERMS: https://github.com/w3c/dxwg/issues/1213

bertvannuffelen commented 2 years ago

Resolution for a) has been performed as follows: