SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

Definition of the property “access rights” (Dataservice) #302

Open oystein-asnes opened 9 months ago

oystein-asnes commented 9 months ago

The current definition of the property “access rights” is "Information regarding access or restrictions based on privacy, security, or other policies."

This does not correspond 100% with the controlled vocabulary that chapter 10.2 says applies to this property, namely the Access Rights Named Authority List.

Proposal: Change the definition to “Information that indicates whether the Dataservice is public, has access restrictions or is not public.” See also #301.

Disclaimer: Access rights statements on both Dataservices and Datasets may introduce a risk of inconsistency (e.g. the dataservice is public and the dataset it serves is restricted). A solution here could be to drop the requirement for using the Access Rights Named Authority List on Dataservice and encourage technical restrictions here (like "API-key required" or "Load limitations apply").
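
The inconsistency risk described in the disclaimer can be illustrated with a small check. This is only a sketch: the dictionary layout, the key names (`accessRights`, `servesDataset`), the function name and the "openness" ordering of the three codes are assumptions made for illustration, not something DCAT-AP prescribes. Only the three NAL code URIs are taken from the Access Rights Named Authority List.

```python
# Base URI of the Access Rights Named Authority List (EU Publications Office).
NAL = "http://publications.europa.eu/resource/authority/access-right/"

# Assumed ordering from most open to most closed; illustration only.
OPENNESS = {NAL + "PUBLIC": 0, NAL + "RESTRICTED": 1, NAL + "NON_PUBLIC": 2}


def find_inconsistencies(service, datasets):
    """Return ids of served datasets that are less open than the service."""
    service_level = OPENNESS[service["accessRights"]]
    served = set(service["servesDataset"])
    return [
        d["id"]
        for d in datasets
        if d["id"] in served and OPENNESS[d["accessRights"]] > service_level
    ]


# Hypothetical catalogue records.
service = {
    "id": "svc-1",
    "accessRights": NAL + "PUBLIC",
    "servesDataset": ["ds-1", "ds-2"],
}
datasets = [
    {"id": "ds-1", "accessRights": NAL + "PUBLIC"},
    {"id": "ds-2", "accessRights": NAL + "RESTRICTED"},
    {"id": "ds-3", "accessRights": NAL + "NON_PUBLIC"},  # not served by svc-1
]

print(find_inconsistencies(service, datasets))  # ['ds-2']
```

Here the public service `svc-1` serves the restricted dataset `ds-2`, which is the kind of mismatch the disclaimer warns about.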

bertvannuffelen commented 9 months ago

@oystein-asnes I agree that the interpretation of access_rights for Data Services is different than for Datasets.

Strongly connected, but different.

For Datasets (the original setting) this is bound to the legislative interpretation of PSI "Open Data" policy. For Data Services, it is more a technical interpretation: are there any substantial restrictions on using this service?

To ease the publishers' effort, I would avoid a fine-grained listing of the kinds of restrictions that are present, and would instead apply a relatively simple distinction (along similar lines to the current NAL).

PUBLIC = any imposed technical requirement is not blocking access. It means no human assessment, no complex checking in knowledge bases whether the requester should be blocked from access. So a simple, fully automated request for authentication tokens, to ensure that a user of the data service will "well-behave" when using it, is acceptable in this case (e.g. the limitations that apply to a public wifi access point).

NON_PUBLIC = access is only granted after a (formal) acceptance assessment. Typically it includes contractual agreements.

Personally I do not know if there is a need for a value in the middle. With a little interpretation freedom, this interpretation matches the values in the existing codelist.

You see that I constrain the semantics of the codes to pure "access" and not to the broader SLA concepts like rate limitations, payload sizes, etc. All these elements should be documented somewhere. But I would not include that in the codelist.
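
The PUBLIC / NON_PUBLIC distinction proposed in this comment could be sketched as a simple decision rule. The class, its field names and the function below are hypothetical and only restate the criteria given above (human assessment or contractual agreement versus fully automated requirements); they are not part of DCAT-AP or the NAL.

```python
from dataclasses import dataclass


@dataclass
class AccessProcedure:
    """Hypothetical description of how access to a data service is obtained."""

    requires_human_assessment: bool  # someone has to review the request
    requires_contract: bool  # access depends on a contractual agreement
    requires_automated_token: bool = False  # e.g. self-service API key


def classify(procedure: AccessProcedure) -> str:
    """Apply the proposed rule: only non-automated barriers make a service NON_PUBLIC."""
    if procedure.requires_human_assessment or procedure.requires_contract:
        return "NON_PUBLIC"
    # Fully automated requirements (such as a self-service token) do not block access.
    return "PUBLIC"


print(classify(AccessProcedure(False, False, requires_automated_token=True)))  # PUBLIC
print(classify(AccessProcedure(True, True)))  # NON_PUBLIC
```

Rate limits and payload sizes are deliberately absent from the rule, in line with keeping SLA concepts out of the codelist.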

oystein-asnes commented 9 months ago

I like it, and agree to keep it simple. My only worry is that NON_PUBLIC in the Dataset world implies "contains sensitive or personal information". Use BY_APPLICATION?

kuldaraas commented 8 months ago

Widening the issue here, @bertvannuffelen, feel free to open a new ticket if relevant.

  1. The intended use of the element "access rights" for the dataset and data service is confusing. Following the current DCAT-AP descriptions, we have concluded that at Dataset level the use of the OP rights vocabulary is intended, whereas at Data Service level "access rights" is instead a narrative description or a URL which includes additional details (i.e. similar to the "rights" element for Distribution). Can you please clarify the intended use?
  2. The HVD profile presents the element "rights" (0..*), which seems to be a typo, as it a) has the wrong name (rights vs access rights) and b) allows repeatability, which goes against the root DCAT-AP definition.

jakubklimek commented 7 months ago

I like both approaches. I like the simple distinction PUBLIC/NON_PUBLIC to know whether I, as a user, can somehow easily access the service. This could be the accessRights for data service.

But I also see value in specifying the commonly appearing restrictions - API key required, Rate limitation, Geofencing applied - and this could become a new property and a new codelist. I see a use case both for seeing what common limitations are on a specific data service, and for searching for data services e.g. with no API key required, i.e. usable directly, automatically.
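
The search use case mentioned here could look roughly like the following. This is a sketch under stated assumptions: no such property or codelist exists yet, so the `restrictions` key and the code values `API_KEY_REQUIRED`, `RATE_LIMITED` and `GEOFENCED` are invented for illustration.

```python
# Hypothetical codelist values; nothing like this is defined in DCAT-AP today.
API_KEY_REQUIRED = "API_KEY_REQUIRED"
RATE_LIMITED = "RATE_LIMITED"
GEOFENCED = "GEOFENCED"


def without_restriction(services, restriction):
    """Return ids of data services that do not declare the given restriction."""
    return [s["id"] for s in services if restriction not in s["restrictions"]]


# Hypothetical catalogue entries.
services = [
    {"id": "svc-a", "restrictions": {API_KEY_REQUIRED, RATE_LIMITED}},
    {"id": "svc-b", "restrictions": {RATE_LIMITED}},
    {"id": "svc-c", "restrictions": set()},
]

# Services usable directly, without first obtaining an API key.
print(without_restriction(services, API_KEY_REQUIRED))  # ['svc-b', 'svc-c']
```

Keeping the restrictions in a separate multi-valued property leaves accessRights as the simple PUBLIC/NON_PUBLIC signal while still supporting this kind of filter.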