w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

dcat:accessURL constraint is too restrictive #1588

Open bertvannuffelen opened 4 months ago

bertvannuffelen commented 4 months ago

Hi,

In section 6.8.9 the following statement is made:

dcat:accessURL matches the property-chain dcat:accessService/dcat:endpointURL. In the RDF representation of DCAT this is axiomatized as an OWL property-chain axiom.

This is a very restrictive formulation. It means that from the moment you associate a DataService with a Distribution, the values of the two properties are identical. But that is often not the case.

Suppose the following situation:

Agency A builds an API service for all geospatial data. Each layer in the API corresponds to a dataset, e.g. the road network, the public transport lines, the bus stops. For each dataset there is a file dump.

The accessURL for the road network is http://geo.api.example.com/roadnetwork.dump, for the public transport lines it is http://geo.api.example.com/publicTransport.dump, and for the bus stops it is http://geo.api.example.com/busstops.dump. The API is shielded with access credentials and the endpointURL is http://geo.api.example.com/api/wfs.

In that case the URLs are connected but do not have the same value. That violates the axiom.
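
To make the effect of the axiom concrete, here is a minimal sketch in plain Python (no RDF library) of what a reasoner applying the property chain would derive for the road network dump. It assumes the usual OWL reading of a property-chain axiom, where the chain dcat:accessService/dcat:endpointURL entails dcat:accessURL. The identifiers `ex:roadnetwork-dump` and `ex:geo-api` and the helper names are hypothetical; only the URLs come from the example.

```python
# Property names as plain strings; no RDF library is needed for this sketch.
ACCESS_URL = "dcat:accessURL"
ACCESS_SERVICE = "dcat:accessService"
ENDPOINT_URL = "dcat:endpointURL"

# Hypothetical identifiers (ex:...); the URLs are those from the example.
triples = {
    ("ex:roadnetwork-dump", ACCESS_URL, "http://geo.api.example.com/roadnetwork.dump"),
    ("ex:roadnetwork-dump", ACCESS_SERVICE, "ex:geo-api"),
    ("ex:geo-api", ENDPOINT_URL, "http://geo.api.example.com/api/wfs"),
}


def apply_property_chain(graph):
    """Return the new accessURL triples entailed by accessService/endpointURL."""
    inferred = set()
    for dist, p1, service in graph:
        if p1 != ACCESS_SERVICE:
            continue
        for subj, p2, url in graph:
            if subj == service and p2 == ENDPOINT_URL:
                inferred.add((dist, ACCESS_URL, url))
    # Keep only triples that were not already asserted.
    return inferred - graph


def access_urls(graph, dist):
    """All accessURL values of one distribution, sorted for stable output."""
    return sorted(o for s, p, o in graph if s == dist and p == ACCESS_URL)


inferred = apply_property_chain(triples)
closure = triples | inferred
print(access_urls(closure, "ex:roadnetwork-dump"))
```

Under these assumptions the distribution ends up with two accessURL values: the dump URL that was asserted and the WFS endpoint URL that the chain adds, even though the endpoint is shielded with credentials.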

Let's continue the example. The DPO states that the bus stops are no longer public data but private data. Therefore the dump is still available, but only on request. This can be reflected by replacing the accessURL with http://agency.com/accessToBusstopsData.html. Observe that in this example the technical infrastructure of the dataset and its distribution has not changed, only the access possibility. Yet again, that violates the axiom.

Proposal: Given the examples, and given that the axiom ties the values of two classes together very strongly (beyond vocabulary terminology), the proposal is to remove the axiom.

jakubklimek commented 4 months ago

@bertvannuffelen In your example, you assume that the distributions describing the dump files will in addition have dcat:accessService. However, the way I see it, the three file dumps would be three distributions and the distribution with the access service would be a fourth one, with no downloadURL, just accessURL, according to the axiom.

Overall though, I agree that the axiom is too strong, as it does not take access restrictions into account: we might have a data service with a known but publicly inaccessible endpointURL, while the distribution points to a publicly readable accessURL explaining how to get access, which is a different URL than the endpointURL. Here, the axiom would not hold.
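
The two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distribution and service identifiers (`ex:...`) and the helper `chain_mismatches` are hypothetical, each distribution is simplified to a single accessURL, and the request page URL from the original example is reused for the restricted case.

```python
ENDPOINT = "http://geo.api.example.com/api/wfs"

# Hypothetical service identifier; the endpoint URL is the one from the example.
services = {"ex:geo-api": {"endpointURL": ENDPOINT}}

# Four distributions: three file dumps plus one reached through the service.
distributions = {
    "ex:roadnetwork-dump": {"accessURL": "http://geo.api.example.com/roadnetwork.dump"},
    "ex:publicTransport-dump": {"accessURL": "http://geo.api.example.com/publicTransport.dump"},
    "ex:busstops-dump": {"accessURL": "http://geo.api.example.com/busstops.dump"},
    # Fourth distribution: no downloadURL, accessURL equals the endpointURL.
    "ex:wfs-access": {"accessURL": ENDPOINT, "accessService": "ex:geo-api"},
}


def chain_mismatches(dists, svcs):
    """List distributions whose accessURL differs from their service's endpointURL."""
    mismatches = []
    for name, dist in dists.items():
        service = dist.get("accessService")
        if service is None:
            continue  # the chain does not apply without dcat:accessService
        if dist["accessURL"] != svcs[service]["endpointURL"]:
            mismatches.append(name)
    return mismatches


print(chain_mismatches(distributions, services))

# Restricted-access variant: the accessURL now points at a page explaining
# how to request access instead of at the shielded endpoint.
restricted = dict(distributions)
restricted["ex:wfs-access"] = {
    "accessURL": "http://agency.com/accessToBusstopsData.html",
    "accessService": "ex:geo-api",
}
print(chain_mismatches(restricted, services))
```

With four separate distributions the check reports no mismatch, while the restricted variant flags `ex:wfs-access`, which is the case where the axiom would not hold.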