italia / ckanext-dcatapit

CKAN extension for the Italian Open Data Portals (DCAT_AP-IT)
GNU Affero General Public License v3.0

strange behaviour of RDF harvesting if rightsHolder is missing #1

Open giorgialodi opened 5 years ago

giorgialodi commented 5 years ago

When the rightsHolder property is absent from the source catalogue (meaning that the source catalogue is not DCAT-AP_IT compliant) and the source is harvested via RDF (which presumes DCAT-AP_IT compliance), the behaviour is strange. The harvester seems to assign a rightsHolder anyway, namely the organization that owns the source catalogue. In some cases this might be correct; in others it is not.

Example:

The dataset http://dati.toscana.it/dataset/d478654b-9d1d-4139-8a9e-036e02fdd4a1.rdf does not have any rightsHolder property (not even one set to N/A). However, when we harvest Regione Toscana via RDF, "Regione Toscana" is materialized in the rightsHolder property. This is not correct: in the source catalogue that dataset belongs to Comune di Firenze, not to Regione Toscana.
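To reproduce the check on a source record, here is a minimal sketch that tests whether an RDF/XML document contains a `dct:rightsHolder` element. It uses only the Python standard library and inspects element tags, so it assumes an RDF/XML serialization; the function name `has_rights_holder` and the inline sample (with an `example.org` URI) are hypothetical and are not taken from the Toscana catalogue or from ckanext-dcatapit.

```python
import xml.etree.ElementTree as ET

# Qualified tag for dct:rightsHolder (Dublin Core Terms namespace).
DCT_RIGHTS_HOLDER = "{http://purl.org/dc/terms/}rightsHolder"


def has_rights_holder(rdf_xml: str) -> bool:
    """Return True if any element in the RDF/XML text is dct:rightsHolder."""
    root = ET.fromstring(rdf_xml)
    return any(el.tag == DCT_RIGHTS_HOLDER for el in root.iter())


# Hypothetical sample record, made up for illustration only.
SAMPLE_WITHOUT_HOLDER = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcat="http://www.w3.org/ns/dcat#"
    xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="http://example.org/dataset/1">
    <dct:title>Example dataset</dct:title>
  </dcat:Dataset>
</rdf:RDF>"""

# Same record with a rightsHolder element added.
SAMPLE_WITH_HOLDER = SAMPLE_WITHOUT_HOLDER.replace(
    "</dcat:Dataset>",
    '<dct:rightsHolder rdf:resource="http://example.org/org/1"/></dcat:Dataset>',
)
```

Calling `has_rights_holder` on the downloaded `.rdf` text of a dataset would show whether the property is really missing at the source before harvesting.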

The expected behaviour is that, since the source is not DCAT-AP_IT compliant (a mandatory property is missing), the harvester raises an exception and discards the dataset.

If the harvesting is not RDF but CKAN for DCAT-AP_IT (meaning the source is not DCAT-AP_IT compliant), then no exception should be raised and a rightsHolder property with the value N/A should be created.
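The two expected behaviours can be sketched as a single decision function. This is only an illustration of the rule requested in this issue, not the extension's actual code: the names `resolve_rights_holder`, `MissingRightsHolderError`, the `rights_holder` dictionary key, and the `"rdf"` / `"ckan"` harvest type labels are all assumptions.

```python
class MissingRightsHolderError(ValueError):
    """Raised when an RDF-harvested dataset lacks the mandatory rightsHolder."""


NOT_AVAILABLE = "N/A"


def resolve_rights_holder(dataset: dict, harvest_type: str) -> str:
    """Return the rightsHolder to store for a harvested dataset.

    `dataset` is a plain dict with an optional "rights_holder" key
    (hypothetical field name); `harvest_type` is "rdf" or "ckan".
    """
    holder = dataset.get("rights_holder")
    if holder:
        # The source declares a rightsHolder: keep it unchanged.
        return holder
    if harvest_type == "rdf":
        # RDF sources are expected to be DCAT-AP_IT compliant, so a missing
        # mandatory property is an error; the caller should discard the dataset.
        raise MissingRightsHolderError("rightsHolder missing in RDF source")
    if harvest_type == "ckan":
        # CKAN sources are not expected to be compliant: fall back to N/A
        # instead of guessing the catalogue owner.
        return NOT_AVAILABLE
    raise ValueError(f"unknown harvest type: {harvest_type!r}")
```

The point of the sketch is that the organization owning the source catalogue is never used as a fallback value in either branch.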