mobilityDCAT-AP / mobilityDCAT-AP

Repository of the metadata specification mobilityDCAT-AP
https://w3id.org/mobilitydcat-ap
Creative Commons Attribution 4.0 International

Class Catalogue Record: new properties with information about harvested datasets #57

Open peterlubrich opened 1 month ago

peterlubrich commented 1 month ago

I am proposing two new optional properties for class Catalogue Record.

These are intended for metadata records that are harvested from other portals. Harvesting, i.e., the automated import of metadata from external portals, seems to be a common case for NAPs. For example, the German NAP harvests metadata from the data portal of the German Railway Authority. With these properties, a data user can see whether the metadata was harvested, and from where.

1. mobilitydcatap:originatedFromPortal
   Range: rdfs:Literal
   Usage note: the name of the original portal from which the dataset is harvested

2. mobilitydcatap:originatedFromURI
   Range: rdfs:Resource
   Usage note: the URI of the harvested metadata at the original portal

marioscrock commented 1 month ago

I think this suggestion is relevant for several use cases, and I vote for defining a recommendation in mobilityDCAT-AP on how to encode this information. I would evaluate the possibility of reusing dct:source as a property directly connected to the dcat:Dataset instance. This enables the reuse of all other properties defined for dcat:Resource, e.g., accessURL, downloadURL, etc. Regarding the originating portal, we can use dcat:contactPoint or dct:publisher.

For example:

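@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .   # example namespace, for illustration only

ex:MyDataset a dcat:Dataset ;
    dct:source [
        a dcat:Resource ;
        dcat:accessURL <..:> ;          # URL of the harvested source (elided here)
        dct:publisher "My Portal"       # name of the originating portal
    ] .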

It remains to be checked whether DCAT-AP already provides a recommendation for harvested data sources.

What do you think?

peterlubrich commented 1 week ago

Mario, I agree with re-using an existing property for this purpose instead of inventing new "mobilitydcatap:" properties. This property should be the right fit: "source metadata" (dct:source) under class Catalogue Record. This property was already part of DCAT-AP v2, but we "removed" it in our extension. We would now re-introduce it and use it for the harvesting use case described above.

Implications:

1. We propose to add "source metadata" as an optional property for class Catalogue Record in mobilityDCAT-AP v1.1. Proposed usage note: "This property is used for metadata records that are harvested from other portals. Harvesting, i.e., the automated import of metadata from external portals, seems to be a common case for some NAPs. With this property, a data user can see whether the metadata was harvested, and from where. The property SHOULD link to a public URL of the dataset description on the original data portal from which the metadata was harvested." Example:

  <dcat:CatalogRecord rdf:about="https://mobilithek.info/offers/-1341139744908285765#catalogueRecord">
        <dct:source rdf:resource="https://registry.gdi-de.org/id/de.bund.eba/1bae174f-98ae-4c29-9410-17f7c73f4fcb"/>
  </dcat:CatalogRecord>

2. I withdraw my proposal for the new "mobilitydcatap:" properties (nos. 1 and 2 above).
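To illustrate how a data user could check whether a record was harvested, and from where, here is a minimal sketch using only the Python standard library. It assumes the RDF/XML shape of the dct:source example above, wrapped in an rdf:RDF root with namespace declarations; the function name `harvest_sources`, the `SAMPLE` document, and the second record URI under example.org are hypothetical and only serve the illustration. A full RDF parser would be the more robust choice for real catalogue data, since RDF/XML allows several equivalent serialisations.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, DCAT and Dublin Core Terms.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical sample: one harvested record (taken from the example above)
# and one locally created record without dct:source.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcat="http://www.w3.org/ns/dcat#"
    xmlns:dct="http://purl.org/dc/terms/">
  <dcat:CatalogRecord rdf:about="https://mobilithek.info/offers/-1341139744908285765#catalogueRecord">
    <dct:source rdf:resource="https://registry.gdi-de.org/id/de.bund.eba/1bae174f-98ae-4c29-9410-17f7c73f4fcb"/>
  </dcat:CatalogRecord>
  <dcat:CatalogRecord rdf:about="https://example.org/records/local#catalogueRecord"/>
</rdf:RDF>"""


def harvest_sources(rdf_xml):
    """Map each catalogue record URI to its dct:source URI, or None if absent."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for record in root.findall("dcat:CatalogRecord", NS):
        source = record.find("dct:source", NS)
        # Compare against None explicitly: an empty Element can be falsy.
        result[record.get(RDF_ABOUT)] = (
            None if source is None else source.get(RDF_RESOURCE)
        )
    return result


if __name__ == "__main__":
    for record_uri, source_uri in harvest_sources(SAMPLE).items():
        status = "harvested from " + source_uri if source_uri else "not harvested"
        print(record_uri, "->", status)
```

A record with a dct:source value would be treated as harvested, and the value points to the metadata at the original portal; a record without it is treated as created locally.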