stream-project / WP4R

Work Package 4 (Lead: Infai)

Requirements for DCAT endpoints (to fulfill CKAN-DCAT harvester) #24

Closed TBoonX closed 3 years ago

simeonackermann commented 3 years ago

This issue is about what the CKAN DCAT harvester extension requires from other sources in order to run its sync jobs. So the title should maybe be changed to something like "Requirements for DCAT endpoints (to fulfill CKAN-DCAT harvester)".

simeonackermann commented 3 years ago

OK, I collected some requirements; they are also documented in /ckan/ckanext-dcat.

Catalog endpoint

https://{host}/catalog.{format}?[page={page}]&[modified_since={date}]

Pagination using Hydra. Example:

@prefix hydra: <http://www.w3.org/ns/hydra/core#> .

<http://example.com/catalog.ttl?page=1> a hydra:PagedCollection ;
    hydra:firstPage "http://example.com/catalog.ttl?page=1" ;
    hydra:itemsPerPage 100 ;
    hydra:lastPage "http://example.com/catalog.ttl?page=3" ;
    hydra:nextPage "http://example.com/catalog.ttl?page=2" ;
    hydra:totalItems 283 .
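
To make the pagination requirement concrete, here is a rough sketch of how a client could walk the pages by following hydra:nextPage until it is missing. The names `next_page_url` and `iter_pages` and the injected `fetch` callable are made up for illustration and are not taken from ckanext-dcat; the regex only handles the literal form shown in the example above, and a real harvester would more likely use a proper RDF parser.

```python
import re
from typing import Callable, Iterator, Optional

# Matches the hydra:nextPage literal as written in the Turtle example.
# Assumption: pages are Turtle and use the "hydra:" prefix; a real client
# would parse the RDF instead of using a regex.
NEXT_PAGE_RE = re.compile(r'hydra:nextPage\s+"([^"]+)"')


def next_page_url(turtle: str) -> Optional[str]:
    """Return the hydra:nextPage URL of a page, or None if there is none."""
    match = NEXT_PAGE_RE.search(turtle)
    return match.group(1) if match else None


def iter_pages(first_url: str, fetch: Callable[[str], str]) -> Iterator[str]:
    """Yield each page body, following hydra:nextPage until it is absent."""
    url: Optional[str] = first_url
    seen = set()
    while url and url not in seen:
        seen.add(url)  # guard against a server that links pages in a loop
        body = fetch(url)
        yield body
        url = next_page_url(body)
```

`fetch` is passed in so the loop can be tried without a network connection; in practice it would be a small wrapper around an HTTP GET.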

The modified_since parameter should be passed as an ISO 8601 date, e.g. http://example.com/catalog.ttl?modified_since=2020-10-04
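
As a sketch of how the catalog URL template and its two optional parameters fit together, something like the following could build the request URL. The function name `catalog_url` is hypothetical; only the `page` and `modified_since` parameter names and the URL pattern come from the requirements above.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def catalog_url(host: str, fmt: str, page: Optional[int] = None,
                modified_since: Optional[date] = None) -> str:
    """Build https://{host}/catalog.{format} with the optional query parameters."""
    params = {}
    if page is not None:
        params["page"] = page
    if modified_since is not None:
        # date.isoformat() gives the ISO 8601 form YYYY-MM-DD
        params["modified_since"] = modified_since.isoformat()
    base = f"https://{host}/catalog.{fmt}"
    query = urlencode(params)
    return f"{base}?{query}" if query else base
```

For example, `catalog_url("example.com", "ttl", modified_since=date(2020, 10, 4))` gives `https://example.com/catalog.ttl?modified_since=2020-10-04`.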

Optionally, CKAN supports profiles (EURO_DCAT, ..), queries, and a filter query. I'm not sure whether we also need them. Maybe it's a secondary requirement.

Examples:

http://example.com/catalog.xml?q=budget
http://example.com/catalog.xml?fq=tags:economy

Dataset endpoint

https://{host}/dataset/{dataset-id}.{format}

Formats

Extension  Format     Media Type
xml        RDF/XML    application/rdf+xml
ttl        Turtle     text/turtle
n3         Notation3  text/n3
jsonld     JSON-LD    application/ld+json
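
The format table and the dataset URL template could be captured roughly like this. `MEDIA_TYPES` and `dataset_url` are illustrative names, not part of CKAN; the extensions and media types are the ones from the table.

```python
from urllib.parse import quote

# File extension -> media type, as listed in the format table.
MEDIA_TYPES = {
    "xml": "application/rdf+xml",
    "ttl": "text/turtle",
    "n3": "text/n3",
    "jsonld": "application/ld+json",
}


def dataset_url(host: str, dataset_id: str, fmt: str) -> str:
    """Build https://{host}/dataset/{dataset-id}.{format} for a known format."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Percent-encode the id so it stays a single path segment.
    return f"https://{host}/dataset/{quote(dataset_id, safe='')}.{fmt}"
```

An endpoint would then answer a request for `.ttl` with `Content-Type: text/turtle`, and so on for the other rows.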

Content negotiation

Return results according to the HTTP Accept header sent by the client. Example: Accept: text/turtle. A nice feature, but possibly also secondary.
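
If we do want content negotiation, a minimal sketch of picking the serialization from the Accept header could look like this. The function name `negotiate` and the fallback to `xml` for `*/*` or a missing header are my assumptions, not something the harvester prescribes; it also ignores partial wildcards such as `text/*`.

```python
from typing import Optional

# Media type -> file extension, the inverse of the format table.
EXTENSION_BY_MEDIA_TYPE = {
    "application/rdf+xml": "xml",
    "text/turtle": "ttl",
    "text/n3": "n3",
    "application/ld+json": "jsonld",
}


def negotiate(accept: str, default: str = "xml") -> Optional[str]:
    """Return the extension for the best supported media type, or None."""
    if not accept.strip():
        return default  # assumption: no Accept header means "anything"
    candidates = []
    for position, item in enumerate(accept.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # quality defaults to 1 when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if q <= 0:
            continue
        if media_type == "*/*":
            candidates.append((q, position, default))
        elif media_type in EXTENSION_BY_MEDIA_TYPE:
            candidates.append((q, position, EXTENSION_BY_MEDIA_TYPE[media_type]))
    if not candidates:
        return None  # caller could answer 406 Not Acceptable
    # Highest q wins; ties go to whichever was listed first.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates[0][2]
```

With `Accept: text/turtle` this returns `ttl`; an Accept header that lists only unsupported types returns `None`.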