geonetwork / geonetwork-microservices

GNU General Public License v2.0

OGC API Record / Output formats #29

Closed: fxprunayre closed this issue 3 years ago

fxprunayre commented 3 years ago

The main goal is to add new "simple" output formats to OGC API Records that can be built from the index document. Instead of creating mappings for all formats for all plugins (as GeoNetwork currently does), here we design flows to build XML- or JSON-based formats from the index document.

Output formats configuration

The OGC API Records configuration allows formats to be declared.

See application.yml to configure format properties:

      -
        name: schema.org                                 # Short name used for the f URL parameter
        mimeType: application/ld+json                    # MIME type used for the Accept header
        responseProcessor: JsonLdResponseProcessorImpl   # Response processor used for the items search and item pages
        operations:                                      # Scope of the format
          - items
          - item
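
As an illustration of how such a declaration could drive content negotiation, here is a minimal Python sketch (not the project's Java code). The `FORMATS` list mirrors the YAML entry above; the second `json` entry, the `resolve_format` function, and the rule that `f` takes precedence over the `Accept` header are assumptions made for the example.

```python
from typing import Optional

# Hypothetical in-memory mirror of the format entries declared in application.yml.
FORMATS = [
    {"name": "schema.org", "mimeType": "application/ld+json",
     "responseProcessor": "JsonLdResponseProcessorImpl",
     "operations": ["items", "item"]},
    # Hypothetical second entry, only here to make the lookup non-trivial.
    {"name": "json", "mimeType": "application/json",
     "responseProcessor": "JsonResponseProcessorImpl",
     "operations": ["root", "collections", "items", "item"]},
]


def resolve_format(operation: str, f: Optional[str] = None,
                   accept: Optional[str] = None) -> Optional[dict]:
    """Return the declared format matching the f parameter or the Accept header."""
    # Only formats whose scope covers the requested operation are candidates.
    candidates = [fmt for fmt in FORMATS if operation in fmt["operations"]]
    if f is not None:
        # Assumption: an explicit f parameter wins over the Accept header.
        return next((fmt for fmt in candidates if fmt["name"] == f), None)
    if accept:
        # Simplification: q-values are ignored, client order is kept.
        for part in accept.split(","):
            media_type = part.split(";")[0].strip()
            for fmt in candidates:
                if fmt["mimeType"] == media_type:
                    return fmt
    return None
```

With this sketch, `resolve_format("root", f="schema.org")` returns `None` because the format is only scoped to `items` and `item`.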

HTML pages provide a format switcher:

image

Formats considered

More work later on:

RSS

JAXB models are generated from XSD using xjc, then converters provide the mapping from an IndexRecord (JSON from Elasticsearch) to other object types.
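
The converter idea can be sketched in Python with the standard library's ElementTree standing in for the generated JAXB classes. The index field names (`uuid`, `resourceTitleObject`, `resourceAbstractObject`, `changeDate`), the link pattern, and the function name are assumptions for illustration, not the actual converter.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def index_record_to_rss_item(record: dict, collection_url: str) -> ET.Element:
    """Map one index document (a dict) to an RSS 2.0 <item> element."""
    item = ET.Element("item")
    # Assumed index layout: multilingual fields carry a "default" value.
    title = record.get("resourceTitleObject", {}).get("default", "")
    abstract = record.get("resourceAbstractObject", {}).get("default", "")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = f"{collection_url}/items/{record['uuid']}"
    guid = ET.SubElement(item, "guid", isPermaLink="false")
    guid.text = record["uuid"]
    ET.SubElement(item, "description").text = abstract
    # RSS expects RFC 822 dates; changeDate is assumed to be ISO 8601 with an offset.
    changed = datetime.fromisoformat(record["changeDate"]).astimezone(timezone.utc)
    ET.SubElement(item, "pubDate").text = format_datetime(changed, usegmt=True)
    return item


# Usage with a made-up record.
sample = {
    "uuid": "abc-123",
    "resourceTitleObject": {"default": "Sample dataset"},
    "resourceAbstractObject": {"default": "A short abstract."},
    "changeDate": "2021-03-01T10:15:00+00:00",
}
rss_item = index_record_to_rss_item(sample, "http://localhost:9901/collections/main")
print(ET.tostring(rss_item, encoding="unicode"))
```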

RSS (and other formats) are advertised for each collection in the OpenSearchDescription: http://localhost:9901/collections/main?f=opensearch

image

By default, the RSS feed is sorted by record change date unless you define a custom sort parameter (it may make more sense to sort by dataset publication date by default?).
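
The default-versus-custom sort behaviour could look like the following Python sketch, which builds an Elasticsearch-style sort clause. The `sortby` syntax with a leading `-` for descending order, the `changeDate` field name, and the `build_sort` function are assumptions for the example.

```python
from typing import Optional

# Assumed default: most recently changed records first.
DEFAULT_SORT = [{"changeDate": {"order": "desc"}}]


def build_sort(sortby: Optional[str] = None) -> list:
    """Translate a comma-separated sortby value into Elasticsearch sort clauses."""
    if not sortby:
        return list(DEFAULT_SORT)
    clauses = []
    for token in sortby.split(","):
        token = token.strip()
        if not token:
            continue
        # A leading "-" means descending; "+" or no prefix means ascending.
        order = "desc" if token.startswith("-") else "asc"
        field = token.lstrip("+-")
        clauses.append({field: {"order": order}})
    # Fall back to the default when the parameter held no usable field.
    return clauses or list(DEFAULT_SORT)
```

Switching the default to publication date would then only mean changing `DEFAULT_SORT`.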

RSS items are described by:

Unlike in GeoNetwork 3, there is no GeoRSS information and there are no details about a record's online resources.

image

JSON LD

The JSON-LD format is based on https://schema.org/Dataset

Output is tested with https://search.google.com/test/rich-results

Search results are exposed as a DataFeed

image

And a metadata record as a Dataset

image
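
To make the DataFeed and Dataset shapes concrete, here is a minimal Python sketch that builds both from index documents. The index field names, the `@id` URL pattern, the choice of wrapping each record in a `DataFeedItem`, and the function names are assumptions; the actual output may carry more properties.

```python
import json


def to_dataset(record: dict, collection_url: str) -> dict:
    """Map one index document to a schema.org Dataset node (without @context)."""
    return {
        "@type": "Dataset",
        "@id": f"{collection_url}/items/{record['uuid']}",
        "identifier": record["uuid"],
        # Assumed index layout: multilingual fields carry a "default" value.
        "name": record.get("resourceTitleObject", {}).get("default"),
        "description": record.get("resourceAbstractObject", {}).get("default"),
        "dateModified": record.get("changeDate"),
    }


def to_item_document(record: dict, collection_url: str) -> dict:
    """JSON-LD document for a single record page."""
    return {"@context": "https://schema.org", **to_dataset(record, collection_url)}


def to_data_feed(records: list, collection_url: str) -> dict:
    """JSON-LD document for a search result page."""
    return {
        "@context": "https://schema.org",
        "@type": "DataFeed",
        "dataFeedElement": [
            {"@type": "DataFeedItem", "item": to_dataset(r, collection_url)}
            for r in records
        ],
    }


# Usage with a made-up record.
records = [{
    "uuid": "abc-123",
    "resourceTitleObject": {"default": "Sample dataset"},
    "resourceAbstractObject": {"default": "A short abstract."},
    "changeDate": "2021-03-01T10:15:00+00:00",
}]
feed = to_data_feed(records, "http://localhost:9901/collections/main")
print(json.dumps(feed, indent=2))
```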

JSON-LD format improvements that can be made in future tasks (related to ongoing work in https://github.com/geonetwork/core-geonetwork/pull/5379):

DCAT

DCAT2 is the output format to use.

Requirements:

Tested:

RDFXML & Turtle

Experimental work was done using RDF4J to convert JSON-LD (schema.org) to RDF.


curl "127.0.0.1:9901/collections/$firstCollection/items/$uuid" \
        -H "Accept: text/turtle"

curl "127.0.0.1:9901/collections/$firstCollection/items/$uuid" \
        -H "Accept: application/rdf+xml"

Some properties are not parsed correctly. More investigation is needed.