implement schema.org in responses

tomkralidis commented 6 years ago

pvgenuchten commented 5 years ago

schema.org is most relevant to the HTML encoding of collections and features, but it can also be added to a JSON representation as a JSON-LD context. Note, however, that JSON-LD is currently incompatible with GeoJSON. To support JSON-LD, a separate response encoding, application/ld+json, should be added.

Various vocabularies exist for annotating geometry; at the final geo4web conference it was suggested to simply add all of them. A selection can be made from GeoJSON-LD, GeoSPARQL and https://schema.org/geo. In recent years the schema.org/geo vocabulary has been limited, but it will probably improve.

There are generally two ways to add schema.org annotations: embed a JSON-LD representation of the content in the page, or add microdata to each of the HTML tags. The first approach seems the most sensible, as a JSON representation of the object is already available in the software.
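The first option (embedding JSON-LD in the page) can be sketched roughly as follows. This is only an illustration, not pygeoapi code: the function name jsonld_script_tag and the sample feature are made up for the example, and the schema.org types used (Place, GeoCoordinates) are just one possible choice.

```python
import json


def jsonld_script_tag(doc: dict) -> str:
    """Serialize a JSON-LD document and wrap it in a script element
    that can be placed in the HTML head (hypothetical helper)."""
    payload = json.dumps(doc, indent=2)
    # Escape "</" so that a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the payload still parses.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Made-up example object; a real server would reuse the JSON it already has.
feature = {
    "@context": "https://schema.org/",
    "@type": "Place",
    "name": "Example feature",
    "geo": {"@type": "GeoCoordinates", "latitude": 52.1, "longitude": 5.2},
}

print(jsonld_script_tag(feature))
```

The microdata option would instead require touching every HTML template, which is why the embedded variant is usually less work when the JSON is already at hand.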

For each collection, a configuration option should be added (as is available in ldproxy) to define the type of the objects in the collection (what type of schema.org/Thing, or of any other vocabulary, does it contain). An alternative starting point could be the vocabularies defined for INSPIRE. Then each Feature property should be mapped to a schema.org property of the Thing. This configuration can then be used to generate the JSON-LD context.
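As a rough sketch of what such a per-collection configuration could drive: the keys item_type and property_mapping, and both function names, are hypothetical and not taken from pygeoapi or ldproxy. The idea is only that a type plus a property-to-term mapping is enough to generate a context.

```python
def build_context(collection_cfg: dict) -> dict:
    """Build a JSON-LD @context from a (hypothetical) collection config."""
    context = {"schema": "https://schema.org/"}
    # Each feature property is mapped to a vocabulary term, e.g. schema:name.
    for prop, term in collection_cfg.get("property_mapping", {}).items():
        context[prop] = term
    return context


def feature_to_jsonld(feature: dict, collection_cfg: dict) -> dict:
    """Annotate a GeoJSON-like feature using the configured type and mapping."""
    mapping = collection_cfg.get("property_mapping", {})
    doc = {
        "@context": build_context(collection_cfg),
        "@type": collection_cfg["item_type"],
    }
    # Only mapped properties are copied; unmapped ones have no term in the
    # context and would be ignored by a JSON-LD processor anyway.
    for key, value in feature.get("properties", {}).items():
        if key in mapping:
            doc[key] = value
    return doc


# Hypothetical configuration for one collection.
lakes_cfg = {
    "item_type": "schema:Place",
    "property_mapping": {"name": "schema:name", "url": "schema:url"},
}

lake = {"type": "Feature", "properties": {"name": "Lake A", "depth": 12}}
print(feature_to_jsonld(lake, lakes_cfg))
```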

A good place to get started with structured data in HTML pages is https://search.google.com/structured-data/testing-tool

pvgenuchten commented 5 years ago

For the API and collections pages, consider defining the parent (the API) as a schema.org/DataCatalog and each of the collections as a schema.org/Dataset; the API will then automatically show up in https://toolbox.google.com/datasetsearch

DataCatalog
- name: Api-title
- description: Api description
- author: Api contact
- dataset
  - name: collection title
  - description: collection description
  - distribution: dataDownload
    - contentUrl: features/collections/things/items?f=gml
    - encodingFormat: application/gml+xml
    - ...
- dataset
  - name: collection title
  - description: collection description
  - distribution: dataDownload
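The outline above could be produced from the API metadata along these lines. This is a sketch under assumptions: the shape of the api dictionary and the function names are invented for illustration, the author is modelled as an Organization, and the single GML distribution simply mirrors the example in the outline.

```python
import json


def collection_to_dataset(base_url: str, coll_id: str, coll: dict) -> dict:
    """Describe one collection as a schema.org Dataset (hypothetical helper)."""
    return {
        "@type": "Dataset",
        "name": coll["title"],
        "description": coll["description"],
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": f"{base_url}/collections/{coll_id}/items?f=gml",
                "encodingFormat": "application/gml+xml",
            }
        ],
    }


def api_to_datacatalog(base_url: str, api: dict) -> dict:
    """Describe the API as a schema.org DataCatalog listing its collections."""
    return {
        "@context": "https://schema.org/",
        "@type": "DataCatalog",
        "name": api["title"],
        "description": api["description"],
        "author": {"@type": "Organization", "name": api["contact"]},
        "dataset": [
            collection_to_dataset(base_url, coll_id, coll)
            for coll_id, coll in api["collections"].items()
        ],
    }


# Made-up metadata; a real server would read this from its configuration.
api = {
    "title": "Api-title",
    "description": "Api description",
    "contact": "Api contact",
    "collections": {
        "things": {"title": "collection title", "description": "collection description"}
    },
}

print(json.dumps(api_to_datacatalog("https://example.org/features", api), indent=2))
```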

alpha-beta-soup commented 5 years ago

I'm working on some of these ideas at https://github.com/spatialdaotearoa/pygeoapi/tree/ld-json as part of SELFIE. At present I'm just creating JSON-LD representations (dynamically inserted into the HTML head if ...?f=jsonld returns a 200 status) that are an (almost) 1:1 match for the microdata introduced in #91. If there were an acceptable JSON-LD alternative to the microdata, would pygeoapi support both or only one?

@pvgenuchten I'm intrigued by

"For each collection a configuration option should be added (like available in ldproxy) to define definitions for the objects in the collection (what type of schema.org/Thing (or any other vocab) does it contain). An alternative starting point could be the vocabularies defined for INSPIRE. Then for each Feature property a mapping should be made to a schema.org property of the Thing. This configuration can then be used to generate the ld-context."

and was wondering if you'd expand on this from an implementation perspective?

SELFIE has made some recommendations for a minimum set of vocabularies, and may have different opinions about DataCatalog and so on, but it is striving to favour schema.org as much as possible. I think attention paid to good structured data representations by default will pay off with much better SEO and interoperability for any pygeoapi instance.

alpha-beta-soup commented 4 years ago

Implemented a lot of this in #246. In particular:

- Uses schema.org for the structured (meta)data about the pygeoapi instance itself, and the datasets it exposes. Uses schema:Dataset and schema:DataCatalog to get into the Google dataset search by default.
- For features and feature collections, does the minimal thing using the GeoJSON vocabulary as a mandatory default vocabulary. Allows for optional specification of additional @context via the YAML configuration of each collection. In #246 there is a more complicated Postgres example, including the possibility of linking across collections at the feature level.
- JSON-LD representations are embedded into each page (via a second request once the HTML is loaded; injected with a small JavaScript function).
- Microdata (itemprops) retained in HTML templates.
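To illustrate the "mandatory default vocabulary plus optional extra @context" idea in isolation, here is a minimal sketch. It is not the code from #246: the function name with_context is made up, the extra context stands in for whatever a collection's YAML configuration would supply, and the context URL is assumed to be the published GeoJSON-LD context.

```python
from typing import Optional

# Assumed location of the published GeoJSON-LD context.
GEOJSON_CONTEXT = "https://geojson.org/geojson-ld/geojson-context.jsonld"


def with_context(geojson_doc: dict, extra_context: Optional[list] = None) -> dict:
    """Return a copy of a GeoJSON document with an @context prepended.

    The GeoJSON vocabulary always comes first; any additional contexts
    configured for the collection are appended after it.
    """
    context = [GEOJSON_CONTEXT]
    context.extend(extra_context or [])
    return {"@context": context, **geojson_doc}


collection = {"type": "FeatureCollection", "features": []}
# Hypothetical extra context, as it might be read from a collection's config.
extra = [{"schema": "https://schema.org/", "name": "schema:name"}]

print(with_context(collection, extra))
```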

There's more that could be done, e.g. I haven't considered describing the type of each Thing:

@pvgenuchten: "For each collection a configuration option should be added (like available in ldproxy) to define definitions for the objects in the collection (what type of schema.org/Thing (or any other vocab) does it contain). An alternative starting point could be the vocabularies defined for INSPIRE. Then for each Feature property a mapping should be made to a schema.org property of the Thing. This configuration can then be used to generate the ld-context."

My motivation was really the minimum viable inclusion of JSON-LD, and I'm happy to contribute more.

geopython / pygeoapi

implement schema.org in responses #33