opengeospatial / SELFIE

Second Environmental Linked Feature Interoperability Experiment
https://opengeospatial.github.io/ELFIE

Plan how to use OGC-API features as target (provider-node) to be harvested by a linked data hub-node. #21

Closed dblodgett-usgs closed 5 years ago

dblodgett-usgs commented 5 years ago

1) What information needs to be included? NIR ID, label, etc.
2) What approach is best? JSON-LD included with HTML? What are the important details?
3) What else should be considered?

dblodgett-usgs commented 5 years ago

Adding @alpha-beta-soup and @jvanulde -- maybe @mbucknell has a perspective too?

I'm imagining that a featureCollection would have a property that houses the non-information resource (NIR) URI. The question becomes: is there a specific property name we should recommend, and how should that property name be encoded into an HTML view of a feature so a crawler can readily make sense of how it represents an NIR?

dblodgett-usgs commented 5 years ago

Is using schema:subjectOf appropriate here?

alpha-beta-soup commented 5 years ago

The first time I tried out OGC API was in Quebec, so I'm still digesting how we'd fit into/around it. I will experiment some more and come back.

alpha-beta-soup commented 5 years ago

So are we imagining that an OGC API Collection is equivalent to an MIR, with links to the Collection's Items (i.e. features) as DRs? There are some consequences of that that don't make sense to me.

If we simply need a reference from the MIR to the NIR, isn't that what the @id property is, in JSON-LD parlance? Or perhaps the rdf:about? (I am still not clear on this point.)

> and how should that property name be encoded into an HTML view of a feature so a crawler can readily make sense of how it represents an NIR

The SELFIE decision was to embed a JSON-LD representation in the head of the HTML. This means that a crawler should use the JSON-LD data (as Googlebot already does), even to the point of simply requesting JSON-LD directly with an Accept header (but that is optional). The HTML representation is firstly for humans: our preference for semantic HTML in particular is to assist special robots (screen readers), to boost search engine index position, and perhaps to allow us to benefit from CSS resets.
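As a rough illustration of the crawler side of that decision, here is a minimal sketch using only the Python standard library. It pulls JSON-LD out of `<script type="application/ld+json">` elements in a page, and builds (without sending) a request that asks for JSON-LD directly. The names `JsonLdExtractor`, `extract_jsonld`, `jsonld_request`, and the example page and URI are all hypothetical; nothing here is taken from an existing SELFIE crawler.

```python
import json
from html.parser import HTMLParser
from urllib.request import Request


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def extract_jsonld(html):
    """Return every JSON-LD document embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


def jsonld_request(url):
    """Build (but do not send) a request that asks for JSON-LD directly."""
    return Request(url, headers={"Accept": "application/ld+json"})


# Hypothetical landing page for one feature; the URI is a placeholder.
EXAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org/",
 "@id": "https://example.org/id/sample/1",
 "@type": "Thing",
 "name": "Sample 1"}
</script>
</head><body><h1>Sample 1</h1></body></html>"""

documents = extract_jsonld(EXAMPLE_HTML)
```

Whether a given server honours the `Accept` header is up to that server, so a crawler would probably want to fall back to parsing the HTML head when it does not.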

dblodgett-usgs commented 5 years ago

This isn't quite what we are thinking. The idea would be that an OGC-API Collection of meta-resources would be exposed as a GeoJSON feature collection with feature properties that are more or less the "preview" content from the main JSON-LD meta-resource. For example, http://selfie.example/collections/mr-collection/items would return something like:

{
  "@context": [
    "https://opengeospatial.github.io/ELFIE/json-ld/elf.jsonld",
    "https://geojson.org/geojson-ld/geojson-context.jsonld"
  ],
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          177.399809636,
          -17.9343692308
        ]
      },
      "properties": {
        "fid": "55-6f0c24e4-f22e-4c85-918f-b3dbb068080c",
        "id": "https://lab.scinfo.org.nz/soil/id/sosa/sample/55-6f0c24e4-f22e-4c85-918f-b3dbb068080c",
        "date": "1980-07-24 00:00:00",
        "name": "SB09690",
        "type": "sosa:Sample"
      }
    }
  ]
}

Where the @id is the NIR URI rather than the MR URL. The "id" is a local id and the other properties are the same properties as in the JSON-LD. Note that this is not JSON-LD, but for consistency it seems like we should use the same JSON keys in the GeoJSON properties. It is possible that, with JSON-LD 1.1, this could all be valid JSON-LD and even valid GeoJSON.
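To make the hub-node side concrete, here is a minimal sketch of how a harvester might pull the NIR URI and preview properties out of such a feature collection. The function name `harvest_nirs` and the record layout are hypothetical. The property key that carries the NIR URI is passed in rather than hard-coded, because this thread has not settled on a name; the sample data is the single feature from the example response.

```python
def harvest_nirs(feature_collection, nir_key):
    """Return one record per feature that carries a value under nir_key.

    nir_key is the name of the property assumed to hold the NIR URI.
    Features without it are skipped rather than treated as errors.
    """
    records = []
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}
        nir = properties.get(nir_key)
        if not nir:
            continue
        records.append(
            {
                "nir": nir,
                "label": properties.get("name"),
                "type": properties.get("type"),
                "geometry": feature.get("geometry"),
            }
        )
    return records


# The single feature from the example response, with @context omitted.
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [177.399809636, -17.9343692308],
            },
            "properties": {
                "fid": "55-6f0c24e4-f22e-4c85-918f-b3dbb068080c",
                "id": "https://lab.scinfo.org.nz/soil/id/sosa/sample/55-6f0c24e4-f22e-4c85-918f-b3dbb068080c",
                "date": "1980-07-24 00:00:00",
                "name": "SB09690",
                "type": "sosa:Sample",
            },
        }
    ],
}

# In the sample as written, the URI sits under the "id" property.
records = harvest_nirs(SAMPLE, nir_key="id")
```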

dblodgett-usgs commented 5 years ago

In fact, here it is in the JSON-LD Playground.

It's valid GeoJSON too! (Paste it in here: http://geojsonlint.com/)

This looks like a pretty solid way to represent our MRs with OGC-API. I'm going to call this issue done and we'll follow up with how to test this implementation. Note that @abhritchie has something close here: https://lab.scinfo.org.nz/soil/dataset/fiji

alpha-beta-soup commented 5 years ago

I'm just working with the pygeoapi sample datasets to make sure I'm being as consistent as possible with what already exists there. How does this look for a single item in a collection @dblodgett-usgs ? (In particular the id, which I interpret as a "local ID": it could equally just be 371 and not a link at all.) I don't really know what the best way is to shoehorn an NIR @id into this. I'm also not sure how best to handle links back to the collection of which this item is a member.

{
  "@context": [
    "https://geojson.org/geojson-ld/geojson-context.jsonld",
    "https://opengeospatial.github.io/ELFIE/json-ld/elf.jsonld",
    {
      "vocab": "https://domain.com/vocab#",
      "datetime": "https://pending.schema.org/observationDate",
      "stn_id": "vocab:stn_id",
      "value": "vocab:value"
    }
  ],
  "type": "Feature",
  "id": "http://172.19.0.2:5000/collections/obs/items/371",
  "geometry": {
    "type": "Point",
    "coordinates": [
      -75,
      45
    ]
  },
  "properties": {
    "stn_id": "35",
    "datetime": "2001-10-30T14:24:55Z",
    "value": "89.9"
  }
}

The nice thing about the @context in my pygeoapi fork is that it is driven mostly by pygeoapi native config, e.g.:

datasets:
    obs:
        title: Observations
        description: Observations
        keywords:
            - observations
            - monitoring
        crs:
            - CRS84
        ### Look down
        context: # <---- look here, this is additional
            - https://opengeospatial.github.io/ELFIE/json-ld/elf.jsonld
            - vocab: https://domain.com/vocab#
              datetime: https://pending.schema.org/observationDate
              stn_id: "vocab:stn_id"
              value: "vocab:value"
        ### Look up
        links:
            - type: text/csv
              rel: canonical
              title: data
              href: https://github.com/mapserver/mapserver/blob/branch-7-0/msautotest/wxs/data/obs.csv
              hreflang: en-US
            - type: text/csv
              rel: alternate
              title: data
              href: https://raw.githubusercontent.com/mapserver/mapserver/branch-7-0/msautotest/wxs/data/obs.csv
              hreflang: en-US
        extents:
            spatial:
                bbox: [-180, -90, 180, 90]
            temporal:
                begin: 2000-10-30T18:24:39Z
                end: 2007-10-30T08:57:29Z
        provider:
            name: CSV
            data: tests/data/obs.csv
            id_field: id
            geometry:
                x_field: long
                y_field: lat

The GeoJSON JSON-LD context is added compulsorily, since this just wraps otherwise valid GeoJSON; all other context is optional.
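The rule described here (GeoJSON context always first, configured entries appended) can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in the pygeoapi fork; `build_context` and `as_jsonld` are hypothetical names, and `CONFIGURED` simply mirrors the `context` list from the config above.

```python
GEOJSON_CONTEXT = "https://geojson.org/geojson-ld/geojson-context.jsonld"


def build_context(configured=None):
    """Return an @context list: the GeoJSON context, then configured entries."""
    context = [GEOJSON_CONTEXT]
    for entry in configured or []:
        # Skip a duplicate of the compulsory context if the config repeats it.
        if entry != GEOJSON_CONTEXT:
            context.append(entry)
    return context


def as_jsonld(feature, configured=None):
    """Wrap an otherwise plain GeoJSON feature with an @context."""
    return {"@context": build_context(configured), **feature}


# Mirrors the `context` entries of the `obs` dataset config.
CONFIGURED = [
    "https://opengeospatial.github.io/ELFIE/json-ld/elf.jsonld",
    {
        "vocab": "https://domain.com/vocab#",
        "datetime": "https://pending.schema.org/observationDate",
        "stn_id": "vocab:stn_id",
        "value": "vocab:value",
    },
]

feature = {
    "type": "Feature",
    "id": "http://172.19.0.2:5000/collections/obs/items/371",
    "geometry": {"type": "Point", "coordinates": [-75, 45]},
    "properties": {"stn_id": "35", "datetime": "2001-10-30T14:24:55Z", "value": "89.9"},
}

document = as_jsonld(feature, CONFIGURED)
```

With no configured context at all, the result is still valid GeoJSON with a single-entry `@context`.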

alpha-beta-soup commented 5 years ago

Also: for a collection I will need to support pagination, if I am to achieve my goal of being perfectly consistent with existing pygeoapi/OGC Features practice. Have we considered that in S/ELFIE at all?
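For reference, a harvester can cope with pagination by following the `next` link that an OGC API Features server includes in the `links` array of an items response when more items are available. The sketch below assumes the server does advertise such a link. `iter_features` and `fetch_json` are hypothetical names, and the two-page collection is an in-memory stand-in with made-up URLs, so nothing is fetched over the network.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url):
    """Fetch one page as GeoJSON (network access; not exercised below)."""
    request = Request(url, headers={"Accept": "application/geo+json"})
    with urlopen(request) as response:
        return json.load(response)


def iter_features(first_url, fetch=fetch_json, max_pages=1000):
    """Yield every feature in a collection by following rel="next" links."""
    url, seen = first_url, set()
    # Stop when there is no next link, on a link loop, or at the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch(url)
        yield from page.get("features", [])
        url = next(
            (
                link.get("href")
                for link in page.get("links", [])
                if link.get("rel") == "next"
            ),
            None,
        )


# In-memory stand-in for a two-page collection; both URLs are hypothetical.
FIRST = "http://selfie.example/collections/mr-collection/items"
PAGES = {
    FIRST: {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "id": "1"}, {"type": "Feature", "id": "2"}],
        "links": [{"rel": "next", "href": FIRST + "?page=2"}],
    },
    FIRST + "?page=2": {
        "type": "FeatureCollection",
        "features": [{"type": "Feature", "id": "3"}],
        "links": [{"rel": "self", "href": FIRST + "?page=2"}],
    },
}

harvested_ids = [f["id"] for f in iter_features(FIRST, fetch=PAGES.__getitem__)]
```

Treating the `next` URL as opaque keeps the harvester independent of whatever paging parameter a particular server uses.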

alpha-beta-soup commented 5 years ago

I have a PR in to pygeoapi that is relevant here. https://github.com/geopython/pygeoapi/pull/246

It makes it possible to achieve the structure mentioned in https://github.com/opengeospatial/SELFIE/issues/21#issuecomment-513029784, provided the source of the data itself (whether plain ol' CSV, PostgreSQL, etc.) returns an id property that is actually an NIR. (It will of course just defer that, not enforce "correctness").

So if anyone's interested in trying out that fork for testing, and/or contributing to that PR (thumbs-up emojis count), that would be good.

I'm considering now how to achieve some more SELFIE goals with pygeoapi.