Closed ksonda closed 3 years ago
@webb-ben, can you please move your modifications to a fork of geopython/pygeoapi rather than of internetofwater/geoconnex.us ? This will make the PR easier
DONE
Below is a table of id/uri specification scenarios by implementation, comparing the pygeoapi master branch, the geoconnex branch, and @webb-ben 's updated geoconnex branch by html and json-ld representations and the flattened result of the jsonld in jsonld playground.
Some outstanding issues:
pygeoapi core team seems to be all-in on using the geojsonld remote context rather than inlining it, restricting structured data tests to json-ld playground. oh well
Do we want semantic properties to show up in a flattened json-ld as elements of a geojson properties blank node or be related directly to the @id? If directly related to the @id, then whatever @id is must be duplicated INSIDE the properties block with an element named 'id'.
given the above, what should happen to any properties that happen to be named 'id' that are not the desired URI, whether global or the API call?
what's the desired geometry behavior? Master is still doing the blank node with geometry approach. The geoconnex branch drops geometry unless it's a point. @webb-ben currently just drops all geometry, emulating the geoconnex branch.
@dblodgett-usgs and I should write up the desired behavior we want, justify it with our use case, and submit it to pygeoapi issue.
I think what we want is for there to be a boolean configuration option under resource: called geojsonld: whose default value is true. When it is false, we want the json-ld representation to behave in the following manner:
1. context: configuration. Or perhaps the default context in this situation is {"id":"@id"}, since it does seem to be a core pygeoapi thing to create this object id which is ported downstream to all the html templates and stuff.
2. No properties block. All attributes should be at the same level as @id.
3. uri_field, when specified, becomes id.
4. If uri_field is specified, but there happens to be some attribute named id which is not the intended URI, what happens?
4.a Even if uri_field is specified, id_field is still specified, which currently becomes id. What if, say, we want id_field to be the attribute station_id, but there is also some attribute named id?
5. any other changes we want?
What do we think of auto generating schema:geo or geosparql/WKT representations of the geometry instead of geojson geometry?
Yep -- this is the desired behavior.
No geometry unless it's point since it breaks jsonLD rules.
I think it's an error condition if there are conflicts with the id attribute.
Current:
{
"@context": [
{
"schema": "https://schema.org/",
"geojson": "https://purl.org/geojson/vocab#",
"Feature": "geojson:Feature",
"FeatureCollection": "geojson:FeatureCollection",
"Point": "geojson:Point",
"bbox": {
"@container": "@list",
"@id": "geojson:bbox"
},
"coordinates": {
"@container": "@list",
"@id": "geojson:coordinates"
},
"features": {
"@container": "@set",
"@id": "geojson:features"
},
"geometry": "geojson:geometry",
"id": "@id",
"properties": "geojson:properties",
"type": "@type"
},
{
"schema": "https://schema.org/name",
"name": "schema:name",
"description": "schema:description",
"subjectOf": {
"@id": "schema:subjectOf",
"@type": "@id"
}
}
],
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [
-85.3199473,
33.22511878
]
},
"properties": {
"fid": 1,
"id": "https://geoconnex.us/ref/gages/1029567",
"uri": "https://geoconnex.us/ref/gages/1029567",
"name": "WEHADKEE CREEK NEAR PITTMAN AL",
"description": "USGS NWIS Stream/River/Lake Site 02339210: WEHADKEE CREEK NEAR PITTMAN AL",
"subjectOf": "https://waterdata.usgs.gov/monitoring-location/02339210",
"provider": "https://waterdata.usgs.gov",
"provider_id": "02339210",
"nhdpv2_REACHCODE": "03130002000353",
"nhdpv2_REACH_measure": 9.594026996679105,
"nhdpv2_COMID": 3291414
},
"id": "https://geoconnex.us/ref/gages/1029567"
}
Desired:
{
"@context": [
{
"schema": "https://schema.org/",
"id": "@id",
"type": "@type",
"name": "schema:name",
"description": "schema:description",
"subjectOf": "schema:subjectOf"
}
],
"type": "Feature",
"fid": 1,
"id": "https://geoconnex.us/ref/gages/1029567",
"uri": "https://geoconnex.us/ref/gages/1029567",
"name": "WEHADKEE CREEK NEAR PITTMAN AL",
"description": "USGS NWIS Stream/River/Lake Site 02339210: WEHADKEE CREEK NEAR PITTMAN AL",
"subjectOf": "https://waterdata.usgs.gov/monitoring-location/02339210",
"provider": "https://waterdata.usgs.gov",
"provider_id": "02339210",
"nhdpv2_REACHCODE": "03130002000353",
"nhdpv2_REACH_measure": 9.594026996679105,
"nhdpv2_COMID": 3291414
}
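For reference, the Current-to-Desired flattening shown above could be sketched roughly as follows. This is only an illustration, not pygeoapi code: `flatten_feature` is a hypothetical helper, geometry is dropped entirely as in the Desired example, and a property named `id` that differs from the feature's URI is treated as an error, as suggested earlier in this thread.

```python
def flatten_feature(feature, context):
    """Lift a GeoJSON feature's properties to the same level as its id.

    Hypothetical helper for illustration; not part of pygeoapi.
    """
    uri = feature["id"]
    props = dict(feature.get("properties", {}))

    # A property named 'id' that is not the intended URI is an error condition.
    if "id" in props and props["id"] != uri:
        raise ValueError(
            f"property 'id' ({props['id']!r}) conflicts with @id {uri!r}"
        )
    props.pop("id", None)  # identical duplicate, safe to drop

    # No 'properties' block and no geometry: everything hangs off the @id.
    flat = {"@context": [context], "type": feature["type"], "id": uri}
    flat.update(props)
    return flat


context = {
    "schema": "https://schema.org/",
    "id": "@id",
    "type": "@type",
    "name": "schema:name",
}
feature = {
    "type": "Feature",
    "id": "https://geoconnex.us/ref/gages/1029567",
    "geometry": {"type": "Point", "coordinates": [-85.3199473, 33.22511878]},
    "properties": {
        "fid": 1,
        "id": "https://geoconnex.us/ref/gages/1029567",
        "name": "WEHADKEE CREEK NEAR PITTMAN AL",
    },
}
flat = flatten_feature(feature, context)
```

Note that a property literally named `type` or `@context` would overwrite the top-level keys in this sketch; a real implementation would need to decide how to handle that too.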
in your "desired" you left out geometry altogether even though this is a point. Intentional?
I dropped it since it uses the geojson context. I'm fine leaving it in, but I'm not really sure it adds THAT much. If we were to add them back, it would be more useful to encode as a schema:geo object?
it would be more useful to encode as a schema:geo object?
agreed or geosparql/wkt? I guess for pygeoapi purposes probably schema:geo
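A minimal sketch of what the schema:geo encoding could look like for point features, assuming GeoJSON's [longitude, latitude] coordinate order. `point_to_schema_geo` is a hypothetical name, not an existing pygeoapi function, and returning nothing for non-point geometries is just one possible choice.

```python
def point_to_schema_geo(geometry):
    """Encode a GeoJSON Point as a schema:GeoCoordinates object.

    Hypothetical helper; returns None for any non-Point geometry.
    """
    if not geometry or geometry.get("type") != "Point":
        return None

    # GeoJSON positions are ordered [longitude, latitude, (elevation)].
    lon, lat = geometry["coordinates"][:2]
    return {
        "schema:geo": {
            "@type": "schema:GeoCoordinates",
            "schema:latitude": lat,
            "schema:longitude": lon,
        }
    }


geo = point_to_schema_geo(
    {"type": "Point", "coordinates": [-85.3199473, 33.22511878]}
)
```

The result can be merged into the flattened feature alongside the other attributes, so no geojson context is needed for the geometry.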
@dblodgett-usgs I have a working version that creates your desired JSON-LD. The issue with using "id": "@id" is that pygeoapi uses 'id' field as its internal reference to the item. Changing the 'id' to make the JSON-LD work results in a wonky html page (either two id fields in the properties block, or the uri as the name of the item).
I'm happy to share more details but it might be a bit too verbose for this thread.
Great! What's the next step then?
@webb-ben , you showed me two scenarios.
(1) where you are routing id that is uri_field to the html templates:
(2) where routing id that is id_field to the html templates
At first glance, (2) I think is closer to what we want
Questions:
id from uri_field:
id up from the bottom and rename it "URI" in the html?
The canonical URL in both situations is https://geoconnex.us/ref/gages/1029567. If uri_field is not specified, it would become http://[HOSTNAME]/collections/gages/items/1029567.
I will look into the order of the entries... I think we would have to use ordered dictionaries. Should be easy enough!
Is there ever a case where geojsonld will be enabled but uri_field is not specified? If they won't always be used together, what should the difference in their behaviors be? If they always will be used together do we need to declare both?
If it's not a huge issue I'd say to let them vary independently.
uri_field affects what will end up as the canonical URL, and what is set as id in the json-ld. geojsonld affects the format of the json-ld and html.
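To make the uri_field part concrete, here is a rough sketch of how the id could be resolved independently of geojsonld. `canonical_id` and its parameters are hypothetical names for illustration; falling back to the API URL when the configured field is empty is an assumption, not confirmed behavior.

```python
def canonical_id(properties, identifier, base_url, collection, uri_field=None):
    """Pick the URI used as the json-ld id and canonical URL.

    Hypothetical helper: use the uri_field attribute when configured,
    otherwise fall back to the item's API URL.
    """
    if uri_field is not None and properties.get(uri_field):
        return properties[uri_field]
    return f"{base_url.rstrip('/')}/collections/{collection}/items/{identifier}"


props = {"uri": "https://geoconnex.us/ref/gages/1029567", "fid": 1}

with_uri = canonical_id(props, 1029567, "http://localhost:5000", "gages", uri_field="uri")
without_uri = canonical_id(props, 1029567, "http://localhost:5000", "gages")
```

With uri_field set, the result is the geoconnex URI; without it, the result is the item URL under the local host.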
uri_field=uri, geojsonld=True
{
"@context": [
{
"schema": "https://schema.org/",
"id": "@id",
"type": "@type",
"name": "schema:name",
"description": "schema:description",
"subjectOf": "schema:subjectOf"
}
],
"type": "Feature",
"pygeoapi_id": 1029567,
"fid": 1,
"uri": "https://geoconnex.us/ref/gages/1029567",
"name": "WEHADKEE CREEK NEAR PITTMAN AL",
"description": "USGS NWIS Stream/River/Lake Site 02339210: WEHADKEE CREEK NEAR PITTMAN AL",
"subjectOf": "https://waterdata.usgs.gov/monitoring-location/02339210",
"provider": "https://waterdata.usgs.gov",
"provider_id": "02339210",
"nhdpv2_REACHCODE": "03130002000353",
"nhdpv2_REACH_measure": 9.594026996679105,
"nhdpv2_COMID": 3291414,
"id": "https://geoconnex.us/ref/gages/1029567"
}
uri_field=None, geojsonld=True
{
"@context": [
{
"schema": "https://schema.org/",
"id": "@id",
"type": "@type",
"name": "schema:name",
"description": "schema:description",
"subjectOf": "schema:subjectOf"
}
],
"type": "Feature",
"pygeoapi_id": 1029567,
"fid": 1,
"uri": "https://geoconnex.us/ref/gages/1029567",
"name": "WEHADKEE CREEK NEAR PITTMAN AL",
"description": "USGS NWIS Stream/River/Lake Site 02339210: WEHADKEE CREEK NEAR PITTMAN AL",
"subjectOf": "https://waterdata.usgs.gov/monitoring-location/02339210",
"provider": "https://waterdata.usgs.gov",
"provider_id": "02339210",
"nhdpv2_REACHCODE": "03130002000353",
"nhdpv2_REACH_measure": 9.594026996679105,
"nhdpv2_COMID": 3291414,
"id": "http://localhost:5000/collections/gages/items/1029567"
}
uri_field=uri, geojsonld=False
{
"@context": [
"https://geojson.org/geojson-ld/geojson-context.jsonld",
{
"schema": "https://schema.org/",
"id": "@id",
"type": "@type",
"name": "schema:name",
"description": "schema:description",
"subjectOf": "schema:subjectOf"
}
],
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [
-85.3199473,
33.22511878
]
},
"properties": {
"fid": 1,
"uri": "https://geoconnex.us/ref/gages/1029567",
"name": "WEHADKEE CREEK NEAR PITTMAN AL",
"description": "USGS NWIS Stream/River/Lake Site 02339210: WEHADKEE CREEK NEAR PITTMAN AL",
"subjectOf": "https://waterdata.usgs.gov/monitoring-location/02339210",
"provider": "https://waterdata.usgs.gov",
"provider_id": "02339210",
"nhdpv2_REACHCODE": "03130002000353",
"nhdpv2_REACH_measure": 9.594026996679105,
"nhdpv2_COMID": 3291414
},
"id": "https://geoconnex.us/ref/gages/1029567"
}
The links are removed for readability but are included in each JSON-LD response (this is easy to toggle).
right so, like this I think. (context also includes whatever was declared in config.yml)
"schema:geo": { "@type": "schema:GeoCoordinates", "schema:latitude": "33.14734284", "schema:longitude": "-85.2818902" }
Use this command to explore my working solution!
docker run -p 5000:80 -d --rm webbben/pygeoapi
The URIs only work for the /items/ page to allow easier navigation to /items/[item] from the set.
This gets us the functional json-ld we want for all four scenarios. I'm not sure the way schema:geo works here is pretty, with the geojson coordinates staying in an array that hangs out alongside the "lat" and "lon" properties. It lints the way it needs to, though.
@dblodgett-usgs , due to the way pygeoapi has been built since Greg's PR, it looks like it's necessary for the ancillary properties "id_" and "pygeoapi_id" to be in the json-ld when geojsonld: false, to correctly route the desired uri/url to the html templates and the canonical url. This is due to over-reliance throughout the rest of pygeoapi on the geojson id_field: which becomes hard-coded as "id". I don't know if the geopython community will think this is a big deal. I don't think it's a big deal, because that stuff won't be parsed since it has no context, and the geojson representation is unaffected.
We'll see what others think. I'd say let's go ahead and open a PR with this?
mkay. Do you think a better reviewer to request is Tom K or Richard Law?
Probably Tom -- I'm not sure Richard has merge rights?
addressed by https://github.com/geopython/pygeoapi/pull/676
@webb-ben has been working on bringing our desired pygeoapi up to date with master here
I need to set up a demo server that deploys his emerging solution with some id/uri specification scenarios to assist with PR review (and hopefully merge)