Closed m-burgoyne closed 4 years ago
This goes counter to what we've been working on in #12 and in the sprint. I feel strongly that EDR Items should be compatible with OGC-API Features. (Note WFS terminology is only applicable to the older OWS services)
I would rather say that the EDR GeoJSON Items schema should have a uri field meant to contain a unique identifier for an EDR item. That would allow queries like: /collection/{collectionid}/items?uri=http://feature_identifier
@dblodgett-usgs If Items in EDR API is API-Features compatible, why not use the Features API to retrieve the items? Then we leave the EDR API to be a coordinate based query API.
That's what I'm proposing. But the items would be EDR-features -- which extends API Features schema to be GeoJSON with particular properties that can be used as hypermedia.
@dblodgett-usgs @m-burgoyne @tomkralidis Then I think your proposal should be an extension of API-Features. I see no reason to complicate the EDR, and perhaps confuse the users, by having EDR-Features. I agree that we need a name for the Point, Timeseries, Trajectory, Polygon, ..., components of the EDR API, but these are Discrete Sampling Geometries, or 'shapes' or 'data patterns', retrieving from a relatively persistent and dense (not sparse) data store. I think the name 'features' is too confusing.
I would rather keep the scope tight and deliver quickly for others to explore and experiment with.
It would be a real shame to lose site-based environmental data from the scope of EDR. This is going against my understanding of our intended scope here, so let me make sure I follow the proposal.
We are saying we might have urls like:
/collections/col-1234/identifier/
Which would return available sampling geometries if they are identifiable.
Rather than:
/collections/col-1234/items/
Which would return the same as above.
I feel like introducing a new API pattern, /identifier, for sampling features adds complexity in that we are introducing a new way of handling collections of features.
If you want to simplify EDR even further and say that these sampling geometries are not identifiable beyond the scope of a given API call, then dropping discussion of items or identifier altogether would be appropriate.
On the other hand, going by Jeff Yutzler's point that the type of thing that comes back from a given API URI key (collections was the subject of what I'm referring to) should be consistent, we are proposing something slightly different from API-Features and having different API URI patterns would make that clear.
So what's the logic / design pattern you guys want to use here @m-burgoyne and @chris-little ? I can get on board with switching to identifier rather than items but we need to be clear what the reasoning is and make sure we can justify adding diversity for those who would rather see closer alignment of the API pattern.
p.s. @m-burgoyne shouldn't it actually be identifiers? A given EDR collection will potentially have many pre-defined identifiers.
@dblodgett-usgs I agree it should be identifiers, but I don't think I am doing a good job of explaining the idea and the difference from the WFS items query. The identifier would be a label for a location. That label could be a name (e.g. London), a station id (e.g. 03772), or even something like a GeoHash, but it is not intended as a reference to an individual feature; it is a shorthand for the geospatial information required for the query. The query would still allow the requester to subset the data available from the underlying store by time, parameters etc., but the geospatial part of the query is already predefined by the list of available identifiers.
Before we go any further, can we please be more precise here? WFS does not have an items query. OGC-API Features does. 🤓
I'm fine with this distinction. Station ids, in the sense of monitoring locations, almost always represent a complex system of sensors and other field activities so treating them as simple features tends to miss important details that people inevitably need. Opening up the potential to handle that complexity outside the scope of OGC-API Features sounds good to me. We can still have feature collections as part of a given OGC-API instance but associate the feature collections to identifiers in the EDR side.
Leading a future discussion a bit -- I think we are very likely going to have to reckon with whether or not EDR datasets are collections or something different, but we'll leave that for after the OAB has had their say.
@m-burgoyne - If the identifier is a label for a location, shouldn't it be /collections/col-1234/location/{locationId}, e.g., /collections/col-1234/location/london?
@cportele - It depends on the decision for the suggestion to rename the query end points. I agree that there could be better descriptions, location-id or location-identifier for instance.
I was trying to make the same point about sites here: https://github.com/opengeospatial/Environmental-Data-Retrieval-API/issues/20#issuecomment-594146963 - they are emphatically not just coordinate locations, they are usually complex features with real identity.
Thanks @dr-shorthair I didn't appreciate the nuance earlier.
I like plain location because it is parallel to point, line, polygon, etc.
@m-burgoyne I see you got this started in #32 but did not remove the "item" query type. Are you still working on that stuff or would you like some help moving this stuff along?
@jkreft-usgs and I worked up a draft JSON-Schema for the spatial representation of these locations here: https://github.com/opengeospatial/EDR-API-Sprint/issues/12 and here https://jkreft-usgs.github.io/edr_site_based/ that should be ready to be incorporated into what you've been working up. Happy to hack at that unless you have the docs open and want to keep going as the primary editor.
@dblodgett-usgs, I think /identifier could be complementary to /items so I haven't removed it from the API definition. I intend to put in a fresh pull request to fix an error in the trajectory definition so I could add in the JSON-Schema code at the same time.
Interesting -- based on @chris-little's stance above,
I see no reason to complicate the EDR, and perhaps confuse the users, by having EDR-Features.
I would have expected us to let items be an API-Features concept rooted in feature-collections which have items and EDR to define locations in the sense of monitoring / sampling locations.
That's what I'm proposing. But the items would be EDR-features -- which extends API Features schema to be GeoJSON with particular properties that can be used as hypermedia.
Is the idea here that the very same service could be OGC API Features (Core) conformant and EDR conformant at the same time?
@tervo I would say that having EDR be an extension of OGCAPI-Features would be ideal, and could drive adoption. The ecosystem of tools to interact with OGC API Features is rapidly growing. We should be doing our best to fit within that paradigm.
Yep -- some parts of EDR could be extensions of things in Features such that a single OGC API could present the same dataset represented as a collection(s) of features or via EDR query patterns.
I would say that having EDR be an extension of OGCAPI-Features would be ideal, and could drive adoption. The ecosystem of tools to interact with OGC API Features is rapidly growing.
Yes. I very much agree.
Coming back to original question here. Having a different terminology with OGC API Features looks more confusing to me. I see environmental items as items whether they are sampled from a data cube or not.
It's more to do with the payload than the API path. If a client hits an .../items end point, it expects to get back something that is like the .../items it got elsewhere. In EDR, our sampling features are a little more complicated than simple-features/geojson -- so we could put them behind a different path (/locations or /identifier) such that clients have an easier time figuring out what's what.
Note that this would not preclude implementation of API Features conformant views of the sampling features, but they would first and foremost be just that, API Features conformant views of the features.
+1 to keep as close to OGC API - Features as possible. This will help greatly drive adoption and reuse of code (just like OGC API - Common is doing the same for /, /conformance, and /collections).
@tomkralidis do you have a stance on whether .../collections should always be feature collections and whether .../items should always return (be capable of anyways) simple features?
Using the /items endpoint by overloading it with sampling semantics could cause as much confusion as it helps through adoption and reuse.
@dblodgett-usgs /collections has an itemType property that we could use to delineate accordingly. IMHO for /collections/{collectionId}/items, as long as we can communicate the model (JSON schema in OpenAPI document), then we can safely (enough) say what we are serving to the client?
Yeah -- I see that potential for sure.
I think the issue is that it's a bit of a slippery slope in the big picture. The more complexity (like flexible typing) we put into these end points the less adoptable they become. Some is needed because people expect it. Too much becomes a deal breaker.
On this issue in particular, I think we've basically determined that:
At a minimum, there should be an EDR best practice to include any desired sampling geometries in one or more API-Features conformant collections.
In #38 I've opened the discussion of whether the EDR spec should even touch the collections topic or leave that to features and extensions of features. That's a related issue, but since "locations" is already in the draft spec, I think this particular issue is tapped out.
@m-burgoyne do you have more to explore here? A lot of moving parts right now.
@m-burgoyne @dblodgett-usgs @chris-little As I understand, the statement, "using a location identifier to select data rather than a coordinate definition.", can be rephrased as using a feature rather than the specific required parameters, i.e. crs and coords. The feature can be represented using a named place or feature id predefined in a feature store. In this regard, it is about the features of interest like /collections/typhoons/items/dianmu in the WHU case, serving as a replacement for crs/coords, to be used for the rain data retrieval. @dr-shorthair Features of interest (like in O&M) can be an optional parameter in point, polygon, and trajectory. In other words, the geometric part of the sampling objects (point, polygon, and trajectory) can reuse existing features, which is user friendly, like using named places instead of coordinates.
Another note is about the vague yet distinct terms: EDR-feature, sampling feature, and OGC-API features. 1) I would prefer to leave items the same as the OGC API Features. I concur with the scope #20 . The definition of EDR-Feature is out of the current scope and will complicate the current API as a kind of profiles for OGC-API Features. 2) In addition, I would propose to rename the terms point and polygon to samplingpoint and samplingpolygon, to avoid the confusion with the public understanding of geometries, since the terms in EDR-API have more semantics than geometric terms.
Thanks @geopyue -- Let me see if I am following your proposal.
1) You would leave /items out of EDR, allowing EDR compliant services to also be compatible with features but not include /items specific conformance classes in EDR?
2) You want paths to use sample... like: /collections/{collectionID}/samplingpoint?...?
@dblodgett-usgs 1) Yes. I prefer to leave /items out of EDR. Items are individual resources included in a collection resource. EDR-API "retrieve various common data patterns", where we may argue patterns are items. For example, in an environmental data collection, an item is a sampling subset or data pattern determined by querytypes/samplingmethods. Currently we may agree that items are sampling features. Some may argue later that items are sampling coverages, or sampling processes (if sampling subsets can be represented using samplingmethods from a WPS perspective). 2) Yes. /collections/{collectionID}/point is a little bit hard to digest for new end users. I would prefer to see /collections/{collectionID}/samplingpoint or /collections/{collectionID}/pointsampling, e.g. /collections/{collectionID}/pointsampling?parametername=rainfall&featureofinterestes=*/collections/typhoons/items/dianmu&...
I follow now. Interesting approach -- I worry that we have trouble describing what EDR features of interest exist. Would that be left to the API-Features or an EDR-based extension of it?
Geometric conception of place and named place are two ways for place consumption. We leave the options for service vendors. The EDR vendors can choose to host a feature store accessible through OGC-API features, where interested features can be used conveniently by end users to interact with EDR. But sure the coordinate way still can be used.
What about things like available parameters and time range? How do the features relate to EDR collections?
Thanks @geopyue that was also where I was going with https://github.com/opengeospatial/Environmental-Data-Retrieval-API/issues/20#issuecomment-594146963
@dblodgett-usgs The available parameters and time range are still used as before. We only provide an option for replacing crs and coords. I agree with the comment @dr-shorthair in #20. At the implementation level, features can be transformed into crs and coords to be applied to the original EDR collections.
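The feature-to-coords transformation mentioned here could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names feature_to_coords and trajectory_query are hypothetical, positions are assumed to be [x, y, m] triples, and the crs is fixed to EPSG:4326.

```python
from urllib.parse import urlencode


def feature_to_coords(feature):
    """Render a GeoJSON LineString feature as a LINESTRINGM WKT string.

    Assumes every position is an [x, y, m] triple (hypothetical convention).
    """
    geometry = feature["geometry"]
    if geometry["type"] != "LineString":
        raise ValueError("only LineString features are handled in this sketch")
    points = ", ".join(
        " ".join(str(value) for value in position)
        for position in geometry["coordinates"]
    )
    return f"LINESTRINGM({points})"


def trajectory_query(feature, parameter, time):
    """Build the query string a server might derive from a feature of interest."""
    return urlencode(
        {
            "coords": feature_to_coords(feature),
            "crs": "EPSG:4326",  # assumed; a real feature store may differ
            "parametername": parameter,
            "time": time,
        }
    )


# A shortened, illustrative track with two positions.
track = {
    "type": "Feature",
    "id": "dianmu",
    "geometry": {
        "type": "LineString",
        "coordinates": [[112.7, 21.1, 1471392000], [112.5, 20.9, 1471402800]],
    },
    "properties": {},
}
print(feature_to_coords(track))
```

With this approach the EDR side never needs to know about feature identifiers; it only ever sees crs and coords.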
Here are some examples.
Feature GET
http://geos.whu.edu.cn/feature_api/collections/hainan_typhoon/items/dianmu
Response in GeoJSON
{
"type": "Feature",
"id": "dianmu",
"geometry": {
"type": "LineString",
"coordinates": [
[112.7, 21.1, 1471392000],
[112.5, 20.9, 1471402800],
[112.6, 20.5, 1471413600],
[112.7, 20.6, 1471424400],
[112.7, 20.7, 1471435200]
]
},
"properties": {
"code": 1608,
"name_en": "DIANMU",
"name_zh": "电母",
"description": "名字来源于:中国 意为:神化中的雷电之神",
"status": "stop",
"bbox": [104, 20.4, 112.7, 21.1]
}
}
EDR Request option 1 using coords:
GET
http://geos.whu.edu.cn/edr_api/collections/hainan_weather/trajectory?coords=LINESTRINGM(112.7 21.1 1471392000, 112.5 20.9 1471402800, 112.6 20.5 1471413600, 112.7 20.6 1471424400, 112.7 20.7 1471435200)&crs=EPSG:4326&parametername=rainfall&time=2019-06-01T00:00:00Z/2019-09-30T00:00:00Z
EDR Request option 2 using featureofinterests:
GET
http://geos.whu.edu.cn/edr_api/collections/hainan_weather/trajectory?foi=http://geos.whu.edu.cn/feature_api/collections/hainan_typhoon/items/dianmu&parametername=rainfall&time=2019-06-01T00:00:00Z/2019-09-30T00:00:00Z
Response in CoverageJSON:
{
"type": "Coverage",
"domain": {
"type": "Domain",
"domainType": "Trajectory",
"composite": {
"dataType": "tuple",
"coordinates": ["t", "x", "y"],
"values": [ ["2016-08-17T08:00:00Z", 112.7, 21.1], ["2016-08-17T11:00:00Z", 112.5, 20.9], ["2016-08-17T14:00:00Z", 112.6, 20.5], ["2016-08-17T17:00:00Z", 112.7, 20.6],["2016-08-17T20:00:00Z", 112.7, 20.7]]
}
},
"parameters": {
"rainfall": {
"type": "Parameter",
"id": "rainfall",
"description": {
"en": "Rainfall value"
},
"unit": {
"label": {
"en": "millimetre"
},
"symbol": {
"value": "mm"
}
},
"observedProperty": {
"id": "http://vocab.nerc.ac.uk/standard_name/rainfall/",
"label": {
"en": "Rainfall"
}
}
}
},
"ranges": {
"rainfall": {
"type": "NdArray",
"dataType": "float",
"axisNames": ["composite"],
"shape": [5],
"values": [12.2, 12, 13.3, 11.2, 8.7]
}
}
}
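A client consuming a response like the one above could pair each trajectory tuple with its range value along these lines. This is a minimal sketch that follows the layout of the example response (domain.composite), which is an assumption about the payload and not a statement of what the CoverageJSON specification requires; trajectory_samples is a hypothetical helper name.

```python
def trajectory_samples(coverage, parameter):
    """Pair each trajectory tuple with the matching range value.

    Assumes the layout of the example response: the tuple axis lives at
    coverage["domain"]["composite"] and the range is a flat list.
    """
    composite = coverage["domain"]["composite"]
    names = composite["coordinates"]  # e.g. ["t", "x", "y"]
    positions = composite["values"]
    values = coverage["ranges"][parameter]["values"]
    if len(positions) != len(values):
        raise ValueError("domain and range lengths differ")
    samples = []
    for position, value in zip(positions, values):
        sample = dict(zip(names, position))
        sample[parameter] = value
        samples.append(sample)
    return samples


# Shortened copy of the example response, two positions only.
coverage = {
    "type": "Coverage",
    "domain": {
        "composite": {
            "coordinates": ["t", "x", "y"],
            "values": [
                ["2016-08-17T08:00:00Z", 112.7, 21.1],
                ["2016-08-17T11:00:00Z", 112.5, 20.9],
            ],
        }
    },
    "ranges": {"rainfall": {"values": [12.2, 12]}},
}
for sample in trajectory_samples(coverage, "rainfall"):
    print(sample)
```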
@geopyue the reason I suggested the /identifiers (currently /locations in the OpenAPI docs) endpoint was to allow the query to be structured as:
http://geos.whu.edu.cn/edr_api/collections/hainan_weather/locations/dianmu?parametername=rainfall&time=2019-06-01T00:00:00Z/2019-09-30T00:00:00Z
Where http://geos.whu.edu.cn/edr_api/collections/hainan_weather/locations/ would return a list of location identifiers for the collection and a description of what they are and their extents
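For illustration, a client could assemble a location-based query like this. It is a sketch only: location_query_url is a hypothetical helper, and the /locations path and parameter names are taken from the example URL above, not from a finalized spec.

```python
from urllib.parse import quote, urlencode


def location_query_url(base_url, collection_id, location_id, **params):
    """Build a /locations/{id} query URL for a collection (hypothetical helper)."""
    path = "{}/collections/{}/locations/{}".format(
        base_url.rstrip("/"),
        quote(collection_id, safe=""),
        quote(location_id, safe=""),
    )
    # Keep ':' and '/' readable so ISO 8601 intervals stay intact.
    query = urlencode(params, safe=":/")
    return f"{path}?{query}" if query else path


url = location_query_url(
    "http://geos.whu.edu.cn/edr_api",
    "hainan_weather",
    "dianmu",
    parametername="rainfall",
    time="2019-06-01T00:00:00Z/2019-09-30T00:00:00Z",
)
print(url)
```

Calling the helper without query parameters gives the path for that one location; the listing of all identifiers would sit one level up at .../locations/.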
@dblodgett-usgs I do think there is value in having a features core end point in EDR.
I saw the /items EDR endpoint as an EDR profile of the features core specification (i.e. it has a well defined schema for the GeoJSON output), but I think it is essential that it does not extend the behaviour and functionality of features core.
OK, it seems that we are converging on keeping /items as a way to discover existing EDR sampling geometries and keeping the behavior of the /items endpoint we have in EDR 100% compatible with Features core, i.e. the /items endpoint returns a feature collection of EDR features (sampling geometry metadata).
I think this means that we will not use items where we would expect to get back EDR data rather than a feature collection. Agreement here?
Where I think we are not quite converging but I see a path is the issue of locations. Let me describe what I think I see as the path and we can go from there. I'm using will and could to try and capture what I think we have agreed to and what I think we might be able to agree to.
/locations endpoint that can only be queried by location identifier.
{EDR-query-pattern}/items endpoint(s), which would behave as the items endpoint of any API-Features API would.
item will be accessed using a {EDR-query-pattern}/items?identifier={item local id}
e.g. (noting this pattern applies to any EDR query type)
/position/items would give a feature collection adhering to this schema where we might discover a feature of interest with local id position_1234. These features would be positions available from the EDR dataset.
/position/items?identifier=position_1234 or /position/items/position_1234 would give data available from position_1234 and additional query parameters could be used to limit data returned.
In addition to the items endpoint on EDR query patterns, /locations/items would also return a feature collection of locations. These are monitoring locations, or otherwise identifiable locations with nondescript or otherwise abstract geometry. They would necessarily have some representative geometry, but it is not a "sampling geometry" in the case of locations.
/locations/items?identifier=loc_1234 would behave the same as the other EDR-Query-Pattern endpoints.
I think that @dblodgett-usgs's proposal seems really reasonable
Agreed at EDR API SWG 11
@m-burgoyne can you close this with a commit when the change has been applied? Or do you want help applying this change?
@dblodgett-usgs to keep this better aligned with the feature specification wouldn't it be better to have the following:
/position/items would give a feature collection adhering to schema where we might discover a feature of interest with local id point_1234. These features would be points available from the EDR dataset.
/position/items/point_1234 would give data available from point_1234 and additional query parameters could be used to limit data returned.
The same approach would apply to /area, /cube, /trajectory and /corridor
I updated my comment above to use up-to-date EDR query patterns and further describe what my intention was. I think we are in agreement? Or am I missing a nuance that is different?
The only real difference is passing the identifier_id as a path parameter rather than a query parameter.
Ahh I see -- yeah, they should both work. The path parameter is the internal ID of the features in the feature collection. As long as the feature collection has an attribute, "identifier" that is mapped onto the feature ID, then this:
/position/items?identifier=position_1234
is equivalent to:
/position/items/position_1234
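A server-side sketch of treating the two forms as equivalent might look like the following. requested_item_id is a hypothetical helper, and the precedence of the path segment over the query parameter is an assumption made for the example, not something agreed in this thread.

```python
from urllib.parse import parse_qs, urlsplit


def requested_item_id(url):
    """Return the item id from either .../items/{id} or .../items?identifier={id}.

    Returns None when the URL addresses the whole feature collection.
    """
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if "items" not in segments:
        return None
    index = segments.index("items")
    if index + 1 < len(segments):
        # Path parameter form: the id is the segment after "items".
        return segments[index + 1]
    # Query parameter form: relies on "identifier" mapping onto the feature id.
    identifiers = parse_qs(parts.query).get("identifier")
    return identifiers[0] if identifiers else None


print(requested_item_id("/position/items?identifier=position_1234"))
print(requested_item_id("/position/items/position_1234"))
```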
Realizing my examples above were not quite right still -- will fix.
I am modifying the documents but looking at the result do we need an /items endpoint after every query type?
If we modify the EDR GeoJSON schema to add a value to the properties which describes the structure of each of the available items in the collection (i.e. cube, trajectory, point, polygon etc.), a standalone /items query will allow users to more easily discover all available items for a collection. Putting it after the query type forces the user to know the shape of the item before they make the query.
It would also be worth adding the output format(s) value that the item will be delivered in to the EDR GeoJSON properties
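To make the idea concrete, here is a small sketch of how a client might use such properties on a standalone /items response. The property names queryType and outputFormats are hypothetical placeholders; the thread does not settle on names.

```python
def items_by_query_type(feature_collection, query_type):
    """Return the ids of items whose (hypothetical) queryType property matches."""
    return [
        feature.get("id")
        for feature in feature_collection.get("features", [])
        if feature.get("properties", {}).get("queryType") == query_type
    ]


# Illustrative payload only; ids and property values are made up.
items = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "position_1234",
            "geometry": None,
            "properties": {"queryType": "position", "outputFormats": ["CoverageJSON"]},
        },
        {
            "type": "Feature",
            "id": "track_1",
            "geometry": None,
            "properties": {"queryType": "trajectory", "outputFormats": ["CoverageJSON"]},
        },
    ],
}
print(items_by_query_type(items, "trajectory"))
```

The client can then discover every item in one request and filter by shape locally, instead of needing to know the shape before querying.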
That's a good question. I think we are getting a little far afield on this issue.
Let's get the existing spec using items as it stands and take up the additional related issues in #38 and possibly an additional issue related to /items being available for each query type vs. one for a whole API?
Discussed at EDR API SWG number 12.
The concept behind adding /items was to provide an approach to adding a query to support using a location identifier to select data rather than a coordinate definition. To avoid confusion with the WFS /items query where each feature has a unique item_id, I suggest that this end point is renamed to /identifier and the identifier_id is a unique identifier for the location.