opengeospatial / ogcapi-features

An open standard for querying geospatial information on the web.
https://ogcapi.ogc.org/features

add support for sorting query results #157

Open tomkralidis opened 6 years ago

tomkralidis commented 6 years ago

Overview

Allow for sorting of query results against items endpoints.

In pygeoapi we are considering adopting the OGC CSW 2.0.2 implementation of sortby.

From Table 65:

List of Character String, comma separated.

Ordered list of names of metadata elements to use for sorting the response

Format of each list item is metadata_element_name:A indicating an ascending sort or metadata_element_name:D indicating descending sort

From 10.8.4.12:

The result set may be sorted by specifying one or more metadata record elements upon
which to sort.
In KVP encoding, the SORTBY parameter is used to specify the list of sort elements.
The value for the SORTBY parameter is a comma-separated list of metadata record
element names upon which to sort the result set. The format for each element in the list
shall be either element name:A indicating that the element values should be sorted in
ascending order or element name:D indicating that the element values should be sorted in
descending order.
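
As a rough illustration of the syntax quoted above, here is a minimal Python sketch of how a server might parse such a value. The function name `parse_csw_sortby` and the returned tuple shape are assumptions made for this example; they are not taken from CSW or from pygeoapi.

```python
def parse_csw_sortby(value):
    """Parse a CSW 2.0.2 style SORTBY value such as 'title:A,date:D'.

    Returns a list of (element_name, descending) tuples in request order.
    """
    sort_keys = []
    for item in value.split(","):
        # Split on the last colon so only the trailing A/D flag is removed.
        name, sep, order = item.strip().rpartition(":")
        if not sep or not name or order not in ("A", "D"):
            raise ValueError(f"invalid sort item: {item!r}")
        sort_keys.append((name, order == "D"))
    return sort_keys


print(parse_csw_sortby("title:A,date:D"))
```

The order of the returned list matters: the first entry is the primary sort key and later entries only break ties.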

Examples:

pvretano commented 6 years ago

@tomkralidis You should probably also review the sorting clause in the Filter encoding specification (http://docs.opengeospatial.org/is/09-026r2/09-026r2.html and https://portal.opengeospatial.org/files/?artifact_id=66226). I wrote both the CSW and FES sections on sorting so they should be fairly similar.

tomkralidis commented 6 years ago

Thanks @pvretano. Looks like the main difference in HTTP GET context is that CSW's sort uses a colon to separate the sort property from the sort order, whereas WFS uses a space.

I'd vote for the colon (or anything not a space).

pvretano commented 6 years ago

@tomkralidis sure ... sounds OK to me.

aaime commented 6 years ago

Say one implements an INSPIRE app-schema or CityGML, and wants to sort on an attribute. That would require an XPath, which uses a colon. This is a problem IMHO; consider either a character-escaping approach or quoting... This is needed whatever character is used for separation, but it may be best to use one that is very unlikely to appear in attribute references.
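
To make the ambiguity concrete, here is a small Python sketch. The property name `gml:name` is only a hypothetical example of a prefixed reference, and splitting from the right is one possible workaround, not the escaping or quoting approach suggested above.

```python
item = "gml:name:D"  # hypothetical prefixed property name plus sort flag

# A naive split treats every colon as a separator and yields three parts.
naive = item.split(":")

# Splitting once from the right keeps the prefixed name intact, but this
# only works if the trailing A/D flag is mandatory.
name, order = item.rsplit(":", 1)

print(naive)
print(name, order)
```

If the sort flag were optional, a bare `gml:name` would be indistinguishable from a property `gml` with flag `name`, which is why an escaping rule, quoting, or a different separator would still be needed.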

rcoup commented 6 years ago

I presume there's a reason, but why comma-separate attributes when repeating querystring/post parameters is perfectly fine under HTTP?

Another concept I've seen in a few places (Django uses it):

Pretty unlikely attributes will start with -, especially given GML uses them as element names, and they're illegal for XML.
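
A minimal Python sketch of this kind of convention, combining repeated parameters with a leading `-` for descending order. The parameter name `sort` and the function `parse_repeated_sort` are assumptions for illustration only.

```python
from urllib.parse import parse_qs


def parse_repeated_sort(query_string, param="sort"):
    """Return (attribute, descending) tuples from repeated query parameters."""
    sort_keys = []
    # parse_qs keeps repeated values in the order they appear in the query.
    for value in parse_qs(query_string).get(param, []):
        if value.startswith("-"):
            # A leading '-' marks a descending sort.
            sort_keys.append((value[1:], True))
        else:
            sort_keys.append((value, False))
    return sort_keys


print(parse_repeated_sort("sort=-updated&sort=name&limit=10"))
```

This relies on the client, any proxies, and the server framework all preserving the order of the repeated parameter.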

nmtoken commented 6 years ago

Just a nitpick, but in KVP the & closes the pair rather than opening it, so:

jampukka commented 6 years ago

@rcoup just have to be extra careful with repeating parameters (explode: true in OpenAPI terms) if the order is meaningful.

cholmes commented 6 years ago

cc @hgs-truthe01

In STAC we just added sorting - currently on dev for 0.6.0, and I believe Tim has an implementation of it. It's defined at: https://github.com/radiantearth/stac-spec/blob/dev/api-spec/extensions/sorting.fragment.yaml

We'd be happy to align with WFS3, but are going to ship this first version pretty soon.

cportele commented 5 years ago

The following is transferred from #23.


In WFS3 Core there's no way to specify the sorting order. Therefore paging is only really useful for "streaming" through the response in count-sized chunks. Access to previous page(s) might be easier to implement in the UI application (you already had the information as there's no way to skip pages with sequential forward-only next links).

When a sorting extension is added, paging becomes much more meaningful. Then you can access the last page by flipping the sorting order and accessing the first page, so I'd still vote no for last link.

_Originally posted by @jampukka in https://github.com/opengeospatial/WFS_FES/issues/23#issuecomment-424255074_
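
The "flip the sorting order" idea can be sketched in Python with an in-memory list standing in for a collection. The helper `first_page` and the sample data are hypothetical.

```python
# Hypothetical in-memory collection standing in for a feature store.
features = [{"id": i} for i in range(1, 26)]


def first_page(items, key, descending=False, limit=10):
    """Return the first page of items sorted on key."""
    return sorted(items, key=lambda f: f[key], reverse=descending)[:limit]


# First page in ascending order: ids 1..10.
head = [f["id"] for f in first_page(features, "id")]
# First page in descending order: the last ten features, highest id first.
tail = [f["id"] for f in first_page(features, "id", descending=True)]
print(head)
print(tail)
```

Note that the flipped first page holds the last `limit` features in reverse order, which is not necessarily the same set as the final (possibly shorter) page of the ascending sequence.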


Without explicit sorting, does pagination have much meaning? Pagination of an unsorted (or at least, not-explicitly-sorted) result seems like a useful way to break up a request that would otherwise be too large for client or server, but random page access to unsorted records doesn't seem like a use case for anything apart from achieving these small sequential payloads. A client cannot sort without access to the full collection.

Given that sorting is not mentioned in the Core (I think; I've only just read the specification), is there a contract that a particular page has coherent content with its linked adjacent rel pages? (For the moment assuming no race-condition changes to the underlying dataset — which itself makes pagination problematic due to the potential presence of duplicates, etc.) Hypothetically (ignoring a caching layer), if you could retrieve an entire collection in a single request, must the result be identically sorted each time?

Sorting features of a collection by time (if temporal) and then secondarily by primary key seems like a suitable implementation for some servers and datasets, but obviously primary keys are not necessarily in any kind of semantic and/or alpha-numeric order so ideally function only as tie-breakers when sorting on some other property. Is time a default sort condition? And if so, what of features without temporal information: is it determined by their name? What of features with durations?
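
As a sketch of the "time first, primary key as tie-breaker" ordering, in Python. Placing features without temporal information last is an assumption made for this example, since the question of how to order them is left open above.

```python
from datetime import datetime

# Hypothetical features; "time" may be missing (None).
features = [
    {"id": "b", "time": datetime(2019, 3, 1)},
    {"id": "a", "time": datetime(2019, 3, 1)},
    {"id": "c", "time": None},
    {"id": "d", "time": datetime(2018, 1, 1)},
]


def sort_key(feature):
    t = feature["time"]
    # (missing flag, time, id): features without a time sort last,
    # and the id only breaks ties between equal times.
    return (t is None, t or datetime.min, feature["id"])


ordered = [f["id"] for f in sorted(features, key=sort_key)]
print(ordered)
```

Because the key always ends in a unique id, the resulting order is total and repeatable, which is what stable paging needs.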

Being able to bypass a server's maximum limit https://github.com/opengeospatial/WFS_FES/issues/152 would be one workaround to the problem of pages omitting/duplicating features on different pages: a result is at least internally consistent at time of request. But it also has its own issues.

Another solution might be the ability to return a list of all pages upfront, thereby giving a client the opportunity to request pages in parallel rather than sequentially. This has advantages to the client beyond reducing the probability of data in a collection changing while paginating: including client-side latency, and UI advantages when aiming to render a First Previous 2 3 Current 5 6 Next Last pagination UI.

Hypothetically, pages could include some information about their range (e.g. temporal extent if relevant), which would provide additional benefits to both human and computer interaction. For large collections this is still probably not feasible, since it may not be possible to efficiently compute where page boundaries lie in a sorted collection beyond previous/next relations. However, given that a collection's feature count is known at request time, as is the page limit, a simple list of all pages seems feasible, though it might fall into the realm of a client optimisation if pages can be reliably constructed using standard query parameters. (Which the existing spec says is not mandated.)
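
A minimal Python sketch of deriving such a page list from the feature count and the page limit. The `offset` parameter and the example URL are assumptions; the spec does not mandate that pages can be addressed this way.

```python
import math


def page_links(base_url, number_matched, limit):
    """List one URL per page, assuming a hypothetical 'offset' parameter."""
    pages = math.ceil(number_matched / limit)
    return [f"{base_url}?limit={limit}&offset={n * limit}" for n in range(pages)]


for link in page_links("https://example.org/collections/roads/items", 25, 10):
    print(link)
```

A client holding this list could request pages in parallel, although the result is only coherent if the server applies the same ordering to every request.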

Pages themselves could even take on an explicit spatial property, perhaps if pages are organised (features are sorted) into some kind of tessellated grid, like DGGS—at this point perhaps a client would prefer vector tiles.

These are more rambling and obvious thoughts than coherent contributions. My point is really the same as @jampukka's:

To this I'd add:

_Originally posted by @alpha-beta-soup in https://github.com/opengeospatial/WFS_FES/issues/23#issuecomment-475110241_

tomkralidis commented 4 years ago

Note relevant work in STAC: https://github.com/radiantearth/stac-spec/pull/513 which could be of use.

cportele commented 3 years ago

Sorting is currently specified by Records, see http://docs.opengeospatial.org/DRAFTS/20-004.html#clause-sorting.

This should eventually be moved to Common. Features can simply specify support for sorting by adding a requirements class that binds the sortby parameter to the Features resource.

apfelnymous commented 1 year ago

Any news on this? Trying to sort data in the GML file doesn't seem appropriate.

pvretano commented 1 year ago

@apfelnymous what specifically are you asking about?

If your question is about when sorting will be specified in the Features specifications then ...

Sorting is still on the roadmap to be moved from Records to Features (and eventually Common) but right now we are concentrating on finishing CQL2 and the other active Parts of the Features suite of specifications. For the time being, as @cportele mentioned above, if you have a sorting requirement then simply implement the sortby query parameter at the /items endpoint as described in Records.
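
A client-side Python sketch of what a request against such an endpoint might look like. It assumes the `+`/`-` prefix form of `sortby` from the Records draft (check the linked clause for the authoritative syntax); the host, collection id, and the helper `items_url` are hypothetical.

```python
from urllib.parse import urlencode


def items_url(base_url, collection_id, sort_keys, limit=10):
    """Build an /items URL; sort_keys is a list of (property, descending)."""
    # Assumed syntax: '+' for ascending, '-' for descending, comma separated.
    sortby = ",".join(("-" if desc else "+") + name for name, desc in sort_keys)
    # urlencode percent-encodes '+' so it is not decoded as a space.
    query = urlencode({"limit": limit, "sortby": sortby})
    return f"{base_url}/collections/{collection_id}/items?{query}"


print(items_url("https://example.org", "roads", [("title", False), ("updated", True)]))
```

The percent-encoding matters here: a literal `+` in a query string is commonly decoded as a space, so an unencoded ascending prefix could be lost before it reaches the server.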

apfelnymous commented 1 year ago

@pvretano A colleague asked me about sorting my features in my files to have them represented correctly in the collections view. As that approach seemed weird to me I was looking for a way to do that at service interface level.