opengeospatial / ogcapi-processes

https://ogcapi.ogc.org/processes
Other
46 stars 45 forks source link

Inconsistent Part 3 `sortBy` definitions #429

Open fmigneault opened 1 month ago

fmigneault commented 1 month ago

The requirement says: https://github.com/opengeospatial/ogcapi-processes/blob/b5105df0384254e02dd3d77f2ce24d32fc6f2506/extensions/workflows/requirements/input-fields-modifiers/REQ_input-fields-modifiers_sorting.adoc?plain=1#L6-L8

However, the schema shows: https://github.com/opengeospatial/ogcapi-processes/blob/b5105df0384254e02dd3d77f2ce24d32fc6f2506/openapi/schemas/processes-workflows/fieldsModifiers.yaml#L13-L16

What is expected?

{ "sortBy": "a,-b,+c" }

or

{ "sortBy": ["a", "-b", "+c"] }

or both?

The string part should reuse the ordering pattern: https://github.com/opengeospatial/ogcapi-features/blob/0c508be34aaca0d9cf5e05722276a0ee10585d61/extensions/sorting/standard/openapi/parameters/sortby.yaml#L12-L17

Also, the second part of the requirement mentions CQL2 as sorting expression. CQL2 should allow the JSON variant as well? Should the schema use https://github.com/opengeospatial/ogcapi-features/blob/master/cql2/standard/schema/cql2.json instead?

Furthermore, given this different sorting language, should a sortBy-lang parameter be considered?

jerstlouis commented 1 month ago

Also, the second part of the requirement mentions CQL2 as sorting expression.

I was considering that the expression could also be an arbitrary CQL2 expression for sorting (I suggested in Montreal that Features could also consider that capability). The +field1,-field2` are already valid CQL2 expressions.

Whether using the comma separated inside a string or the separate array elements, it could support a CQL2 expression. In this case, the pattern would not be enough for arbitrary expressions.

The advantage of the single string it is is simpler to provide simple expressions, whereas the array is the only thing that will work with CQL2-JSON, and might be clearer for complex expressions. If we support CQL2-JSON, we could do arrays for CQL2-JSON and a single comma-separated strings for CQL2-Text, or we could always make it array of CQL2-JSON object or CQL2-Text string? Or do we also want to allow for a single string with comma separated expressions for simplicity and corresponding directly to the Features filter parameter?

Regarding the cql2.json schema, for CQL2-JSON only, but I am personally much more a fan of CQL2-Text in the workflows, which is the familiar and expressive way to express things, so at least I would like to keep the CQL2-Text option, even if we decide to also allow CQL2-JSON.

Given that the language is always CQL2, and that we can distinguish string vs. object easily, I would really avoid adding a -lang (same for filter).

fmigneault commented 1 month ago

I think the comma-separated string is easier, since it won't require the processing server to convert the array into the comma-separated string when calling, eg: features. It just passes the query as is.

Adding CQL2-Text and/or CQL2-JSON makes sense only if sorting expression could be defined, which I don't think we can technically do? I mean, we could have a list of {"property": "..."} and {"op": "-|+"}, but I don't think is would be valid CQL2, and does not add much compared to the comma-separated string.

Either way, unless the receiving API supports it as well, maybe there is no point.

jerstlouis commented 1 month ago

it won't require the processing server to convert the array into the comma-separated string when calling, eg: features. It just passes the query as is.

As mentioned in the other issue, I would really expect a Processes - Part 3 supporting input/output field modifiers to be able to do processing on its own, first and foremost because not all OGC API - Features implementation support Part 3 filtering.

Even if it passes things to a remote server, I would expect the server to parse the CQL2 and validate it rather than blindly passing it to the remote collection / process.

but I don't think is would be valid CQL2

A sorting expression is just a regular extended CQL2 expression (as in CartoSym-JSON, that is not limited to boolean predicate, as for derived fields), where things are sorted in the numerical order, which the basic -field1,+field2 nicely already corresponds to. We have nothing special to do, but for CQL2-JSON we can't stuff it all in one string so we need an array of CQL2-JSON objects.

unless the receiving API supports it as well, maybe there is no point.

Passing it along to the API is an optimization. There's still plenty of value in all this on the server side in terms of allowing to take the output of a process or a collection and reshaping it to be a suitable input to a process offered by this server.

m-mohr commented 1 month ago

The format originates from STAC and was ported to OGC API - Features/Records, so it should be worthwhile to have a look at the source to keep compliance with other OGC APIs and STAC: https://github.com/stac-api-extensions/sort It seems there were some changes over time in OGC APIs that make things uncompatible again. :-( Even across OGC APIs it's different (sortBy here, sortby in Records).

Furthermore, given this different sorting language, should a sortBy-lang parameter be considered?

Sort is a simple thing. Multiple languages are evil.

pvretano commented 1 month ago

All, there is NO encoding (yet) of the sortby in JSON. Neither Records nor Features has a JSON encoding for sortby and sortby is not, and will not, be a part of CQL 2. Sort by is defined in Records right now (only as a query parmeter). It is also defined in Features (again only as query parameter) but in Features it is defined as a separate part (Part 8 I think) of the Features suite of standards. Right now Part 8 is simply a copy of what is in Records. When Features is done, a corrignedum will be added to Records to use the definition from Features.

When we get to "Part X - Search" in Features we will consider a JSON encoding for sortby in the broader context of a JSON encoding for queries ... and then probaby update Part 8 accordingly.

One last comment, I took a look at the STAC JSON encoding for sortby and is seems fine to me. Actually @m-mohr pointed out that the "defaultSortOrder" metadata field in the catalog description uses a different JSON encoding than what STAC uses so I will be updating that in Records.