opengeospatial / ogcapi-common

OGC API - Common provides those elements shared by most or all of the OGC API standards to ensure consistency across the family.
https://ogcapi.ogc.org/common

Collection of Collections #11

Open cmheazel opened 5 years ago

cmheazel commented 5 years ago

Can a collection contain other collections? The GFM says yes, but that may be too much complexity to address in simple core.

ghobona commented 3 years ago

The 3D Data Container and Tiles API pilot demonstrated how a collection of collections could be supported. The SWG should consider the lessons from that work, in relation to OGC API - Common - Part 2: Geospatial Data. Please see Section 8.2.2 of OGC 20-029 for an example.

jerstlouis commented 3 years ago

@ghobona Except the approach of the 3DC&T is not extensible in the sense that a client that does not support hierarchy will not be able to access all collections being served by a server that implements hierarchy. The approach detailed above provides exactly the same functionality, yet is fully extensible: a client that does not support hierarchy will be able to use all collections as a flat list.

Note that the approach proposed above was also the first one tested and validated (by multiple participants) in the 3DC&T, except that slashes were used rather than colons as separators at the time (the proposal to use : rather than / is because / is not a valid character inside an OpenAPI id). This issue was also re-examined in the September ISG sprint (https://github.com/opengeospatial/OGC-ISG-Sprint-Sep-2020/issues/5).

cmheazel commented 2 years ago

Given that /collections is no longer required to reside on the landing page, the path /collections/{collection_id}/collections/{collection_id} is now allowed. Does that resolve this issue?

jerstlouis commented 2 years ago

@cmheazel no, this does not address the issue in any of the ways that my proposal does, i.e. it:

pvretano commented 2 years ago

@jerstlouis I really don't like this idea of mixing path separators (/ & :) in a single path - even if it is legal-ish! Independent of the mechanics of how collections of collections are represented, however, I don't think this notion should be in Part 2 at all. Part 2 should reflect the common practice in OGC APIs at the moment, which does not include this idea, nor has it been forged in the fires of implementation. It shouldn't, in my opinion, even be an optional conformance class in Part 2. If it is included as part of Common, it should be its own part. Do we have any idea how many OGC API SWGs are working on this idea? Any implementations out there? I heard Records mentioned in this thread. I guess Coverages too? I point people to issues #170 and https://docs.opengeospatial.org/per/18-045.html#hierarchical_paths_extension for additional discussions about hierarchies.

jerstlouis commented 2 years ago

@pvretano It was determined that / is definitely problematic inside the id, which is why we switched to : (which I hope is okay... but we could really use whichever character is deemed best; it becomes reserved for this purpose only when the service implementation conforms to the hierarchical collections conformance class).

Part 2 should reflect the common practice in OGC APIs at the moment, which does not include this idea, nor has it been forged in the fires of implementation.

I agree with that, but Part 2 is still at an early stage (Part 1 is getting finalized soon), and I have been trying to build some enthusiasm for putting such a hierarchical extension through the fires of implementation since Testbed 13, and it has actually been discussed and tested in several pilots and testbeds.

It shouldn't, in my opinion, even be an optional conformance class in Part 2. If it is included as part of Common, it should be its own part.

Of course, if it is in Part 2, it must be optional. I don't really have an issue with it being a separate part, except that it is a super simple conformance class consisting of very few requirements and permissions, e.g.:

  • A colon character (:) in a collection id SHALL indicate a separation of hierarchy components for a hierarchical collection
  • The response for a GET request on a parent collection resource (/collections/{collectionId}) SHALL include a collections key listing children collections (following the same requirements as collections listing at /collections)
  • For representations intended to be directly used as a user interface (such as HTML), the response for a GET request listing collections at /collections, or on a parent collection resource, MAY include only immediate children collections.

With any representation (e.g. JSON), an additional query parameter (e.g. expandChildren=false) could also be used by hierarchy-aware clients not wishing to expand everything, so that only immediate children are included when it is specified, without breaking hierarchy-unaware clients.
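A minimal sketch of how a server might apply these rules to a flat list of collection ids follows. It assumes the colon separator and an `expandChildren`-style flag as proposed in this comment; the function name, parameter names and sample ids are hypothetical and not taken from any published standard.

```python
# Hypothetical helper illustrating the proposal above. The ":" separator and
# the expand_children flag mirror this comment, not a published standard.
SEPARATOR = ":"


def list_children(collection_ids, parent=None, expand_children=True):
    """Return the ids located below `parent` (top level when parent is None).

    expand_children=True returns every descendant, which is the flat list a
    hierarchy-unaware client would see. expand_children=False returns only
    the immediate children.
    """
    prefix = "" if parent is None else parent + SEPARATOR
    result = []
    for cid in collection_ids:
        if not cid.startswith(prefix):
            continue  # not below the requested parent
        remainder = cid[len(prefix):]
        if not remainder:
            continue  # defensive: skip an id that is just the prefix
        if expand_children or SEPARATOR not in remainder:
            result.append(cid)
    return result


# Hypothetical sample ids, for illustration only.
ids = ["basemap", "basemap:roads", "basemap:roads:primary", "elevation"]

print(list_children(ids))                         # flat list, all collections
print(list_children(ids, expand_children=False))  # top-level collections only
print(list_children(ids, parent="basemap", expand_children=False))
```

With this shape, a client that knows nothing about hierarchy keeps receiving every collection, while a hierarchy-aware client can ask for one level at a time.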

About:

Do we have any idea how many OGC API SWGs are working this idea? Any implementations out there? I heard Records mentioned in this thread. I guess Coverages too?

The nice thing is that it would automatically work for any OGC API specification using /collections/{collectionId} (Part 2 - Core conf. class). Those specifications do not need to reference it or anything, and implementations that use hierarchical collections still work as they always do if the client does not recognize them. But a client that implements support to recognize hierarchy can share the same code for any specifications using Collections.

That would mean at least: Maps, Tiles, Features, Coverages, EDR and GeoVolumes. It is the type of Common conformance class which automatically brings value without having to be depended on, as I brought up on Monday (like the Simple Query conformance class). This is one of my arguments for why I think it should be in Part 2, once ironed out and gone through the fires of the OGC Innovation & Standards Programs forge, if that happens before Part 2 is at the stage where it needs to be.

Any implementations out there?

There is ours (a great example that goes beyond what separate data stores could do is the Natural Earth bathymetry layers) that has been there since Testbed 14. We have matching support in our GNOSIS Cartographer client, which presents these in a hierarchical manner and selects, based on the available APIs, the most efficient mechanism to retrieve data (e.g. multilayer tilesets vs. individual layers).

At some point, GeoServer also was using : as separator for the Vector Tiles Pilots (so it was a partial implementation of this) and our clients recognized the hierarchies the same way.

Also an early vector tiles WMTS implementation from CubeWerx :) Also in 6.2 in that ER:

by adopting a layer ID convention where the ID of a component layer is prefixed with the ID of the conglomerate layer and a colon (":"). E.g., if "Daraa" is the conglomerate layer, the client could know by convention that the layers "Daraa:AgricultureSrf" and "Daraa:CulturePnt" (also advertised by the WMTS capabilities document) are component layers of "Daraa".
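As a rough illustration of the convention quoted above, a client could group component layers under their conglomerate layer by splitting each id at the separator. This is only a sketch: the function name is hypothetical, and it assumes the advertised ids really do follow the prefix-and-colon convention.

```python
from collections import defaultdict


def group_components(layer_ids, separator=":"):
    """Map each conglomerate layer id to the ids of its component layers."""
    groups = defaultdict(list)
    for layer_id in layer_ids:
        # rpartition splits at the last separator; `sep` is empty when the
        # id contains no separator, i.e. when the layer has no parent.
        parent, sep, _component = layer_id.rpartition(separator)
        if sep:
            groups[parent].append(layer_id)
    return dict(groups)


# Layer ids taken from the quoted example.
layers = ["Daraa", "Daraa:AgricultureSrf", "Daraa:CulturePnt"]
print(group_components(layers))
```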

In the 3D Container & Tiles Pilot it was one of the most discussed topics, though the approach was slightly different and I argued instead for this simple solution, which maintains compatibility with Part 2 (and Features) and which ourselves and Helyx had initially implemented (with slashes at the time).

But really what I am hoping for is that we give it a try with implementations at the upcoming sprint focusing on Common! I think it also fits nicely with the ability to qualify feature types in JSON-FG, so that you could offer both mixed feature types and individual feature types from the same API (this is also functionality we need for the Tiles API to offer both multi-layer and single layer tilesets).

I point people to issues #170 and https://docs.opengeospatial.org/per/18-045.html#hierarchical_paths_extension for additional discussions about hierarchies.

Themes is really orthogonal functionality to this. The same collections can be classified in multiple themes, whereas these proposed collection hierarchies allow for organizing a large number of collections in a hierarchical manner, as well as identifying components (e.g. feature types, components in CDB, etc.).

It was also a recommendation from past sprints that we investigate this hierarchical collections capability in an upcoming sprint, and this seems like a very good time (if there is interest from participants of course). @ghobona

pvretano commented 2 years ago

@jerstlouis I realize that it is a "super simple" extension but it is not one that is in common practice at the moment. If it becomes common practice then it should be fairly easy to whip up a Part X. I think putting it into Part 2 will detract from the "core" functionality of collections which is a flat collection. As for the CubeWerx implementation, that may be an anti-example since we use the colon as a "datastore" separator, which may not be exactly what you are talking about. We don't expect clients to parse or understand a token like "Daraa:SomeFeature"; that is an internal thing that means something to our server. I'll have to consult with @pomakis about that since he did the implementation ... but if it is an example of the concept then great! The more implementations the better ...

jerstlouis commented 2 years ago

@pvretano

We would not want to detract from the "core" functionality.

But we already have the "Simple Query" conformance class to filter /collections which I think is not "Core" functionality either.

As both of these conformance classes exist as ways to manage a large number of collections, perhaps they would fit nicely together in a separate Part? (Perhaps even moving the Records collections extension there would make sense?) We could call it "OGC API - Common - Part X: Queryable and hierarchical collections" or something like this :) Just a thought... (Again, each conformance class there would be optional.)

pvretano commented 2 years ago

@jerstlouis and you think piling on some more will make the situation better? Yes, I say we put the query parameters and the collections of collections stuff into a separate part...

jerstlouis commented 2 years ago

@pvretano Well, it's just a question of how those conformance classes should be organized. I would find it useful to have a Part for conformance classes that facilitate handling a large number of collections, but that's just my own opinion :)

About parts organization, I also still wonder how CITE will handle that, will each part get its own test suite and compliance status? That is probably my biggest concern with super simple Parts (in addition to the boilerplate / overhead, and difficulty for implementers to discover and refer to all the different specification documents).

pvretano commented 2 years ago

@jerstlouis agreed - the "large number of collections handling" stuff should be in its own Part ... not Part 2 of common which should be limited to the basic /collections behaviour. I would go so far as to say that bbox, datetime and limit should not be in Part 2 but that ship may have sailed.

As for CITE, it has nothing to do with parts; it has to do with requirements and conformance classes. The Features CITE test for example (at least the beta) checks the conformance declaration and depending on what is in there, it triggers the appropriate requirements tests. So, if your server declares http://www.opengis.net/spec/ogcapi-features-2/1.0/conf/crs (i.e. Part 2 of Features) in its conformance declaration then CITE will run all the tests that check all the requirements for that conformance class.
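The selection logic described here can be sketched roughly as follows. This is not how the actual CITE test suite is implemented: the registry and test names are hypothetical placeholders, and only the Features Part 2 CRS conformance URI comes from the comment above. The `conformsTo` member is the array of conformance class URIs a server returns in its conformance declaration.

```python
# Hypothetical registry mapping conformance class URIs to test names.
# Only the CRS URI is taken from the discussion; the test names are made up.
TESTS_BY_CONFORMANCE_CLASS = {
    "http://www.opengis.net/spec/ogcapi-features-2/1.0/conf/crs": [
        "test_crs_parameter",
        "test_bbox_crs_parameter",
    ],
}


def select_tests(conformance_declaration):
    """Return the tests triggered by the URIs listed under `conformsTo`."""
    selected = []
    for uri in conformance_declaration.get("conformsTo", []):
        # Conformance classes without registered tests trigger nothing.
        selected.extend(TESTS_BY_CONFORMANCE_CLASS.get(uri, []))
    return selected


declaration = {
    "conformsTo": [
        "http://www.opengis.net/spec/ogcapi-features-2/1.0/conf/crs",
        "http://example.com/conf/unknown",  # placeholder, no tests registered
    ]
}
print(select_tests(declaration))
```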

I would be surprised if there was a specific CITE test for Common because common is something that is referenced by other specifications like Features, Records, etc. and testing common's requirements will be part of testing conformance to the containing specification. So if your server conforms to Features core then it also conforms to common since common is a prerequisite for Features core (well not right at the moment but some later version of Features will reference Common). Put another way, you cannot get your server "certified" as compliant to common. You get your server "certified" as compliant to Features Core or Records Core, and common is one set of requirements tests that your server has to pass to get that certification.

jerstlouis commented 2 years ago

@pvretano

I would go so far as to say that bbox, datetime and limit should not be in Part 2 but that ship may have sailed.

That is the Part 2 "Simple Query" conformance class that I was referring to, that I also think could go in there.

Thanks for the CITE clarifications, that makes sense. So if I understand correctly, there will be only one CITE test (testing all parts with tests implemented so far) / single certification fee to conform to Features and all its parts.

So any new "Common" part that comes up could automatically be picked up by the other specs to test and report conformance to those additional requirements (e.g. simple query at /collections, Records queries at /collections, hierarchical collections...), if they apply based on existing dependencies (e.g. dependency on Common collections).

Cool!

pvretano commented 2 years ago

@jerstlouis yikes! I am not sure about fees or anything like that. You will need to ping OGC for that information.

ghobona commented 2 years ago

We use a single Executable Test Suite (ETS) to check for compliance to conformance classes from both Part 1 and Part 2.

Compliance certification for Part 2 is charged separately from Part 1. That is, there is a separate fee.

jerstlouis commented 2 years ago

Thanks @ghobona . So @pvretano my concern for simple Parts is back :)

pvretano commented 2 years ago

@jerstlouis so you are basing your concerns on economic considerations? I think that the Parts of common should reflect what the state of common practice is at the time the specifications are written. Of course, writing specifications takes time so there would be some drift, but generally the common practice is clear. For Part 2 of common, for example, the use of /collections is a well established practice. The use of bbox/datetime/limit at the /collections endpoint and the "collections of collections" concept are not common practice at the moment so I would not include them in Part 2.

jerstlouis commented 2 years ago

@pvretano I agree that in order to publish Part 2 sooner while reflecting common practice, offloading the Simple Query and Hierarchical Collections conformance classes to another part makes sense.

I think the economic considerations are valid for vendors (especially smaller companies) wishing to implement and be certified for as much as possible of the OGC API specifications, if several (tens? hundreds?) parts get published for single small conformance classes.

Also, "Simple Query" (bbox/datetime/limit on /collections), hierarchical collections, and records queries at /collections, would all work with many OGC API specifications, without them having to reference those. So I also wonder how this would be handled in CITE -- they would not have a corresponding Part to be certified for.

We're probably way off-topic here (sorry) and it's something to bring up with the CITE SC, but I feel that we should take these things into consideration when deciding where common conformance classes like Hierarchical Collections and Simple Query should end up.

Hence why I suggested OGC API - Common - Part X: Queryable and hierarchical collections, which would provide a CITE Test target (independent of the spec it is used with), and would also partially address economic concerns (e.g. grouping 3 conformance classes into 1 part to be certified for, and also significantly fewer than if Features, Maps, Tiles, Coverages, EDR each had to explicitly define a new part to support Hierarchical Collections, Simple Query, or Records Search @ /collections).

cmheazel commented 2 years ago

@pvretano So you are saying that we cannot filter the collections to be included in a /collections response? That I cannot say "give me the collections that fall within this bounding box"? That IS common practice, especially for imagery. Remember, an image is a coverage which is a collection.

pvretano commented 2 years ago

@cmheazel I am saying that should not be in Part 2. The "core" of the /collections endpoint should, more-or-less, match the functionality currently offered by features.

jerstlouis commented 2 years ago

@cmheazel I think it is super important to clearly distinguish between querying on "a" collection vs. querying on a "set of" collections.

The Simple Query Conformance class is useful to query the "set of" collections.
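To make the "set of" case concrete, here is a minimal sketch of a bbox filter applied to a list of collection descriptions rather than to the data of one collection. It assumes each description carries its extent as `extent.spatial.bbox` (an array of `[minx, miny, maxx, maxy]` boxes, as in OGC API - Features collection descriptions), ignores 3D boxes and antimeridian crossing, and uses hypothetical function names and sample collections.

```python
def bboxes_intersect(a, b):
    """True when two [minx, miny, maxx, maxy] boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_collections(collections, bbox):
    """Keep the collection descriptions whose extent intersects `bbox`."""
    kept = []
    for description in collections:
        boxes = description.get("extent", {}).get("spatial", {}).get("bbox", [])
        if any(bboxes_intersect(box, bbox) for box in boxes):
            kept.append(description)
    return kept


# Hypothetical collection descriptions, for illustration only.
collections = [
    {"id": "scene-a", "extent": {"spatial": {"bbox": [[0, 0, 10, 10]]}}},
    {"id": "scene-b", "extent": {"spatial": {"bbox": [[50, 50, 60, 60]]}}},
    {"id": "no-extent"},
]
print([c["id"] for c in filter_collections(collections, [5, 5, 20, 20])])
```

Whether collections without a declared extent should be dropped, as they are here, or always returned is a design choice this sketch does not settle.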

In this way, it serves a similar purpose to Records - Part 2 collections extension, and to some extent the proposed Hierarchical Collection as well, where querying a particular parent could only return that parent's immediate children to drill down to the collections of interest.

In terms of OGC API implementations so far, I agree with @pvretano in that I don't think bbox and dateTime at the /collections level has been widely implemented yet.

For EO imagery, that is an interesting topic, because on one hand it is useful to present the whole collection of images as a single coverage, but on the other it is useful to "discover" scenes of interest. For this I have been proposing the concept of a Scenes API expanding upon the Testbed-15 Images API concept, and this could be implemented either as hierarchical collections, or as a /scenes/ sub-resource of the collection (and either endpoint could support the Simple Query parameters bbox, dateTime...).

cmheazel commented 2 years ago

@jerstlouis Most OGC API implementations so far have been based on API-Features. Since Features does not support filtering on the /collections path, it is no surprise that it has not been implemented in OGC APIs. However, this is a very common feature of existing systems. Simple Query is simple, optional, and captures common existing capabilities. I'm convinced that it will be used. And if not, it does no harm.

cmheazel commented 2 years ago

@pvretano The functionality provided by API-Features addresses the needs of the Features community. That is not the universe. In my world the Simple Query functionality is ubiquitous. Part 2 is useless without it.

jerstlouis commented 2 years ago

@cmheazel

However, this is a very common feature of existing systems. Simple Query is simple, optional, and captures common existing capabilities. I'm convinced that it will be used. And if not, it does no harm.

I am in full agreement, but I would make the same argument about Hierarchical Collections :)

Part 2 is useless without it.

I would disagree with that though. Part 2 is already very useful with just the Core conformance class, in that clients and servers can implement this /collections and /collections/{collectionId} functionality that already works the same across Features, Coverages, Maps, Tiles and EDR.

But additional conformance classes like Simple Query, Hierarchical Collections and Records Queries (currently Records Part 2-collections extension) make it more useful, especially for a large number of collections.

Where those conformance classes should get defined should consider the overall implications for OGC in terms of drafting the standards, managing them, writing tests for them, the implementations getting certified for them, and procuring them, as discussed above.

jerstlouis commented 11 months ago

At the OGC API - Common session of the 127th Members Meeting in Singapore we briefly discussed this topic and there was no outspoken objection to draft and review an optional "Hierarchical collections" requirements class for Part 2 adding which would:

This would also replace capabilities that were specifically included in the 3D GeoVolumes spec ( https://github.com/opengeospatial/ogcapi-3d-geovolumes/issues/5 and https://github.com/opengeospatial/ogcapi-3d-geovolumes/issues/12 ).