Closed cportele closed 4 years ago
The idea behind the "Collections" conformance class in Common was that of an abstract data resource, which could have one or more representations. That abstract data resource provides an attachment point for other modular resources (e.g. tiles, maps) to bind to.
The general nature of the term 'collection' is the source of the confusion in #47, because we specifically are looking for a resource type identifying "geospatial data". We proposed a number of alternate terms: /geodata /data /datasets, but Features has already standardized /collections.
About the hierarchy of dataset > collection > feature, this assumes a micro-service distributing a single dataset.
Several organizations have already mentioned their interest in serving multiple datasets from a single service end-point. This was a common use case of the classic WxS services. I strongly disagree that the single dataset use case is a majority use case.
How can we resolve this? A hierarchical data conformance class would be one way...
I really like the breakdown of resources in OGC API - Common. Very well written and powerful concepts.
/ /api /conformance sections are spot on. This approach will save major time and money in architectures and implementations.
Then, collections complicate things. Perhaps we should just remove the section from Part 1 of Common and keep it simple.
Best Regards, Jeff
Could we make some of the URI path optional?
/[collections/{collectionId}/]items/{featureId}
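As a minimal sketch of what an optional path prefix could mean for a client or router, the template above can be written as a regular expression in which the `collections/{collectionId}/` part may be absent. The pattern, the function name `parse_item_path` and the example identifiers are assumptions for illustration only; nothing here is taken from the Features or Common drafts.

```python
import re

# Hypothetical pattern: the "collections/{collectionId}/" prefix is optional,
# mirroring the template /[collections/{collectionId}/]items/{featureId}.
ITEM_PATH = re.compile(
    r"^/(?:collections/(?P<collection_id>[^/]+)/)?items/(?P<feature_id>[^/]+)$"
)


def parse_item_path(path):
    """Return (collection_id, feature_id), or None if the path does not match.

    collection_id is None when the optional prefix is left out.
    """
    match = ITEM_PATH.match(path)
    if match is None:
        return None
    return match.group("collection_id"), match.group("feature_id")
```

With this sketch, both `/collections/roads/items/42` and `/items/42` resolve to a feature, and only the first carries a collection identifier.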
@dr-shorthair I strongly feel that if that portion was left out of the required core 'Features' conformance classes, it would instantly solve all those problems, the arguments would stop, and the confusion would vanish!!
It would allow for the hierarchical catalog listing & search module we envision to point directly to such an 'items' resource, with no prerequisite as to what comes before it, and at the same time it would make that core conformance class compatible with a simple GeoJSON file hosted anywhere on a static bucket (assuming the /{featureID} is also a separate conformance class).
The information provided by /collections/{collectionID} would be a separate conformance class, which we would not have to use, and could be available elsewhere from an equivalent conformance class presenting that information in a generalized way which suits more than only features.
Also see discussion at #111 and https://github.com/opengeospatial/Environmental-Data-Retrieval-API/issues/24
The main point of the comment was that if Common really wants to discuss data offerings at this stage, it should start with a "dataset" concept (not "collection"), since this is the widely used concept that we need to tie to. See the Data on the Web Best Practices, DCAT, schema.org, ISO 19115, Google Dataset Search, etc.
How the dataset is organized will vary, depending on the nature of the data among other aspects.
@cportele The problem is that more often than not, a vector dataset is a single collection, or can be broken up into sub datasets made up of single collections (e.g. an individual GeoJSON or Shapefile), and we would like to handle such vector datasets just like raster ones, or 3D ones. The Features requirement to have '/collections/{collectionID}' really is problematic in that sense.
Maybe this just isn't Common. Instead of trying to force Features to fit into Common, maybe just let each OGC API define how it wants to work.
@bradh What exactly isn't common? We're trying to standardize a consistent approach to the common aspect of geospatial data, regardless of its available representation(s).
Geospatial data all have concepts of identifier, title, keywords/tags, geospatial / temporal extents, resolution / being appropriate for a particular scale, and fuller metadata profiles. You can search and filter these datasets based on that information. And geospatial data of multiple types is something that geospatial processes can act upon, and/or return, and something you can render as layers on a map. Geospatial data is also something you can spatially partition using tiles, DGGS, bounding boxes. This is all common.
If we're not doing that, we are not achieving this goal of an integrated suite of consistent specifications for the OGC API, we are heading to the same old classic services with a REST/OpenAPI twist, and personally I think the value of that is far less than achieving that particular goal (and the only reason we invested a lot of efforts into it over the last couple years).
I worry we're compromising the goal of usability while looking for architectural commonality. Features is working for people.
Anything that sounds like "Features needs to change to fit into the Common vision" should be ringing alarm bells.
@bradh WxS has been working for people for 20 years. The new OGC API shift is a chance to get some things right. The Features SWG itself expressed from the beginning a desire for a more consistent approach across the OGC API, and the concept of a set of modular building blocks was welcomed by the TC in Singapore.
The alarm bell has been ringing deafeningly loud in my ears since Features was undergoing standardization before any Common specification was drafted, or even concepts other than "OpenAPI" agreed to.
We implemented Features in good faith. Common doesn't get to break it.
If there are really common aspects with other OGC APIs, codify them in Common. If they aren't common, don't try to make them so to fit an architectural vision.
Stop thinking all-encompassing superset, and do a common subset. Common should be a shared base class where everything that is true about Common is true about every OGC API implementation.
@bradh it's in that "we need to remain compatible with Features" mindset that Common adopted the "Collection", and its collection resource, as this abstract concept of a Geospatial Data Layer.
And Coverages also accepted to use it in that way as well.
Now some folks are finding Collections used everywhere confusing.
We want to avoid any breakage, but if minimal breakage in the form of a minor reorganization of conformance classes (e.g. splitting them and making some optional) could help harmonize with the architectural vision of common with minimal impact to existing implementations, and the Features SWG agrees to such changes for the next revision, I would imagine this should not be entirely out of the question?
We are trying to find a proper solution. Commonality being only having a landing page, conformance classes and an API definition is not a proper solution. That is not an integrated or modular OGC API.
I would be a lot happier with a small, common Common; compared to broken implementations.
I don't care if the conformance classes change. I do care if things that were standardised in Features get broken in a user-visible way. I agree architectural vision is nice to have, but I don't agree it is a valid reason to break running code.
So in my opinion, if Collections no longer makes sense for Common, don't force it in.
Enormous amounts of time and energy are being spent on 'collection confusion' with little gain in interoperability.
At this point it would be better to remove the section from Common, in my opinion.
Best Regards, Jeff
@jeffharrison I firmly believe there is an enormous amount of interoperability to be gained from a consistent, modular and integrated set of OGC API specifications, working the same way with multiple types of representations.
Even if we remove the Collections section because the Collection confusion is deemed too great, Common still needs to define how modular building blocks work the same with different data representations, and we need to start over, and no longer have that option which was compatible with Features, so we need to figure out an alternative which works for everyone.
A consistent and modular OGC API re-using common elements is not all about OpenAPI!
Maybe this can be solved by not assuming all profiling of Common has to be done as a depth=2 hierarchy - why not have a profile of OGC-Common - say OGC-DatasetAccess that is further profiled by Features, Coverages etc - anything that has a common notion of a dataset (testable by common identifiers of instances in practice). I'd look to align that notion with GeoDCAT and make Catalogs fit it too.
You would also create a profile of GeoDCAT to standardise how OGC-API services are described as DCAT distributions - and that could be the basis of OGC-Catalogs.
FWIW - wearing my OGC hat I will be working with other OGC staff to develop a full graph of specification inter-relationships with particular attention to profile relationships - I will publish this on the OGC Definition server using the Profiles Vocabulary ( https://www.w3.org/TR/dx-prof/ ) - and I guess I'll develop a profile of prof to describe the set of resources available for each OGC-API profile. (It's turtles all the way down, but patterns for incrementally narrowing choices to develop consistency without overspecifying up front kind of look this way.)
Having worked on OGC standards since 1998-99 when we sponsored the first Web Mapping Testbed, I've noticed OGC has a slight tendency towards more complexity. OGC API is a refreshing break.
As I've said before, I like the breakdown of resources in OGC API - Common. Very well written and powerful concepts.
/ /api /conformance sections are spot on. This approach will save major time and money in architectures and implementations.
Then, collections complicate things. We should just remove the section from Part 1 of Common and keep it simple.
Then document Extensions from the 'bottom up' as they come online and are tested and proven (paraphrasing Clemens).
Best Regards, Jeff
Thinking about this some more, it may be that "Collections" is less the problem than "Collection". There is a general and worthwhile resource model pattern of referencing (and describing) sets of items such as features, then referencing members of each set. For some APIs, though, "Collection" refers to a clear set of resources, while for others the structure just isn't fit well by "Collection". So the term "Collection" becomes meaningless and a client has to figure out how to request and process representations of the particular resources that it refers to in a particular API.
What "Collections" really invokes is a catalog of holdings. Would almost have preferred that /collections be a metadata resource much like /api or /conformance. As a top-level metadata resource in the service, it really adds no information to the URL path for any of the specific resources being provisioned through the API, e.g.
/api
/conformance
/resources (describes the types and organization of provisioned resources)
/resourceID-1 (which might be a homogeneous collection of features or a collection hierarchy of 3D data containers or a heterogeneous collection of sample resources, or not really a collection at all. Typenames defined in /resources)
...
/resourceID-n
This can still be largely compatible with OGC API - Feature if the /resources document defines the resource structure as /Collections/{CollectionID}/.... this could even be the default in the absence of /resources , but this might be a more meaningful construction of URL's than keeping /Collections in Common.
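One way to picture the fallback described here is a client that uses the structure declared in a /resources document when one exists, and otherwise assumes the Features-style layout. This is only a rough sketch: the `pathTemplate` key, the `{id}` placeholder and both function names are invented for illustration and do not appear in any OGC draft.

```python
# Hypothetical default, used when a service publishes no /resources document.
DEFAULT_TEMPLATE = "/collections/{id}"


def resource_template(resources_doc=None):
    """Pick the path template a client should use to reach data resources."""
    if resources_doc is None:
        return DEFAULT_TEMPLATE
    # "pathTemplate" is an invented key, not part of any OGC draft.
    return resources_doc.get("pathTemplate", DEFAULT_TEMPLATE)


def resource_path(resource_id, resources_doc=None):
    """Build the path for one resource from the chosen template."""
    return resource_template(resources_doc).format(id=resource_id)
```

Under these assumptions, a service with no /resources document keeps the existing `/collections/{collectionId}` paths, while a service that declares `/{id}` exposes its resources directly below the root.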
Even if we remove the Collections section because the Collection confusion is deemed too great, Common still needs to define how modular building blocks work the same with different data representations, and we need to start over, and no longer have that option which was compatible with Features, so we need to figure out an alternative which works for everyone.
I reject the premise. Common doesn't need to do that. Common could just be common base class, and modularity could be built out from this small modular base. End users could use the parts they want, without needing to understand the master architecture. It works for the way we bring stuff together for the rest of the web.
@bradh as I described above, if we want some modules to behave consistently when interfacing with other modules, they need to implement some common interfaces which needs to be defined. This is what we are trying to do here in Common, much in line with what @rob-metalinkage was suggesting above.
"above" was actually here : https://github.com/opengeospatial/oapi_common/issues/99#issuecomment-600542114
Tempted to grab a copy of 'Common' and delete every instance of 'collections' then post it to Pending.
Would the geospatial world end? Probably not...
Regards, Jeff
I also vote take it out and bring it back if and when coverages and other specs find a need to use it. I found this comment by @jyutzler https://github.com/opengeospatial/oapi_common/issues/47#issuecomment-598734879 to be helpful.
Maybe I'm misconstruing it a bit, but having drastically different kinds of things (collections of features vs collections of coverages) under the same url path is just complicated and confusing. We don't need to do that.
Lots of hot air spread around on this issue but the #47 thread is worth reviewing if you haven't.
@dblodgett-usgs OGC API - Coverages are currently using Collections, in the way it is specified in Common right now, and one collection = one coverage, not a "collection of coverages". You can think of it as a collection of data cells making up one coverage.
Maybe I'm misconstruing it a bit, but having drastically different kinds of things (collections of features vs collections of coverages) under the same url path is just complicated and confusing. We don't need to do that.
I disagree in the strongest terms. See for example the Natural Earth dataset we serve at:
http://maps.ecere.com/geoapi/collections/NaturalEarth
It is simple and convenient. We are serving this one dataset made up of both raster & vector layers. Why there are such strong objections to being able to achieve such a simple and useful thing baffles me to death.
@jeffharrison This intent on trying to convince everyone that "Collections are confusing" and "We only need OpenAPI stuff in Common" is literally severely undermining two years of efforts by myself and Joan and Chuck and others at trying to bring consistency and proper integration within the OGC API, and achieving consensus, even including OGC API - Coverage SWG. See issue #62 as an example of people thinking this is useful. This is what Common is about. As I said in #111, yes the term collection is somewhat confusing (at least when you first encounter it), but it was just as confusing to me even for collections of "vector features". Agreed, the Collections section probably needs a major overhaul. Perhaps the whole /collections/{collectionID} in Common needs to be abandoned in favor of something else. But we still need to achieve these goals. Let's please have a constructive conversation on "how" we can do this.
Jerome,
Thank you for your post, and your efforts. Yes, there is confusion over collections, and discussions are happening in many forums.
I'm not recommending that we only need OpenAPI in Common.
I'm recommending we keep the Core of OGC API - Common simple. Then we have Extensions to the Core of OGC API - Common for different resources as they are proven in implementation.
As we've discussed, the description of resources in OGC API - Common is well written and powerful.
/ /api /conformance sections are great, and the approach will save time and money in architectures and implementations.
These sections are already done and should constitute a simple Core of OGC API - Common.
Then we document Extensions from the 'bottom up' as they come online and are tested and proven. Many of these are already done as well, so this will happen quickly.
And yes, we can have a discussion on what you propose as part of OGC API - Common Extensions, maybe it should be the first one.
Best Regards, Jeff
It is dangerous to say:
is literally severely undermining two years of efforts by myself and Joan and Chuck
about a review process. This (#47 , #99, #111) has clearly caught the attention of the community and requires thorough discussion given its centrality to the future of OGC APIs.
I'm going to hold off on further comment on this issue.
@dblodgett-usgs Sorry. I am just afraid the other side of the story (e.g. #62) is getting drowned out, and there have been repeated suggestions throughout these issues (twice in this one) to thrash the whole integrated & modular approach, without suggesting an alternative on how it could work better. It does require thorough discussion.
@jeffharrison Thank you for that post.
I'm recommending we keep the Core of OGC API - Common simple. Then we have Extensions to the Core of OGC API - Common for different resources as they are proven in implementation.
I thought this was already the case. OGC API - Common (Collections) was an extension or part, or at least separate conformance classes. And OGC API - Coverages are also currently using that as well in implementations. There are even already implementations of this with 3D data in the 3D Container & Tiles pilot.
And yes, we can have a discussion on what you propose as part of OGC API - Common Extensions, maybe it should be the first one.
Great. Agreed. Isn't this what we're discussing here in this issue? What form these conformance classes for enabling modular and integrated specifications to inter-operate with geospatial data should take? Rather than suggesting to thrash the whole thing, let's try to figure out what needs to change or what alternatives could be considered.
Best regards, -Jerome
Jerome,
I'm not suggesting to 'thrash the whole thing'. Never said that.
I'm going to hold off on further comment on this issue.
Best Regards, Jeff
@jeffharrison
At this point it would be better to remove the section from Common, in my opinion.
Tempted to grab a copy of 'Common' and delete every instance of 'collections' then post it to Pending.
We should just remove the section from Part 1 of Common and keep it simple.
Sorry, I might have over-reacted if you just meant to move it to another part, and I apologize.
There is no mention of parts in the current version of 19-072. I don't recall whether the Collections section was going to be Part 2 (Collections), but I believe that was discussed at one point. But I don't think the parts organization matters that much, the conformance classes being optional. Dataset access and cataloging might be a better name for that Part 2.
Thank you.
Best regards, -Jerome
Collections is now a separate standard. It includes definitions for "collection" (from Websters), "Dataset" (from DCAT), and "Distribution" (from DCAT). This should be consistent with API-Features and extensible to other resource types. Recommend that we close this issue. Further discussion should be under a new issue based on the updated drafts.
Does the Collections standard not only define dataset, but also state clearly the relationship between the resources and a dataset? If that is not the case (e.g., one API per dataset), the issue shouldn't be closed.
@cportele -- I posed that specific question in #120 -- There's an opportunity to combine issues if that's the issue that's left.
I think we have bottomed this out over in https://github.com/opengeospatial/oapi_common/issues/140#issuecomment-637664012
However, I want to propose that this issue -- specifically @cportele's intro to it -- be used as a guide for modifications to the Part 2 specification document. Or at least, we don't close this until we've talked through and created new issues for what needs to be changed. Specifically, I think the notion of "distribution" needs to be looked at very carefully and this issue calls attention to that effectively.
@dblodgett-usgs I think this issue is more linked to #11, which we have agreed to leave aside until the simple question of how a collection is defined, and the basic flat list of collections in a single dataset, is resolved in #140 (I am very glad we might be there now!).
I think both hierarchical datasets and hierarchical collections are desired by many. One question is whether both concepts could or should be addressed with the same re-usable building block. (e.g. I had proposed as a potential solution a "this level constitutes a dataset" marker).
There are also cases where you may want to actually distribute a feature collection or coverage which combines individual feature collections and coverages, both as individual collections and as a meta-collection. There is a strong use case for this with multi-layer vector tiles, as Mapbox Vector Tiles are most often used, but it could also apply to an overall /items (e.g. serving a GeoJSON with multiple feature collections with qualified IDs like "AgricultureSrf.1", "TransportationGround.1" differentiating the data layers, as used in multi-layer tiles in GeoServer). See also this weather example I mentioned in this sensors discussion ( https://github.com/opengeospatial/oapi_common/issues/147#issuecomment-637840464 ). People are asking for collections of coverages ( https://github.com/opengeospatial/ogc_api_coverages/issues/8 ). Organization of partial 3D datasets by state, city, neighborhood is also a common use case. Thematic organization of feature collections is another.
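The qualified-ID idea mentioned above can be sketched as follows: a client receiving one combined /items response splits each ID at its last dot to recover the data layer. The function names and the assumption that the layer name is everything before the last dot are illustrative guesses, not a description of how GeoServer or any specification defines these IDs.

```python
from collections import defaultdict


def split_qualified_id(qualified_id):
    """Split an ID such as "AgricultureSrf.1" into (layer, local_id).

    Assumes the layer name is everything before the last dot.
    """
    layer, _, local_id = qualified_id.rpartition(".")
    if not layer:
        raise ValueError(f"ID is not layer-qualified: {qualified_id!r}")
    return layer, local_id


def group_by_layer(features):
    """Group GeoJSON-like feature dicts by the layer prefix of their "id"."""
    layers = defaultdict(list)
    for feature in features:
        layer, _ = split_qualified_id(feature["id"])
        layers[layer].append(feature)
    return dict(layers)
```

A meta-collection served this way stays a single response, while the grouping recovers the individual feature collections on the client side.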
No disagreement here. Maybe the best would be to close here once we've opened some new issue(s) that reflects the status quo and presents the actual issue at hand with greater clarity? That's more or less what I was suggesting re: distribution and hierarchical collections is no different.
The SWG agrees that this issue was addressed by PR 149 and can be closed: Moved: @cportele Second: @jeffharrison NOTUC
This issue has been created as part of the public review and is based on document 19-072.
Dataset and distribution are defined as terms in clause 5, but their relationship to the resource types specified by Common is not discussed. "dataset" is still mentioned in several places, but these seem to be leftovers from the old Features draft that was the basis for Common. At least the references to "dataset" are unclear. "Distribution" is used as "distribution of a resource" which is in conflict with the definition of the term in 5.4, where distribution is specific to datasets.
I also think that this fuzziness plays a role in the confusion about the "collection" resources in Clause 9 (there are several issues about this already). In Features there is a clear hierarchy of dataset > collection > feature, but the collection in Common is left "hanging in the air".
It is also not clear why geospatial data has to be organized as collections. Are we sure this is always appropriate? Ongoing discussions like in #47 strongly indicate that this is not the case.
In addition, we can think of resources that are typically not considered as "geospatial data" that could use the collection resources, which is also recognized in 8.4 and 9.3, but in a restricted sense. Records that are often/sometimes not really "geospatial data" in the current sense of the draft (e.g., an entry in a code list or a service metadata record) are neither "Spatial Resources" nor "Information Resources".
Either all discussions of data, datasets and collections should be removed from Part 1 of Common or the requirements class "Collections" should be redesigned addressing all these topics.