opengeospatial / conceptual-modeling-group

Coverage modelling and schema harmonization #2

Open jyutzler opened 2 years ago

jyutzler commented 2 years ago

This topic is pulled from an email discussion on the Architecture DWG mailing list.

From Emmanuel Devys: In the near future, there should (IMHO) also be something about Coverage modelling and schema harmonization, with the emergence of the CovJSON schema under the OGC. Maybe this could be handled under the Coverages.SWG, but this has not been confirmed to my knowledge. And I guess Coverage aficionados / users / implementers need some clarity or guidance on this: to be more precise, how to shift / convert between CIS schema (whether in GML or JSON) and CovJSON, or how to encapsulate CovJSON under CIS (if the assumption that CovJSON is an encoding – like netCDF or GeoTIFF – is correct).

jyutzler commented 2 years ago

From Kathi Schleidt: To the Coverage muddle, very good point. I've pointed out in one of the coverage sessions that, to my understanding, CovJSON could easily be aligned with 19123, at least on the conceptual level; how easy or hard it is to align the encodings (that is, aligning CovJSON with CIS) would follow from that base alignment. From my understanding, the current sticking points are that:

  1. CovJSON is only available as YAML encoded JSON Schema snippets so difficult to align (if we had a UML version of CovJSON, would be far easier)
  2. Nobody from the CovJSON community is willing to engage
  3. Nobody from the Coverage SWG has the resources to do this work for them

If you see potential towards fixing these issues, I'm quite happy to use the conceptual modelling group as neutral ground to sort out this issue.

jyutzler commented 2 years ago

From Emmanuel Devys: Hi Kathi, thanks for the feedback. I promise to try and engage in the conceptual modelling group as far as possible (before retiring – end 2021). For CovJSON and the Coverage harmonisation (or muddle), it seems there is an on-going ballot for a standardisation work item submitted by UK Met Office, with 6 supporting members (including UK Met). Though it is not clear in the justification document (21-040r3), it seems that the aim is a work item for finalizing the CovJSON Community standard (I can’t imagine the spec. - in draft 0.2 status, dated somewhere in 2016 – being published as is by OGC, as there are some obvious editorial issues). The nice thing, though, is that it seems to work and be used. Therefore I do hope that the CovJSON community is willing to engage in this effort (which is announced to be < 6 months) – for achieving an OGC publishable document.

For the harmonization effort, I guess the effort should be done / shared both by CovJSON community and Coverages SWG. IMHO such an action should be a condition for accepting this proposed CovJSON standardisation work item as an OGC Coverage-related standard. Hopefully the OGC as a standardisation body will ensure such an action is on the POW, and the conceptual modelling group may serve as a neutral ground coordinating the CovJSON community and Coverages SWG and sorting this issue. Clearly, if none of these 2 communities is engaging, it will be really problematic. The need of some reverse-engineered CovJSON UML model should not be a showstopper, I guess that in the worst case a group such as the conceptual modelling group may reverse-engineer the CovJSON schema and submit the result for validation to the CovJSON community.

What scares me in the current context (and on the basis of the justification document for CovJSON) is that this harmonisation action is just a hypothesis, and if no action is taken by the CovJSON community and Coverages SWG, we’ll have at least 2 different Coverage JSON schemas in the OGC, plus the GML encoding of CIS, and no guidance for ensuring interoperability / mapping …

I apologize that I have no solution to recommend. I would only strongly recommend – if I may - OGC Coverage aficionados / users / community to push in order to mandate such a coordinated action (with CovJSON community or submitters), and to stay engaged / review.

Confident that such a relatively easy effort may be undertaken

jyutzler commented 2 years ago

From Peter Baumann: Hi Emmanuel,

I share your concerns.

Again and again I have emphasized the danger in having two different, incompatible definitions for the same term, coverage. Imagine we now overlay the terms feature, sensor, ...or OGC ;-)

The CovJSON vote attempts to freeze it so that no harmonization with the OGC coverage model is possible any longer.

And indeed, the CovJSON community has shown zero active interest in looking at a harmonization with the OGC Coverage ecosystem, which has an elaborate framework for modeling coverages, encoding them in a variety of formats, and various APIs for services (WCS, OAPI-Coverages). In contrast, CovJSON is simply a JSON encoding (which exists already with OGC Coverages) plus a rather generic EDR with little functionality.

Sad but true, we are on the verge of getting a standard fundamentally incompatible with mainstream standardization in OGC, ISO, and INSPIRE.

no cheers, Peter

jyutzler commented 2 years ago

From Chris Little: Dear Modelling colleagues,

I disagree strongly with some of the statements below. CoverageJSON enthusiasts are willing to engage – there is a current TC e-vote to initiate this work. Please vote! Either way.

The business justification proposes a plan to undertake the harmonisation work, if feasible, and develop a roadmap for CoverageJSON. This is all proposed inside OGC, and hopefully in the Coverages SWG. Outside of OGC, there is no leverage to get the CoverageJSON supporters to do the work, but inside OGC offers them something.

If the modelling and coverage experts wish to start sooner than the end of the vote, I am happy to engage and help with the work, but was planning to wait until the vote actually finished. I have now set up a session (ad hoc) at the next OGC Virtual TC to take this forward. But the first steps will have to be to get CoverageJSON ‘owned’ by OGC.

jyutzler commented 2 years ago

From Peter Baumann:

Hi Chris & all,

with all due respect, this argumentation is turning the situation around: any spec needs first to be harmonized and then considered for adoption. That adoption perspective should be incentive enough for the CovJSON supporters to contribute actively, I am still hoping for that to happen! I will personally make sure they get ample room for discussion in meetings, etc. That has always been (good) habit in OGC.

jyutzler commented 2 years ago

From Scott Simmons: All,

Please do consider the OGC Innovation Statement approved by the Planning Committee back in 2014 that encourages OGC to address the Innovator’s Dilemma by considering multiple ways to solve a problem AND to work toward harmonization. So long as we have those objectives in mind, I think that the sequencing of work is less critical.

I’ve pasted the statement below.

Best Regards, Scott

In order to simplify technical complexity and reduce implementation costs, the OGC strives to ensure harmonization within the OGC standards baseline. In an unchanging world harmonization would be easy. However, given the realities of the diversity that comes about due to changing technology and markets, OGC must address the innovator’s dilemma of maintaining the current OGC standards baseline while simultaneously developing standards to support evolving and potentially disruptive technologies, community needs and market trends. The OGC must balance maintenance, adaptation and evolution of its standards and associated best practices in order to address technology change, market change, and the complexity of collaboration between different communities.

To support this challenging environment, OGC:

  • Will encourage harmonization of its standards
  • Will extend or adapt its present standards baseline, or work with its partners to adapt or extend their standards
  • May advance new standards that overlap with or diverge from existing standards, along with guidance regarding how to evaluate and select among these options.
  • May develop harmonization techniques such as bridging, brokers, or facades to achieve interoperability within and across communities of interest.
  • Will foster an environment that encourages fair consideration of all submissions.

jyutzler commented 2 years ago

From Kathi Schleidt: Dear Scott,

Many thanks for bringing us back to the roots, "OGC strives to ensure harmonization within the OGC standards baseline."

It’s clear that in order to keep pace with this rapidly changing world, we must remain open to new ideas and be willing to adopt new approaches. However, the faster we get, the less time we have to investigate what's already been done. We're all aware just how painful it can be to understand a standard, apply and extend it to one’s own requirements – it’s so much easier to just freestyle it to what one needs now, which is how standards like CovJSON were born.

Brings me to Point 1: it seems OGC is missing some essential form of outreach, providing support to SWGs both in understanding existing standards, as well as working with existing modelling tools and models.

To your statement "that the sequencing of work is less critical", I fear a lot of us have been burned by the EDR experience. When one tried to engage in the process, one was not welcome (statements on EDR claiming to be "data model agnostic", disregarding the fact that JSON Schema Snippets are actually sloppy data models). When one made statements after the process, one was told that it was too late (we've already printed the flyers). As CovJSON seems to be pushed by the same community, there's a worry that the process will be similar :(

Brings me to Point 2: In order to foster innovation while maintaining harmonization with the standards baseline, I believe it would be most valuable for groups proposing new standards to first identify how they relate to which existing standards.

To the statements made that "CoverageJSON enthusiasts are willing to engage", while I can only speak from my own experience, I fear I must contradict a bit. We've tried several times, and admittedly some of the sticking points came from OGC side, e.g., getting the UML models for 19123 not yet introduced to the ISO stack available as EA files, now done. Unfortunately, nobody within the relevant SWGs also has the time to analyze and digest CovJSON, ideally provide a UML representation for comparison purposes. The offer to work on alignment once we have versions we can work with has been on the table for a while now, still waiting. The true pity is that a comparison seems to have been done between CovJSON and CIS, but not made available.

In conclusion, to my view, the real power of OGC is allowing us to stand on the shoulders of giants. However, it's a pretty steep crawl until you reach the shoulders, so maybe we could a) make this crawl a bit easier while b) pointing out to those who don't want to bother with the effort of the climb that maybe they're missing some perspective from where they’re standing on the ground ;)

:)

jyutzler commented 2 years ago

From Rob Atkinson: Having been concerned with the potential for divergence of approach since the early DTD vs XSD wars I sympathise and understand the experience.

Without having a perfect solution or means to create it overnight, the work we are doing looking at publishing more details around specification inter-relationships as well as identifiers and semantics of model elements is the best "building block" option we have been able to pursue - process, tooling and support to address the business problem will need further commitment by all.

The point I have been trying to make is that alignments and relationships are not natural to UML at all - it requires you to eat the whole elephant rather than just reference your friend's elephant by name - and finding and assimilating different nuanced versions of UML is a huge barrier. We may be able to port everything to a well governed registry of normalised UML - but that's a very significant task before we can get started. This is why I feel we can find a lighter weight approach by mapping all the bits to URIs and then making statements about how they relate in a "natural" language for relationships. This doesn't preclude well-governed islands of similar UML - but it doesn't require them to be assimilated into a master model before such statements can be made. It also allows for light weight access using tuned APIs to views of this for integration into other processes - such as Asciidoc based specification writers etc.
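
A minimal sketch of what "mapping the bits to URIs and then making statements about how they relate" could look like, written in Python. Everything specific here is an assumption made for illustration: the two `example.org` namespaces are placeholders rather than registered OGC identifiers, and the pairings between CovJSON and CIS terms are guesses, not an agreed alignment. Only the SKOS mapping properties (`closeMatch`, `relatedMatch`) are real vocabulary terms.

```python
# Hypothetical namespaces: neither base URI is an official OGC identifier.
COVJSON = "https://example.org/covjson/"
CIS = "https://example.org/cis/"
# SKOS is a real W3C vocabulary that defines mapping properties.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Each statement is a (subject, predicate, object) triple of URIs.
# The pairings are illustrative guesses, not an agreed alignment.
statements = [
    (COVJSON + "domain", SKOS + "closeMatch", CIS + "DomainSet"),
    (COVJSON + "ranges", SKOS + "closeMatch", CIS + "RangeSet"),
    (COVJSON + "parameters", SKOS + "relatedMatch", CIS + "RangeType"),
]


def related_terms(term, triples):
    """Return (predicate, other term) pairs for every statement mentioning `term`."""
    found = []
    for subject, predicate, obj in triples:
        if subject == term:
            found.append((predicate, obj))
        elif obj == term:
            found.append((predicate, subject))
    return found


def to_ntriples(triples):
    """Serialise the statements as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(statements))
    print(related_terms(CIS + "DomainSet", statements))
```

The point of the sketch is that each statement stands alone: a new relationship can be added without either model being imported into the other.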

The last few months there has been a lot more support for this approach as it emerges out of "skunk works" status into a "candidate for adoption as a more formal part of OGC's operational approach" - but please be aware that is where we are at the moment - and articulating member needs in this way is a hugely important part of that journey - so thank you!

samadammeek commented 2 years ago

@Rob (apologies, I don't have your Git des). I appreciate the questions regarding appropriate use of modeling language, however perhaps we could start with UML because of the work involved in harmonizing as it is. If we have to map all of the standards involved here (CovJSON, Coverage SWG, others) to something that they are not already in, doesn't that double the work? I think the OGC needs to look seriously at the semantic work, but equally these concerns seem time sensitive (if there is such a thing in an international standards body!)

rob-metalinkage commented 2 years ago

@samadammeek - I am going to reflect your own logic back to you here :-)

I admire your confidence you can "start with UML" - but in my experience it's unlikely all the UML will be discoverable, well managed, compatible, and internally consistent (much of it is "diagrammy UML" - if it hasn't been used to derive implementation artefacts it's usually full of internal data inconsistencies). Last time I tried anything like this, the ISO "Harmonised Model" was a mess of incorrect or cumbersome mutual interdependencies requiring you to use the whole model or nothing. (Maybe it's been quietly fixed but I haven't seen much discussion - and my team did write tooling when I was back in CSIRO to analyse and expose hidden dependencies, so I would have expected noise around this if someone was re-inventing that wheel.)

Mapping existing UML to URIs to support "just in time" integrative statements is actually relatively simple compared to rebuilding and collating UML models - I actually think it's a useful starting point for a tighter integration - think of it as an audit of what you would need to achieve, expressed in a machine readable way, and forget about the bigger issues of perfect semantic representation of the contents...

jechterhoff commented 2 years ago

The topic of this issue is "coverage modelling and schema harmonization". It feels like we are digressing from that topic and moving towards a discussion of which modeling approach should be pursued in general.

Regarding coverage modelling: I'd like to learn how an actual coverage can be modelled conceptually, with full definition of the coverage contents (much like in application schemas, in the non-coverage case, there can be abstract feature types which define general concepts and information content, and then non-abstract feature types that define the full application-relevant content). Maybe coverage experts can tell us how they did that so far - if they did at all?

Regarding coverage schema harmonization: I'll stay out of the CovJSON vs. CIS discussion, and leave it to the respective experts and communities. On the topic of harmonization, I think what you, @rob-metalinkage, are suggesting is that using a semantic approach, relationships between terms used by CovJSON and CIS could be defined. Is that what you are suggesting?

KathiSchleidt commented 1 year ago

@rob-metalinkage To my view, your Building-Block approach seems like a sane way forward out of the modelling dilemma described (on the example of coverage dialects) above. Could you please provide links to more info on the new OGC Building-Block paradigm, then we can close here

rob-metalinkage commented 1 year ago

Hi, there are two paths here where schema annotation can be used: 1) annotated building blocks 2) "semantic uplift"

in the first, it requires decomposing schemas into building blocks to meet that paradigm - although I guess we could see CoverageJSON as a building block in its entirety.

in both cases, we can develop a JSON-LD context to "uplift" a schema to map it to a set of URIs - and then use transformations using these references, share such transformations, and validate the content is isomorphic. We are developing a "playground" application for this.
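
As a rough sketch of the "uplift" idea, assuming nothing about the actual OGC tooling: the Python below attaches a JSON-LD `@context` to a document and then compares two differently named documents by the URIs their keys map to. The context URIs, the toy documents, and the helper names (`uplift`, `mapped_keys`, `same_terms`) are all hypothetical. `mapped_keys` is a naive stand-in for real JSON-LD expansion and only looks at top-level keys, so `same_terms` is a key-level check, well short of validating that content is isomorphic.

```python
import json

# Hypothetical contexts: the URIs are placeholders, not registered identifiers.
# Both contexts map their own key names onto the same three URIs.
COVJSON_CONTEXT = {
    "domain": "https://example.org/cov/domain",
    "parameters": "https://example.org/cov/rangeType",
    "ranges": "https://example.org/cov/rangeSet",
}
CIS_CONTEXT = {
    "domainSet": "https://example.org/cov/domain",
    "rangeType": "https://example.org/cov/rangeType",
    "rangeSet": "https://example.org/cov/rangeSet",
}


def uplift(doc, context):
    """Attach the context so a JSON-LD processor could interpret the keys."""
    return {"@context": context, **doc}


def mapped_keys(doc, context):
    """Naive stand-in for JSON-LD expansion: rename top-level keys to URIs."""
    return {context[key]: value for key, value in doc.items() if key in context}


def same_terms(doc_a, context_a, doc_b, context_b):
    """True when both documents use the same set of URI-identified terms."""
    return set(mapped_keys(doc_a, context_a)) == set(mapped_keys(doc_b, context_b))


# Toy documents with empty members; real coverages carry far more content.
covjson_doc = {"type": "Coverage", "domain": {}, "parameters": {}, "ranges": {}}
cis_doc = {"domainSet": {}, "rangeType": {}, "rangeSet": {}}

if __name__ == "__main__":
    print(json.dumps(uplift(covjson_doc, COVJSON_CONTEXT), indent=2))
    print(same_terms(covjson_doc, COVJSON_CONTEXT, cis_doc, CIS_CONTEXT))
```

A real comparison would run both documents through a JSON-LD processor and compare the resulting graphs, including nested members and values.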

As with most data models - the trick is not the schema - it's the content, however....

I'd attack this with RDF-Datacube to model the dimensions of a coverage in a canonical way - and see how far you get. At any rate it's probably worth the community first agreeing on exactly what needs to be demonstrated here and providing a set of test cases.
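
One possible starting point for such a test case, sketched in Python under stated assumptions: the coverage is taken to have three dimensions (`x`, `y`, `t`) and one measure (`temperature`), and the `example.org` namespace and the function name `structure_triples` are hypothetical. The `qb:` terms used (`DataStructureDefinition`, `component`, `dimension`, `measure`, `DimensionProperty`, `MeasureProperty`) come from the W3C RDF Data Cube vocabulary. The vocabulary usually models component specifications as blank nodes; plain URIs are used here only to keep the sketch short.

```python
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/coverage-cube/"  # hypothetical namespace


def structure_triples(dsd, dimensions, measures):
    """Describe a coverage's axes and range fields as a qb:DataStructureDefinition."""
    triples = [(dsd, RDF_TYPE, QB + "DataStructureDefinition")]
    for kind, names in (("dimension", dimensions), ("measure", measures)):
        for name in names:
            component = f"{dsd}/component/{name}"
            prop = EX + name
            # Link the structure to a component, and the component to its property.
            triples.append((dsd, QB + "component", component))
            triples.append((component, QB + kind, prop))
            # Yields qb:DimensionProperty or qb:MeasureProperty.
            triples.append((prop, RDF_TYPE, QB + kind.capitalize() + "Property"))
    return triples


# Assumed example: a gridded time series with one range field.
triples = structure_triples(EX + "dsd", ["x", "y", "t"], ["temperature"])

if __name__ == "__main__":
    for s, p, o in triples:
        print(f"<{s}> <{p}> <{o}> .")
```

Each cell of the coverage would then become a `qb:Observation` that carries one value per dimension plus the measure, which is where the approach is likely to be stress-tested for large grids.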