opengeospatial / om-swg

Observation Collection #2

Closed KathiSchleidt closed 2 months ago

KathiSchleidt commented 5 years ago

The need for a class for collections of observations, with optional values for all of the observation properties – to support sets of observations that have common values for one or more of procedure, feature-of-interest, observedproperty, phenomenon-time, result-time

Source: TC211 Comments AU001, PMG009

KathiSchleidt commented 5 years ago

Same as http://isotc211.standardstracker.org/show_request.cgi?id=111

KathiSchleidt commented 5 years ago

Also same as http://ogc.standardstracker.org/show_request.cgi?id=248

Simon: Patterns for collection features should support normalization of common properties of homogeneous collections. Add an Observation Collection class, to enable common properties of a collection of observations to be (optionally) associated with a container object, instead of repeated on every member observation. Add sampledFeature property to SamplingFeatureCollection class, to allow this property to be recorded on the collection and inherited by all members. Inefficient representation of homogeneous collections. Sections 7, 9

dr-shorthair commented 4 years ago

Proposed in https://www.w3.org/TR/vocab-ssn-ext/#observation-collection

Implemented by TERN https://bitbucket.org/terndatateam/ternplotdata-ontology/src/master/

lvdbrink commented 4 years ago

Observation collections would be a very useful addition for several Dutch information models for the underground (groundwater quality/quantity, borehole research etc).

In ISO change request 111 there is also mention of this proposed change:

An explicit property on observations to link to a description of the sampling strategy (perhaps a sampling-feature) alongside the feature-of-interest.

i.e. a relation FROM an observation TO a sampling feature.

For us, this would be a very welcome addition to the O&M model as well. Is this part of this change request or should I create a separate issue for this? (I assume not)

dr-shorthair commented 4 years ago

Again, the SSN ontology makes the relationship between Sample and the thing-that-it-samples more clear - instead of 'sampledFeature' the relation is called 'isSampleOf' https://www.w3.org/TR/vocab-ssn/#Sampling-overview

And to allow an Observation to point directly to the proximate-feature-of-interest and also to the ultimate-feature-of-interest, see #3

KathiSchleidt commented 4 years ago

@lvdbrink As ISO change request 111 referred to 3 separate issues, I disaggregated this into the following:

In addition, I've purposely left the Incorporate SOSA/SSN Concepts #19 open in case we decide to incorporate further concepts from SOSA/SSN

ilkkarinne commented 4 years ago

Extending the Plain Old Feature Collection (POFC?) by adding properties to it can be problematic in cases where collections are used as part of the data provision API or the data storage structure: I would like to use the O&M Observations as the content model for OGC API - Features, for example, which already has collections of items as part of the API structure. Neither being forced to extend the API - Features spec to return Observation Collections instead of any items, nor providing Observation Collections as items, seems to fit the case very well.

Instead I would like to propose using the template pattern: We would add an optional "template" property to the basic Observation class pointing to an ObservationTemplate class holding only the shared properties. In this way any Observation could leverage the shared properties regardless of whether it's contained in a specific collection. This solution would obviously only reduce property duplication in cases where there is more than one shared property.

This would also bring the data models of the SensorThings and the O&M Core closer: the Datastream of the STA can be seen as an implementation of the ObservationTemplate as outlined above.
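A minimal sketch of the template pattern described above, in Python. The property names, the `resolved` helper, and the rule that an observation's own value takes precedence over the template's are all assumptions made for illustration; the proposal itself only says that the template holds the shared properties.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObservationTemplate:
    """Holds only the property values shared by several observations."""
    procedure: Optional[str] = None
    observed_property: Optional[str] = None
    feature_of_interest: Optional[str] = None


@dataclass(frozen=True)
class Observation:
    result: float
    procedure: Optional[str] = None
    observed_property: Optional[str] = None
    feature_of_interest: Optional[str] = None
    template: Optional[ObservationTemplate] = None  # optional link to shared values

    def resolved(self, name: str) -> Optional[str]:
        """Return the observation's own value, else the template's value (assumed rule)."""
        own = getattr(self, name)
        if own is not None or self.template is None:
            return own
        return getattr(self.template, name)


shared = ObservationTemplate(procedure="thermometer-7", observed_property="air_temperature")
obs = Observation(result=21.4, feature_of_interest="station-A", template=shared)
```

Here `obs` carries only its result and feature of interest; the procedure and observed property are read through the template, whether or not `obs` sits in any collection.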

ilkkarinne commented 4 years ago

Adding properties to the FeatureCollection is also forbidden in GML Simple Feature Profile 2.0

KathiSchleidt commented 4 years ago

To my understanding, the appliedTemplate concept proposed in #18 pertains to collection concepts, please continue this discussion here

ilkkarinne commented 4 years ago

After rethinking this after the 22 Jan SWG meeting, I'm now of the opinion that defining ObservationCollection in the O&M conceptual model makes sense. The implementations could still decide not to use ObservationCollections (like in cases of OGC API - Features etc.), but rather plain feature collections and template/cascade (using the relatedObservation) to avoid repeating identical property values.

The question I currently have in mind is about the semantics of the ObservationCollection: In the SSN-ext ObservationCollection, the properties given at the collection level are applied to all members; thus they are truly shared between all the members (whether duplicated in members or not). The cardinalities of the collection level properties have also been given an upper limit of 1.

During the SWG discussions collections have also been proposed as more generic Observation containers, so that the collection level properties could be a superset of the member property values, like the unique set of different observedProperties in members.

dr-shorthair commented 4 years ago

Indeed. The ObservationCollection in SSN-ext is conceived rather narrowly. Such collections are explicitly homogeneous on at least one axis. You could get more than one observableProperty (etc) in there by making the one value a complex value (this is a reasonable use case), but it is not intended to just provide discovery metadata for a general bag of alternates.

ilkkarinne commented 4 years ago

In the most recent draft UML model the ObservationCollection is specialised from the AbstractObservation class. However conceptually I feel that this is wrong, as ObservationCollection is not an "observation", or even "event" thus it does not fulfil the "is a" semantics of the class specialisation.

What we would really need as a common bit is a class that would model a generic set of Observation property values that could be used by the AbstractObservation as well as the ObservationCollection. But I do not yet grasp how to model the inheritance tree from this "set" of properties to AbstractEvent, AbstractObservation, ObservationCollection and Sampling.

The SSN-ext style homogenous ObservationCollection would not have to specialise AbstractObservation, but could instead contain a dedicated property named "sharedProperties" etc. pointing to the AbstractObservation. This would also help to communicate the applied collection level property semantics (shared between all members). However, preferably this property should also point to a set of common properties, not an Observation instance (even an abstract one).

ilkkarinne commented 4 years ago

I think we have to carefully consider the implementation vs. conceptual level distinction here too: conceptually it is not an issue if we need to repeat definitions of the same properties in two (or more) classes in cases where the semantics of these property sets in their context are different (own properties vs. properties applied to all or some of the members). However, from the implementation convenience point of view it would be nice to have to define the set of possible properties only once and reuse the same set definition everywhere the same set needs to be provided.

sgrellet commented 4 years ago

after SWG meeting 05/02/2020 : we apparently have 2 use cases on the table

sgrellet commented 4 years ago

after SWG meeting 05/02/2020 :

dr-shorthair commented 4 years ago

Note that sometimes, but not always, an individual ObservationCollection may also be an event or 'occurrent' - i.e. a prov:Activity. In RDF it is fine to assert that an individual is a member of more than one class from different hierarchies - it just says there is a class intersection of which this individual is a member. (I know this is not so easy in UML-based systems.)

ilkkarinne commented 4 years ago

One additional semantical variation of the set of properties of an ObservationCollection: in SSN-ext, if a property (value) is given at the collection level, it's always shared by all members (no Observation level overriding). As described by @dr-shorthair in the SWG meeting of 29 Feb this design decision was intentional in the latest version of the SSN-ext, as implementing the overriding logic would create unnecessary complexity in the implementations.

I do see use cases for allowing Observation instance level overriding of the collection level default set of property values, but those that I have in mind tend to be implementation level issues (avoiding repeated values). At the same time I do understand the rationale behind requiring truly homogenous collections, as it saves the time and trouble of checking whether certain property values actually vary within the collection.

Thus we could classify the Observation collections in three categories by the applied semantics for the collection level properties:

  1. A truly homogenous collection, where all the members are guaranteed to share each of the collection level property values (the current SSN-ext semantics).
  2. Collection with default values for the chosen set of member properties. If the Observation does not define a value for the particular property defined at the collection level, the collection level property value is considered to apply to the member in its entirety. If the member does define a value, it's considered to override the collection level property in its entirety.
  3. Collections containing the full range of the values of the chosen set of properties occurring in its members. Each of the collection level property values must occur at least once in the member Observations.

I think we agree that we need option 1 in the O&M conceptual model, but the other ones are not clear to me yet. I would somehow like to allow all three variants to be used in particular application domains, so perhaps we could define an abstract ObservationCollection class as an extension point and a concrete HomogenousObservationCollection class with option 1 semantics?
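To make the three semantics above concrete, here is a rough Python sketch that treats each member Observation as a plain dictionary of property values. The function names and the dictionary encoding are hypothetical. For option 1 the sketch assumes a member may either omit a shared property or repeat the same value, following the "whether duplicated in members or not" reading of SSN-ext.

```python
from typing import Any, Dict, List, Set

Member = Dict[str, Any]


def is_homogeneous(shared: Dict[str, Any], members: List[Member]) -> bool:
    """Option 1: no member states a value that differs from the collection-level one."""
    return all(m.get(name, value) == value for m in members for name, value in shared.items())


def effective_value(defaults: Dict[str, Any], member: Member, name: str) -> Any:
    """Option 2: a member value, when present, overrides the collection default entirely."""
    return member[name] if name in member else defaults.get(name)


def is_summarizing(covered: Dict[str, Set[Any]], members: List[Member]) -> bool:
    """Option 3: the collection-level values are exactly the distinct member values."""
    for name, values in covered.items():
        seen = {m[name] for m in members if name in m}
        if seen != values:
            return False
    return True


members = [
    {"observedProperty": "temp", "procedure": "p1"},
    {"observedProperty": "temp", "procedure": "p2"},
]
```

With these two members, `observedProperty` qualifies as a shared property under option 1, while `procedure` can only be described at the collection level as the set `{"p1", "p2"}` under option 3.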

dr-shorthair commented 4 years ago

Good, thanks @ilkkarinne

Indeed -

  1. is current SSN-ext - optimised for immediate processing
  2. is earlier SSN-ext - variation, likely to require additional prep for processing
  3. I see as primarily a discovery use-case

hylkevds commented 4 years ago

The current SensorThings API Datastream class does 3 for certain properties: phenomenonTime, resultTime and (with a changed name) observedArea, and this helps discovery quite a bit for large data sets.

ilkkarinne commented 4 years ago

@hylkevds: good point about the STA Datastreams. This makes sense for the spatio-temporal dimensions, as you can always provide the value range as a 1 or 2 dimensional space. In many cases the property values are discrete and not easily expressed in a numeric domain. We also discussed STA MultiDatastreams in this context with @sgrellet last week.

KathiSchleidt commented 4 years ago

Based on the discussion on Feb 19th 2020, Option 2 has been determined to be the Observation Template, covered in #33. Not relevant for modelling, revisit when creating XML serialization.

ilkkarinne commented 4 years ago

In the SWG meeting on Feb 19 we decided to turn the ObservationCollection class into an abstract one with two concrete collection classes, and that both options 1 and 3 are important enough to include in the UML model. The SensorThings Datastream can be seen as an ObservationCollection of type 3, so one use case is already there.

I'm proposing "HomogenousObservationCollection" as the name of the option 1 variant, but what would be a good name for the other? SummarizingObservationCollection, DiscoveryObservationCollection?

ilkkarinne commented 4 years ago

I'm seeing two different intended use scenarios for the option 3 ObservationCollections:

  1. Collections promoting the values or value ranges of the set of properties of the contained Observation instances deemed interesting by either the data provider or the application domain rules, and
  2. Collections constrained on purpose to hold and (dynamically) accept only Observation instances with the qualifying property values.

Type 2 could have the name "ConstrainedObservationCollection"; type 1 is close to the Summarizing or Discovery collection of the previous comment. The key difference in a dynamic scenario would be that when a new Observation is added to a type 1 collection, the collection level value ranges should also be adjusted to cover the property values of the inserted Observation, whereas for type 2, only qualifying Observations could be inserted.
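The difference in insert behaviour between the two scenarios can be sketched as follows. The class names, the `add` method, and the dictionary encoding of an Observation are assumptions for illustration only, not part of any draft model.

```python
from typing import Dict, List, Set

Obs = Dict[str, str]


class SummarizingCollection:
    """Type 1: collection-level values grow to cover every inserted member."""

    def __init__(self) -> None:
        self.members: List[Obs] = []
        self.covered: Dict[str, Set[str]] = {}

    def add(self, obs: Obs) -> None:
        self.members.append(obs)
        for name, value in obs.items():
            self.covered.setdefault(name, set()).add(value)


class ConstrainedCollection:
    """Type 2: collection-level values act as a filter on what may be inserted."""

    def __init__(self, allowed: Dict[str, Set[str]]) -> None:
        self.allowed = allowed
        self.members: List[Obs] = []

    def add(self, obs: Obs) -> bool:
        for name, values in self.allowed.items():
            if obs.get(name) not in values:
                return False  # non-qualifying Observation is rejected
        self.members.append(obs)
        return True


s = SummarizingCollection()
s.add({"observedProperty": "temp"})
s.add({"observedProperty": "humidity"})

c = ConstrainedCollection({"observedProperty": {"temp"}})
accepted = c.add({"observedProperty": "temp"})
rejected = c.add({"observedProperty": "humidity"})
```

In the summarizing case the collection adapts to its members; in the constrained case the members must adapt to the collection.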

ilkkarinne commented 4 years ago

Based on the discussion in the SWG meeting on 19 Feb I have now drafted two alternative class hierarchies for the ObservationCollections. Note that both are now based on using the ObservationDescription class, replacing the ObservationPropertyContainer interface in the previous draft:

  1. ObservationDescription as a common ancestor for both the AbstractObservation and the (abstract) ObservationCollection, see Edition2_playground_2020-02-24_alternative_Observation.png
  2. ObservationDescription as the common property container, referred to from the HomogenousObservationCollection (sharedProperties association) and the SummarizingObservationCollection (containedProperties association), see Edition2_playground_2020-02-24_Observation.png

For the latter option, the ObservationDescription would not even be necessary IMHO: the collections could as well relate directly to the AbstractObservation for both the shared/containedProperties and the member association, as the class hierarchy would no longer imply that the collections are (Abstract)Observations.

Which option do you prefer?

ilkkarinne commented 4 years ago

I dug a bit deeper on the option where the ObservationDescription would not be needed: It turns out that it is needed because we need a concrete class to act as the container. However, the inheritance could be turned the other way around: We could have the ObservationDescription inheriting from the AbstractObservation but as DataType instead of FeatureType. This would make it clear IMHO that the ObservationDescription is not intended to be used as a feature by itself.

This third option is now available as Edition2_playground_2020-02-25_Observation.png

cportele commented 4 years ago

@ilkkarinne

Some thoughts and questions:

ilkkarinne commented 4 years ago

Indeed @cportele, the second option does not work by itself, without a dedicated concrete class without the constraints of the concrete Observation class. Thanks for the note on the third option, I was a bit afraid that deriving a DataType from a FeatureType is not allowed.

The Observation cardinality issues are still on the todo list. My gut feeling is that we need to set the cardinality to 0..n for all of them, including the result, in order for the SummarizingObservationCollection to work.

ilkkarinne commented 4 years ago

Resolutions made in the SWG meeting on 4th March:

The names of ObservationDescription, HomogenousObservationCollection and SummarizingObservationCollection classes were seen as clunky. Defining the textual descriptions for the classes would hopefully inspire better names.

ilkkarinne commented 4 years ago

I'm trying to break down the observation property constraints for the collection classes, and feeling quite uncomfortable. I have two specific issues that I would like suggestions to:

  1. Both of the collection classes are supposed to have associations to an instance of ObservationDescription with only some of the possible associations and attributes given (cardinality > 0). For the homogenous one, the contract would be that only the associations and attributes with identical target objects or values in each member would be given. If the cardinality of an association or attribute is zero, the member values for this association or attribute could be anything. For the Summarizing one, all the distinct values for each association or attribute occurring in the members would be included in the ObservationDescription associated with the collection. If the cardinality of an association or attribute is zero, does it mean that the member values for this association or attribute could be anything or that none of the members have a value for this association or attribute? Also do the bounds for contained attribute or association values apply to each individual member? In other words, if the description contains two observedProperties and three ultimateFeatureOfInterest associations, does it mean that each member shall have an association to at least one of the three ultimate FoIs and at least one of the observedProperties?

  2. @KathiSchleidt strongly argued in the SWG meeting on 4th Mar that it should be possible to define an ObservationDescription instance as a stand-alone thing, which could then be reused for several, similar ObservationCollections (monthly, early aggregations, timeseries etc.). How would these shared descriptions work for Summarizing collections? What happens if the number of distinct values for an association or attribute differs between the two collections (the other collection contains members with one additional observedProperty for example)?

dr-shorthair commented 4 years ago

  1. This is a policy decision. I'll leave it up to you guys to make a call. In the OWA assumed by RDF/OWL anything unstated is unknown, so if you want to set something (e.g. an attribute) to zero or missing you must state that explicitly. But in the UML world the assumption might be different.

  2. I think the most standard way to implement that requirement in RDF/OWL would be for the template or standard config to be defined as a class, with values of some properties pre-bound. Then instances would be of that type.

ilkkarinne commented 4 years ago

In the SWG meeting on 4 Mar @cportele asked whether the shared and covered property values were intended to be derived from the collection members, or whether it should be possible for collections to be "out-of-sync" with the member property values: could there be cases where the members of a HomogenousObservationCollection instance have a common property value even if that property was not explicitly given as a sharedProperty? If the associations were derived, there would be no need for explicit constraints for the shared/covered properties in the collection classes.

The key question to answer here IMHO is whether the collection level properties are seen as constraints enforced on the possible members or as supplementary information distilled from the contained members. The use cases we see for these collections should in the end give us guidance on which kind we need.

I said in the SWG meeting that the associations would not be derived, but if we go down the "distilled information" path I think the derived association approach would work nicely with the following diagram notes:

For sharedProperties of HomogenousObservationCollection: "Derived from the properties of the members: if all members have the same value or set of values for a particular property, that property and the values are expressed in sharedProperties."

For coveredProperties of SummarizingObservationCollection: "Derived from the properties of the members: all distinct values of properties occurring in members, except result and resultTime, are expressed in coveredProperties."

I would rather not make the result and resultTime special cases in this, but including them would in most cases mean really bloated summarizing collections, as all the result values of the members would have to be expressed at the collection level.

Note that for the SummarizingObservationCollection the above association definition would mean that all the properties and their value ranges (except the result and the resultTime) would be automatically lifted to the collection level if any of the members contained any values for those properties.

One potential difficulty I'm seeing with this approach is the intended remote referencing of the shared or covered properties: the derivation would make the contained property values tightly coupled with the contained observation members at a particular time: if the set of members changes, it may also cause a change in the collection level properties due to derivation. For this reason I would be tempted to define the association as a composite. This would in essence prohibit several collections referring to the same instance of the ObservationDescription.
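The two diagram notes above amount to two derivation rules, which can be sketched in Python as follows. This is an illustration under stated assumptions: members are modelled as dictionaries with hashable values, and the function names are hypothetical.

```python
from typing import Any, Dict, List, Set

Member = Dict[str, Any]

# Properties left out of coveredProperties, as stated in the diagram note.
EXCLUDED = {"result", "resultTime"}


def derive_shared(members: List[Member]) -> Dict[str, Any]:
    """sharedProperties: properties for which every member has the same value."""
    if not members:
        return {}
    first = members[0]
    return {
        name: value
        for name, value in first.items()
        if all(name in m and m[name] == value for m in members)
    }


def derive_covered(members: List[Member]) -> Dict[str, Set[Any]]:
    """coveredProperties: all distinct member values, except result and resultTime."""
    covered: Dict[str, Set[Any]] = {}
    for m in members:
        for name, value in m.items():
            if name not in EXCLUDED:
                covered.setdefault(name, set()).add(value)
    return covered


members = [
    {"observedProperty": "temp", "procedure": "p1", "result": 1.0},
    {"observedProperty": "temp", "procedure": "p2", "result": 2.0},
]
```

Because both functions read only the current members, re-running them after any insertion or removal keeps the collection level in sync, which is also why the derived values are tied to one particular member set.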

ilkkarinne commented 4 years ago

@cportele: Is there a super class (a generic FeatureCollection with the feature member association) in the GFM or elsewhere we should specialize our ObservationCollection class from?

cportele commented 4 years ago

Is there a super class (a generic FeatureCollection with the feature member association) in the GFM or elsewhere we should specialize our ObservationCollection class from?

@ilkkarinne - I am not aware of a general feature collection class that we could use. The concept of feature collections is certainly not discussed in ISO 19109.

hylkevds commented 4 years ago

I don't really see a problem with resultTime and result. There is no specification of how the summary is built. For resultTime (and validTime & phenomenonTime) this can simply be the time interval between first and last. For numeric results this can also simply be the range lowest - highest. It may be good to specify that the property may be included in the summary, but does not have to be included. This way the cost/benefit may be evaluated for each use case.
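A small Python sketch of such range-based summaries, assuming observations are dictionaries and that only the properties a provider chooses are summarised. The `value_range` helper and the sample data are hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def value_range(values: Iterable[T]) -> Optional[Tuple[T, T]]:
    """Return (lowest, highest), or None when there is nothing to summarise."""
    items = list(values)
    if not items:
        return None
    return min(items), max(items)


observations = [
    {"phenomenonTime": datetime(2020, 3, 1, 12), "result": 4.2},
    {"phenomenonTime": datetime(2020, 3, 3, 12), "result": 3.1},
    {"phenomenonTime": datetime(2020, 3, 2, 12), "result": 5.0},
]

# Only the properties judged worth the cost are summarised; others are simply omitted.
summary = {
    "phenomenonTime": value_range(o["phenomenonTime"] for o in observations),
    "result": value_range(o["result"] for o in observations),
}
```

The summary stays the same size however many observations the collection holds, which is what makes it cheap enough to use for discovery.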

ilkkarinne commented 4 years ago

@hylkevds: If we do not require that all the Observation properties (or a fixed set of them) always be included in the summary, we would have to be able to explicitly state which properties are summarised for a particular collection instance, and which ones are not. If we fail to do this, I'm afraid the implementers will end up in situations where they have difficulty determining whether particular property values of interest occur in the Observation instances of a particular set of collections or not.

I'm struggling with options for providing this information without splitting the Observation properties into individual entities. If we would have an entity for a single property, we could then explicitly list the ones included in the summary, as well as the value ranges of those (including empty ranges).

ilkkarinne commented 4 years ago

I think that in order to define the class constraints for the collection classes it is necessary to distinguish use cases where

  1. a collection imposes constraints on its possible members and
  2. the cases where any members could be included in the collection and that the collection level properties reflect the property values of their current members.

In the latter case the derivation seems like a plausible solution. In the former case the given collection level properties would be used to dynamically determine if an Observation instance would qualify as a collection member.

ilkkarinne commented 4 years ago

My latest proposal for the collections UML is now available under the iso_19156_issues/ea GitHub folder. The changes are summarized as follows:

The latter work meant creating a long list of very similar constraints for the SummarizingMemberCharacteristics class: basically two for each ObservationBase attribute and association for cases where the instance cardinality is zero (no collection level information available for this property) and for cases where the instance cardinality is one or more (all distinct values shall be listed at the collection level). See a separate diagram for all the constraints visible.

This solution still has the issue I already reported before: It is not possible to describe the fact that a non-mandatory attribute or association does not exist in any of the collection members, because the zero instance cardinality is defined as "unknown". This seems somewhat error-prone to me, as software doing calculations over several summarizing collections would have to take special care to treat the "zero" cases: for example, if we want to list all observableProperties used in a set of summarizing collections, and the observedProperty association for one of them is empty, we cannot say for certain that the summed observedProperties contain all variations, as the "zero" collection may contain Observations with any observedProperties.
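The ambiguity can be illustrated with a short Python sketch. It uses a hypothetical encoding that the UML cardinality cannot express: `None` stands for "not summarised, so unknown", while an empty set stands for "summarised, and no member has a value". The function name is an assumption.

```python
from typing import Iterable, Optional, Set, Tuple


def union_observed_properties(
    summaries: Iterable[Optional[Set[str]]],
) -> Tuple[Set[str], bool]:
    """Union the per-collection summaries and report whether the union is complete."""
    union: Set[str] = set()
    complete = True
    for summary in summaries:
        if summary is None:
            # Unknown: this collection may hold Observations with any observedProperty.
            complete = False
        else:
            union |= summary
    return union, complete


with_unknown = union_observed_properties([{"temp"}, None])
all_known = union_observed_properties([{"temp"}, set(), {"humidity"}])
```

When zero cardinality always means "unknown", every empty summary has to be treated like `None` here, so the completeness flag can never be true for a set that includes such a collection.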

Additionally I changed the type of the resultTime and stimulusTime in the ObservationBase from TM_Instant to TM_Object to enable using TM_Period for describing their value ranges in the SummarizingMemberCharacteristics. This means that each resultTime and stimulusTime would not have to be (but could be) provided as individual time instants at the collection level. These attributes are narrowed down to TM_Instant using constraints in both AbstractObservation and HomogenousMemberCharacteristics to keep the time instant semantics for individual Observations.

KathiSchleidt commented 4 years ago

In order to better understand the 2 proposed Observation Collections, we’re collecting use cases to see how the different collection types would represent this collection. For simplicity, we're collecting these use cases in the following doc: https://docs.google.com/document/d/1WkI60zqunKDcWNareE-Qffey42KZ_pbqPowwRElbm_E/edit?usp=sharing

hylkevds commented 4 years ago

I've added the SensorThings API use-case, and noticed that the Datastream is a combination of our two collections.

ilkkarinne commented 4 years ago

As per discussion at SWG meeting on 1 Apr 2020, I have now modified the ObservationCollection class to include a type attribute pointing to ObservationCollectionTypeCodeListValue codelist class, and removed the Homogenous and Summarizing member characteristics classes in favour of a single concrete MemberCharacteristics class.

The idea is to leave the collection member characteristics semantics open in the UML model, and provide an initial set of codelist values as well as their constraints in an external registry.
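One way to picture this design in Python: a single collection class whose type code selects a constraint from a registry. The `REGISTRY` dictionary, the code values "homogeneous" and "summarizing", and the `is_valid` method are hypothetical stand-ins for whatever the external registry would define.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Member = Dict[str, Any]
Check = Callable[[Dict[str, Any], List[Member]], bool]

# Hypothetical stand-in for the external registry: codelist value -> constraint.
REGISTRY: Dict[str, Check] = {
    # No member may state a value that differs from the collection-level one.
    "homogeneous": lambda chars, members: all(
        m.get(k, v) == v for m in members for k, v in chars.items()
    ),
    # The collection-level values must equal the distinct member values.
    "summarizing": lambda chars, members: all(
        {m[k] for m in members if k in m} == set(v) for k, v in chars.items()
    ),
}


@dataclass
class ObservationCollection:
    type: str                               # codelist value, looked up in REGISTRY
    member_characteristics: Dict[str, Any]  # one structure for every collection type
    members: List[Member]

    def is_valid(self) -> bool:
        """Apply the constraint registered for this type; unknown codes raise KeyError."""
        return REGISTRY[self.type](self.member_characteristics, self.members)


members = [{"observedProperty": "temp", "procedure": "p1"},
           {"observedProperty": "temp", "procedure": "p2"}]
homogeneous = ObservationCollection("homogeneous", {"observedProperty": "temp"}, members)
summarizing = ObservationCollection("summarizing", {"procedure": ["p1", "p2"]}, members)
```

New semantics can then be added by registering another code value, without touching the collection class itself.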

The most up-to-date UML diagram is currently at https://github.com/opengeospatial/om-swg/blob/master/iso_19156_issues/ea/2020-04-07_Observation.png

hylkevds commented 4 years ago

To me, it seems that, on a technical level, the Potential Observation / Observing Capability is another instance of Observation Collection.

The only difference seems to be a little bit of semantics...

In both cases, Observations that match this set may or may not exist. In both cases, implementations will enable querying these entities.

KathiSchleidt commented 4 years ago

Where do we store the constraints on Collection Characteristics? In the Collection or the Characteristics? Also - can we formulate constraints to pertain to multiple properties?

KathiSchleidt commented 4 years ago

@cportele Is there a way of specifying a constraint on multiple properties? When constraining down the observation characteristics, we repeat the same pattern MANY times (see https://github.com/opengeospatial/om-swg/blob/master/iso_19156_issues/ea/archive/2020-04-01_SummarizingMemberCharacteristics.png). It would be nice to formulate the constraint pertaining to multiple attributes or associations. For a set of Attributes, the following is true:

KathiSchleidt commented 4 years ago

We propose adding a related self-association to the collection. Potential Use cases:

cportele commented 4 years ago

@cportele Is there a way of specifying a constraint on multiple properties?

I am not aware of a way to do this in OCL. But we could write anything in a textual constraint.

ilkkarinne commented 4 years ago

I'm now experimenting with an idea that the ObservationCollection, ObservationCharacteristics and ObservingCapability would be moved from the Observation core to the Basic Observations package. This would make the Core package much more of a true O&M core in the sense of including only the most essential things, with the Observation being the only concrete class there. Along the same line of thought I tentatively also removed the ObservationCollection interface from the Abstract Observation schema package.

This reflects my current thoughts on breaking the 1-to-1 mapping between the requirements classes and the UML packages (see discussion under #25): As an example, there should be a requirements class based on the Core package that requires using the Observation class, but does not require using a specialization of the AbstractObserver as the target of the observer association; any realization of the Observer interface would be allowed.

Similarly there could be a dedicated requirements class for Observation collections, that would require supporting the Core and the ObservationCollection and ObservationCharacteristics classes, but would not require supporting of any other classes from the Basic Observations package.

Any thoughts on this more than welcome.

Snapshot images for comparison:

KathiSchleidt commented 4 years ago

I like the idea of taking the collections out of the bitter core, but do worry about lumping them together with the basic types we're providing, thus wondering if collections shouldn't be their own requirements class, if not package. My understanding to date was that the basic observations is for simple users not up to deriving required classes on their own - seems unfair to then dump the collections on them. I'd feel better with the collections in the core, but in a separate requirements class one can optionally include or not - is this now possible? (I don't quite understand where #25 landed.)

ilkkarinne commented 4 years ago

Regardless of whether the collections are in the core or the basic package, we can create a separate requirements class requiring the implementations to support the ObservationCollection, its linking to the ObservationCharacteristics and the ObservationCollectionTypeCodeListValue classes.

As far as I understand we could even put the Core and the Basic in the same package if we feel that makes sense, and still have several requirements classes to state that some classes must be supported. However, this several-requirements-classes-per-package approach works best IMHO if the classes included in a package are likely to be useful when used together, even if interoperable systems were not required to support exchanging data using all of them.

Making collections a separate package would also mean separate namespaces. Thus I'm not too keen on creating a dedicated collections package, although it would be the cleanest option, as at least for GML it's likely that the XML Schema for each package would contain all types for all the classes in the package.

Basic Observations package depends on the Core, and thus if systems implement the entire Basic Observations package, they also must implement the core, and get to deal with the collections in any case.

If the ObservationCollection, ObservationCharacteristics and ObservingCapability were not included in the Core package, we may be able to create a single Observation Core requirements class.

Then we could create an Observation Collections requirements class requiring the Observation Core requirements class and additionally requiring support of the ObservationCollection, ObservationCharacteristics and the ObservationCollectionTypeCodeListValue. I'm also ok to have these in the Core package if the SWG agrees that it's the best option.

ilkkarinne commented 1 year ago

Based on the early implementation experience of OMS in WMO Integrated Observation System Metadata Representation, and SWG decision on 21st Sep 2022, the O&M SWG has decided to re-introduce Observation collections as part of the Abstract Observation core package. The major reason is to avoid importing any requirements classes from the Basic Observation Core package in application schemas that need to extend the ObservationCollection UML class.

The following changes are proposed to be made in the model to fix the issue:

I have identified the following clauses in the document being in need of revision due to the change:

Have I missed something?

KathiSchleidt commented 1 year ago

Slight contradiction - as the final 2 sub-clauses of clause 9 are datatypes (9.9 NamedValue & 9.10 Codelists), I propose slipping AbstractObservationCollection up above these. As there are no figures under the current 9.9 & 9.10, this doesn't impact figure numbering