Homogeneity of an ObservationCollection

dr-shorthair commented 6 years ago

According to the proposal [1] an ObservationCollection should be homogeneous in one or more of

feature-of-interest
ultimate feature-of-interest
observed-property
procedure
sensor
phenomenon-time
result-time

i.e. ObservationCollections can be homogeneous on different properties.

See discussion here [2] for confirmation that addition of this class might accommodate several existing use-cases, for example streaming.

While the requirement for 'one or more' is not yet axiomatized, that would be a relatively easy tweak. This issue is to host discussion of whether these homogeneity constraints are enough or too much.

[1] https://w3c.github.io/sdw/proposals/ssn-extensions/
[2] w3c/sdw-sosa-ssn#7

lieberjosh commented 6 years ago

Datastream is quite "homogeneous" as it tries to represent single streams of data, namely each observation in a Datastream has the same Sensor, ObservedProperty, and Thing (may be equivalent to sosa:Platform). It doesn't necessarily have the same FoI and generally does not have the same phenomenonTime or validTime. An alignment should have those homogeneity constraints.

I'm wondering if it might be helpful to create a Sensor Things ontology for the purposes of alignment.

dr-shorthair commented 6 years ago

Indeed - very homogeneous. All member observations have a common value for three properties (two of which are direct SOSA properties).

> I'm wondering if it might be helpful to create a Sensor Things ontology for the purposes of alignment.

In the SSN spec we didn't go that far for alignment to the O&M UML model, we just used the ISO 19150-2 URIs to denote the UML classes and properties. See https://www.w3.org/TR/vocab-ssn/#OM_Alignment_UML_URI

srodriguez142857 commented 6 years ago

Hi.

As mentioned previously, and to follow this discussion about "Homogeneity" for several observations, below are some comments from the BCI community:

Following @kjano's opinion that a collection may group observations with different FOIs and/or taken by different sensors: this is exactly the case in Brain Computer Interaction (BCI) applications, where in a single "Session" a person (or "Subject") is being monitored ("Record", aligned to sosa:Observation) by (possibly) several devices ("Device", aligned to sosa:Sensor) for different biomedical modalities ("Modality", aligned to sosa:ObservableProperty).

For BCI, it's common that in a single "Session" researchers handle information for several sosa:Observation (and sosa:Actuation). Specifically for observations, BCI applications may monitor several "modalities" (EEG, ECG, motion capture, etc.) through different sensors for different FOI, all during one session.

Therefore, in some cases, some BCI sessions could be "ObservationCollection" but not all BCI sessions are "ObservationCollection".

dr-shorthair commented 6 years ago

Please read the proposal. It was always envisaged that a collection could be _in_homogeneous on any of these axes - e.g. a group of observations could have more than one foi - this is a common case for sensor networks, for example. But in order to be a useful collection, it is envisaged that it should probably be homogeneous on at least one axis.

In fact, these ideas go back to the early days of OGC O&M. Version 1[1] had a whole appendix dedicated to exploring different homogeneity patterns.

[1] http://www.opengeospatial.org/standards/om

srodriguez142857 commented 6 years ago

I have read the proposal [1]. As you indicate, I understand that a group of observations could have more than one FOI, but that it should be homogeneous on at least one axis (sensor or observable property).

This nature of "homogeneity" in a group of observations is true for some BCI applications, but not for all of them: there are some BCI applications that could have a session with multiple sensors observing different modalities (observable property) for different purposes (FOIs) to measure the dynamics of the human body reaction while performing a single activity (for example, driving a car) in a single context (environment) [2].

I understand that the concept of ObservationCollection may be useful in many different types of applications, including many in the BCI domain. I'm just pointing out that it may not apply to a broader concept that groups inherently heterogeneous (but related) observations.

[1] https://w3c.github.io/sdw/proposals/ssn-extensions/
[2] https://w3id.org/BCI-ontology#Session

dr-shorthair commented 6 years ago

Ok, I think we are on the same page then.

My concern about your last point is whether there is anything in common within the kind of collection that you envisage? If so, but it is not one of the properties already mentioned, then should something be added to the model to capture it? If not, then how is the collection characterised?

dr-shorthair commented 6 years ago

@kjano noted that the proposal has a cardinality constraint that there will be no more than one of the characteristic properties* on an ObservationCollection, and asked what that means for a collection in which the members are inhomogeneous on one or more of these axes.

The intention is that a value for one of these characteristic properties would be given at the collection level iff the members share a single value for the property. For any property that is not homogeneous in the collection, the value should be given to each individual member and not to the collection. The ObservationCollection can make discovery easier and encoding more compact with respect to homogeneously-valued properties. It is not intended to enumerate all the values of all the characteristic properties of all the member observations.
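The rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: observations are modelled as plain dictionaries, the key names (`featureOfInterest`, `sensor`, and so on) and the helper `hoist_shared_properties` are hypothetical, and nothing here is part of SOSA/SSN or the proposal itself.

```python
# Hypothetical key names standing in for the seven characteristic properties.
CHARACTERISTIC_PROPERTIES = (
    "featureOfInterest",
    "ultimateFeatureOfInterest",
    "observedProperty",
    "procedure",
    "sensor",
    "phenomenonTime",
    "resultTime",
)


def hoist_shared_properties(members):
    """Split observations into collection-level and member-level properties.

    A characteristic property is moved to the collection only if every
    member carries it and all members agree on a single value.
    """
    collection = {}
    for prop in CHARACTERISTIC_PROPERTIES:
        if members and all(prop in m for m in members):
            values = {m[prop] for m in members}
            if len(values) == 1:
                collection[prop] = values.pop()
    # Anything not hoisted stays on the individual member.
    slim_members = [
        {k: v for k, v in m.items() if k not in collection} for m in members
    ]
    return collection, slim_members


# Illustrative data (assumed): one sensor observing two rooms.
observations = [
    {"sensor": "thermometer-1", "observedProperty": "airTemperature",
     "featureOfInterest": "room-A", "result": 21.5},
    {"sensor": "thermometer-1", "observedProperty": "airTemperature",
     "featureOfInterest": "room-B", "result": 19.0},
]
collection, slim = hoist_shared_properties(observations)
```

Here the sensor and observed property end up on the collection, while the feature of interest, which differs between members, stays on each member.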

* feature-of-interest, ultimate feature-of-interest, observed-property, procedure, sensor, phenomenon-time, result-time

srodriguez142857 commented 6 years ago

About your questions:

a) Is there anything in common within the kind of collection that you envisage? Answer: Yes.

Reasoning: In the BCI domain, a single Session [1] groups multiple heterogeneous observations (with different sensors, FOIs, and observable properties). All the observations that "belong to" (were observed during) a session correspond to (common axes):

exactly one Subject [2] (a human being or person),
performing exactly one Activity [3] (focus on the person's physical state),
exactly one Context [4] (architectural description of the environment where the person performs the activity).

Put simply: a session groups multiple heterogeneous observations of the same person performing one activity in a specific context. All the observations are heterogeneous because they monitor different parts of the human body dynamics (brain, heart, etc.).

For example, a session for monitoring Alice (subject) while studying (activity) in a coffee shop (context). For this session we could have the following observations:

one observation that monitors the brain with FOI=Cognitive Aspect (learning) [5], ObservableProperty=EEG-ERP-SSVEP [6], Sensor=EEG-Sensor.
another observation that monitors the heart with FOI=Anxiety, ObservableProperty=ECG, Sensor=ECG-Sensor.
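As a rough illustration of the example above, the two observations can be written as Python dictionaries and compared key by key. The dictionary keys and the helper `shared_axes` are assumptions made for this sketch; the values are the ones named in the example.

```python
def shared_axes(observations):
    """Return the key/value pairs on which all observations agree."""
    if not observations:
        return {}
    first = observations[0]
    return {
        key: value
        for key, value in first.items()
        if all(key in o and o[key] == value for o in observations)
    }


# Session-level description taken from the example (key names are assumed).
session_context = {"subject": "Alice", "activity": "studying", "context": "coffee shop"}

records = [
    {**session_context, "featureOfInterest": "Cognitive Aspect (learning)",
     "observedProperty": "EEG-ERP-SSVEP", "sensor": "EEG-Sensor"},
    {**session_context, "featureOfInterest": "Anxiety",
     "observedProperty": "ECG", "sensor": "ECG-Sensor"},
]

common = shared_axes(records)
# Three of the axes discussed in this thread, under assumed key names.
sosa_axes = {"featureOfInterest", "observedProperty", "sensor"}
shared_sosa_axes = sosa_axes & set(common)
```

In this sketch the two records agree only on subject, activity, and context, and on none of the three SOSA axes checked.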

b) If so, but it is not one of the properties already mentioned, then should something be added to the model to capture it? Perhaps not. I think that the definition of "ultimate feature-of-interest" could capture the homogeneity of a BCI session.

c) If not, then how is the collection characterized?

From the BCI domain, we may think that the "ultimate feature-of-interest" goes all the way up to the whole human being (a person). Although this is ultimately true, modeling it that way would be too vague, and the semantics for classifying human dynamics and their signals (measurements) would be lost.

Also, in order to keep a characterization of the descriptive features of the metadata collected during a BCI session, we included the concepts of activity and context, because the physical states of human beings vary in real-life situations (personal and circumstantial variations). This is useful to identify the profiles and trends of human dynamics (such as brain signals) among real-life activities.

Having said that, as you can see the homogeneity of a BCI session lies on the axes of person, activity, and context, which are out-of-scope in SSN/SOSA.

Based on the proposal [7], and from the perspective of BCI, one possibility would be to model a session as an ObservationCollection, where its homogeneity would be in the "ultimate feature-of-interest" axis, which would be a "subject-activity-context" composite entity (although this would not have a useful meaning; it would serve only to align properly with [7]).

Last, I think that how to divide the structural composition of the "ultimate feature-of-interest" is entirely application dependent, in order to keep its intended semantics in its domain of discourse:

BCI: "subject-activity-context" composite entity.
Smart Living: "an entire building", etc.

A suggestion: Would it be useful to include some general guidelines (based on ontology design patterns) on how to model an "ultimate feature-of-interest"?

[1] http://w3id.org/BCI-ontology#Session
[2] http://w3id.org/BCI-ontology#Subject
[3] http://w3id.org/BCI-ontology#Activity
[4] http://w3id.org/BCI-ontology#Context
[5] https://w3id.org/BCI-ontology#CognitiveAspect
[6] https://w3id.org/BCI-ontology#EegModality
[7] https://w3c.github.io/sdw/proposals/ssn-extensions/

dr-shorthair commented 6 years ago

Thanks @srodriguez142857

Your responses (a) and (b) confirm my expectations - an ObservationCollection as described in the proposed extension is a useful concept.

Your response (c) is also useful. As a general purpose x-domain model, SSN/SOSA provides a standard vocabulary meeting some common requirements, and the extension is also cast in this light. But it is unlikely to comprehensively satisfy the full needs of any specific application, so extensions are to be expected. The particular aspect that you analyse - how far up the chain to the ultimate-feature-of-interest? - is valid, but there is not a single answer and it is hard to imagine how this might be satisfied in a compact form to satisfy all the imaginable use-cases. So I would suggest that you use SSN/SOSA and SSN-EXT as a jumping-off point - use the standard words where they apply, and introduce new ones where they don't.

There is also an important point that, in the end, it is the data-provider's prerogative to supply their data in the way that they understand it and to the level of detail that they choose. And it is the data-user's prerogative to re-interpret it and use it as they choose, within the limitations of any license.

dr-shorthair commented 1 year ago

OMS (ISO 19156:2023) has an attribute type on ObservationCollection whose value must be taken from the set ( "homogeneous" , "summarizing" ) - clause 10.12.1.

The rules for members of the collection are different for these two cases:

members of a homogeneous collection share a common value for any property of the collection, so there may only be one instance of each property on the collection
members of a summarizing collection must take one of the values of properties of the collection, so each of the properties on the collection is repeatable

In the RDF implementation these requirements may be better implemented as different classes. There are different cardinality constraints, in particular. Separate classes could also make writing SHACL rules easier.
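The two membership rules can be sketched as a small Python check. This is not SHACL and not the OMS model itself: the function `check_collection`, the dictionary layout, and the property names are assumptions made for illustration. It also assumes each member states its own values in full.

```python
def check_collection(collection_type, collection_props, members):
    """Return a list of violation messages for a collection sketch.

    collection_props maps a property name to the list of values given
    on the collection itself; members are plain dictionaries.
    """
    if collection_type not in ("homogeneous", "summarizing"):
        raise ValueError(f"unknown collection type: {collection_type!r}")
    problems = []
    for prop, values in collection_props.items():
        if not values:
            continue  # property not actually given on the collection
        if collection_type == "homogeneous" and len(values) > 1:
            # A homogeneous collection may carry only one instance of a property.
            problems.append(f"{prop}: more than one value on a homogeneous collection")
            continue
        for index, member in enumerate(members):
            # Homogeneous: the member must match the single value.
            # Summarizing: the member must take one of the listed values.
            if member.get(prop) not in values:
                problems.append(f"{prop}: member {index} does not match the collection")
    return problems


# Illustrative members (assumed data): two sensors, one observed property.
members = [
    {"sensor": "s1", "observedProperty": "temp"},
    {"sensor": "s2", "observedProperty": "temp"},
]
ok_homogeneous = check_collection("homogeneous", {"observedProperty": ["temp"]}, members)
bad_homogeneous = check_collection("homogeneous", {"sensor": ["s1", "s2"]}, members)
ok_summarizing = check_collection("summarizing", {"sensor": ["s1", "s2"]}, members)
bad_summarizing = check_collection("summarizing", {"sensor": ["s1"]}, members)
```

The only difference between the two branches is the cardinality check, which is the kind of split that separate classes would make explicit.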

w3c / sdw-sosa-ssn

Homogeneity of an ObservationCollection #12