w3c / sdw-sosa-ssn

Repository of the Spatial Data on the Web Working Group for the SOSA/SSN vocabulary

Modeling remote sensing for discovery purposes #37

Closed rduerr closed 1 month ago

rduerr commented 2 years ago

I have a modeling problem that I'd love to have suggestions about. I am working on a project that involves all of NASA's Science Directorate data. The range of kinds of data is huge since the divisions include Astrophysics, Planetary science, Heliophysics, Earth Science, Biology and Physical science.

I really don't want to re-invent any wheels and would rather re-use existing ontologies to the extent practical. Consequently, I would like to be able to use SOSA/SSN for its handling of sensors, platforms, features-of-interest, observations, and observable properties, all of which are the kinds of things people use as discovery/search terms (or in some cases would like to be able to use as discovery/search terms). I'd also like to use DCAT for its dataset, etc. concepts (more discovery/search criteria) and a number of domain specific ontologies, such as those from the OBO Foundry for more detailed information about variables measured, phenomena observed, etc. (even more discovery/search criteria).

However, I am running up against a couple of issues:

  1. While an individual observation seems to be the point of SOSA/SSN, for my purposes a large collection (perhaps many millions) of individual observations is associated with the same sensor, platform, set of features-of-interest, observable properties and types of results. All of this collectively would be considered a dataset or some equivalent name.

As an example, in multispectral remote sensing (think MODIS though there are many more complicated examples), an observation results in many products (think DCAT datasets) each with some set of features-of-interest and observableProperties where that set does not change across the whole set of observations. Yes, the set might slowly grow over time as new algorithms and discoveries are made; but in general the set of features and properties holds over the entire set of observations (in MODIS's case 5 minute images).

Historically there have been two MODIS instruments. One MODIS isHostedBy AQUA and the other isHostedBy Terra. Each MODIS observes ObservableProperties like snow cover, ... etc., though I think SOSA/SSN would prefer the property just be the 32 bands of data MODIS acquires for each image. Each Observation has a full set of FeaturesOfInterest (i.e., presence of snow, snow fraction, etc.) for each image taken, though I think SOSA would just consider the FOI to be some area of the Earth. Each Observation hasResult one of each kind of data product containing some subset of those FeaturesOfInterest.
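For readers following along, the sensor and platform part of this description can be sketched as plain triples. This is only an illustration, not a recommended model: the `http://example.org/nasa/` namespace and the identifiers `MODIS-Terra`, `MODIS-Aqua`, `Terra`, `Aqua` and `snowCover` are hypothetical names chosen here, while `sosa:Sensor`, `sosa:Platform`, `sosa:ObservableProperty`, `sosa:isHostedBy` and `sosa:observes` are existing SOSA terms.

```python
# Sketch: the two MODIS instruments as (subject, predicate, object) triples.
SOSA = "http://www.w3.org/ns/sosa/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/nasa/"  # hypothetical namespace, not a real NASA one


def sosa(term):
    """Full IRI for a SOSA term."""
    return SOSA + term


def ex(name):
    """Full IRI for a (hypothetical) local resource."""
    return EX + name


def describe_sensor(sensor, platform, properties):
    """Return triples saying the sensor is hosted by the platform and observes each property."""
    triples = [
        (ex(sensor), RDF_TYPE, sosa("Sensor")),
        (ex(platform), RDF_TYPE, sosa("Platform")),
        (ex(sensor), sosa("isHostedBy"), ex(platform)),
    ]
    for prop in properties:
        triples.append((ex(prop), RDF_TYPE, sosa("ObservableProperty")))
        triples.append((ex(sensor), sosa("observes"), ex(prop)))
    return triples


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# A set removes the duplicate type statement for the shared property.
graph = set(
    describe_sensor("MODIS-Terra", "Terra", ["snowCover"])
    + describe_sensor("MODIS-Aqua", "Aqua", ["snowCover"])
)
print(to_ntriples(sorted(graph)))
```

Both instruments point at the same `snowCover` property here, which is the kind of shared term a discovery index could search on.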

I think there was a concept of an aggregation of Observations that would handle this; but it isn't obvious what happened to that concept. If that doesn't exist, then is it OK to use SOSA terms for these sorts of aggregations?

  2. SSN places very strict "only" and "exactly 1" restrictions on the relationships between SOSA terms, usually as additional axioms applied directly to those SOSA terms. Those restrictions are far too strict for many of the types of data I am dealing with. For example, GRACE involves two identical spacecraft making laser ranging measurements between themselves in order to map the Earth's gravitational field. On the other hand, the plain SOSA ontology really doesn't have axioms, but my application could use several that are similar to the SSN axioms minus the "exactly 1" restrictions. However, since SSN's additional axioms use SOSA terms, I've had complaints that that might constitute "ontology hijacking", even though we would be declaring our axioms in our own files/namespaces... Is this a problem? Could the SSN constraints be relaxed to be more generally applicable?

lvdbrink commented 2 years ago

Take a look at the extensions to SSN here: https://www.w3.org/TR/vocab-ssn-ext/

It defines a concept of collections of observations. This is a working draft, intended to be a published recommendation by the end of 2022.

dr-shorthair commented 2 years ago

Indeed - your example @rduerr is exactly one of the scenarios that SSN-ext is designed to manage. See "homogeneous collection of observations" for some examples - Figure 5 shows a time-series of 4 scenes, but could be scaled up to many more of course.
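A rough sketch of that pattern, for illustration only: the shared sensor, observed property and feature of interest are stated once on the collection, and each member scene carries only what varies (here its result time). It assumes the SSN-ext terms `ObservationCollection` and `hasMember` in the `http://www.w3.org/ns/ssn/ext/` namespace, so check the spec for the exact names. The `http://example.org/nasa/` identifiers, the timestamps and the `effective_value` helper are hypothetical and not part of any vocabulary.

```python
# Sketch: a homogeneous collection of 4 scenes sharing sensor, property and feature.
SOSA = "http://www.w3.org/ns/sosa/"
SSN_EXT = "http://www.w3.org/ns/ssn/ext/"  # assumed SSN-ext namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/nasa/"  # hypothetical namespace


def build_collection(name, sensor, prop, feature, scene_times):
    """Return triples for a collection whose members differ only by result time."""
    coll = EX + name
    triples = {
        (coll, RDF_TYPE, SSN_EXT + "ObservationCollection"),
        (coll, SOSA + "madeBySensor", EX + sensor),
        (coll, SOSA + "observedProperty", EX + prop),
        (coll, SOSA + "hasFeatureOfInterest", EX + feature),
    }
    for i, time in enumerate(scene_times, start=1):
        obs = f"{coll}/scene{i}"
        triples |= {
            (coll, SSN_EXT + "hasMember", obs),
            (obs, RDF_TYPE, SOSA + "Observation"),
            (obs, SOSA + "resultTime", time),  # literal kept as a plain string here
        }
    return triples


def effective_value(triples, observation, predicate):
    """Hypothetical helper: a member's own value, else the one on its collection."""
    for s, p, o in triples:
        if s == observation and p == predicate:
            return o
    for coll, p, member in triples:
        if p == SSN_EXT + "hasMember" and member == observation:
            for s, p2, o in triples:
                if s == coll and p2 == predicate:
                    return o
    return None


# Made-up timestamps, 5 minutes apart, standing in for four scenes.
times = ["2020-01-01T00:00:00Z", "2020-01-01T00:05:00Z",
         "2020-01-01T00:10:00Z", "2020-01-01T00:15:00Z"]
triples = build_collection("modis-terra-snow", "MODIS-Terra", "snowCover",
                           "EarthSurface", times)
print(effective_value(triples, EX + "modis-terra-snow/scene3", SOSA + "madeBySensor"))
```

Scaling from 4 scenes to millions adds only a membership, a type and a time statement per scene, while the discovery-relevant terms stay on the single collection node.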

dr-shorthair commented 1 month ago

@rduerr please note that Observation-collections, Actuation-collections and Sample-collections have been added to the SSN Ontology, largely following the patterns from ssn-ext and OMS.

Please look at the latest editor's draft https://w3c.github.io/sdw-sosa-ssn/ssn/#Collections
and use the latest versions of the ontology.

Any feedback that could improve it would be helpful.

sgrellet commented 1 month ago

@rduerr Two interesting publications, which I suppose are linked (one person in common on the IRIT side). Note: IRIT is in Toulouse and often works with CNES (thus on ESA-related projects).

An attempt to model satellite data (SENTINEL) using ontologies (including SOSA). It is in French (@fr), but the diagrams and an example should give you the idea.

I wish I had more time to dig into this.

dr-shorthair commented 1 month ago

Ping @rduerr - if no response by 2024-06-14 I propose to close this issue

rduerr commented 1 month ago

@dr-shorthair Gosh, I wish these changes had come out while I was still working on that NASA SMD project! They totally aren't relevant to any project I am working on now. Though, as you might suspect, they will be useful for any large satellite project, especially those that have some form of repeat or stationary orbit. I took a brief gander at the updated editor's draft and, while I really can't afford to think about it in great detail, I can say that it looks good - at least a great start!

dr-shorthair commented 1 month ago

Oh well - next time, eh?