opengeospatial / sensorthings

The official web site of the OGC SensorThings API standard specification.

Clock synchronisation between Sensors/Things #47

Open abourdon opened 6 years ago

abourdon commented 6 years ago

Hi,

We have a use case with more than two Sensors sending Observations about the same ObservedProperty. To get a global view of all Observations about this ObservedProperty, we then have to consolidate all the Sensors' Observations.

But to do that correctly, we have to make sure that all Sensors (or associated Things) share the same clock. Only then can we sort Observations by their phenomenonTimes or resultTimes; otherwise, it is difficult to get a temporal view of our data.

A possible solution to our problem would be to have a reference time for each Observation. Thus, whatever the time offset between the Sensors' Observations, we would be able to synchronize them against this reference time.
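For illustration, here is a minimal Python sketch (with invented sensor data; `createdAt` here stands for the hypothetical server-assigned reference time) showing how sorting on such a shared reference time gives a consistent order even when the sensors' own clocks disagree:

```python
from datetime import datetime

# Hypothetical observations from two unsynchronised sensors. phenomenonTime
# reflects each sensor's (possibly skewed) local clock; createdAt stands for
# the proposed server-assigned reference time.
observations = [
    {"sensor": "A", "phenomenonTime": "2018-03-17T07:00:05Z",
     "createdAt": "2018-03-17T07:00:02Z"},
    {"sensor": "B", "phenomenonTime": "2018-03-17T06:59:58Z",
     "createdAt": "2018-03-17T07:00:01Z"},
]

def parse(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp with a Zulu suffix."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Sorting on the shared reference time yields a consistent global order.
merged = sorted(observations, key=lambda o: parse(o["createdAt"]))
print([o["sensor"] for o in merged])  # → ['B', 'A']
```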

A simple implementation of this solution could be to add a createdAt (or whatever name you prefer) Observation property that is set before any Observation is sent to our SensorThings service. This would require an ad-hoc mechanism (any approach could be imagined) that sets this reference time every time we send an Observation.

However, this implementation depends on that ad-hoc mechanism, so anyone facing this issue would need to build this additional layer themselves, which could be a drag on the adoption of the SensorThings API.

A better implementation could be to make this createdAt value an Observation attribute (handled by the SensorThings API model, and no longer an implementation-specific property). This way, this value could be set either:

What do you think about it? Maybe I'm totally wrong and there is another way to do it, which is why I would be happy to hear your opinion on it :-)

Regards, Aurélien

mjacoby commented 6 years ago

To my understanding, clock synchronisation is well beyond what the standard covers, and probably beyond what it should cover.

However, there might be a simple solution to your problem. According to the standard, Observation has a property phenomenonTime that describes the time instant or period when the observation happens. If it is not set when creating an observation, the server assigns the current server time (see the description in Table 18: http://docs.opengeospatial.org/is/15-078r6/15-078r6.html#31).

So basically this is exactly what you propose with the createdAt property, isn't it?
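The server-side defaulting that Table 18 describes could be sketched roughly like this (Python; `receive_observation` is an illustrative name, not an actual server API):

```python
from datetime import datetime, timezone

def receive_observation(payload: dict) -> dict:
    """Apply the Table 18 rule: if the client omits phenomenonTime,
    the server assigns its own current time on creation."""
    obs = dict(payload)  # do not mutate the caller's payload
    obs.setdefault("phenomenonTime", datetime.now(timezone.utc).isoformat())
    return obs

# A payload without phenomenonTime gets stamped with server time;
# a client-supplied value is kept untouched.
stamped = receive_observation({"result": 21.5})
kept = receive_observation({"result": 21.5,
                            "phenomenonTime": "2018-03-17T07:00:00Z"})
```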

abourdon commented 6 years ago

@mjacoby, yes, but using phenomenonTime alone is not sufficient. Some Sensors may already set this value, while others do not. By using a new attribute, we can address both cases.

I would like a more general solution that addresses this issue regardless of a Sensor's capabilities.

abourdon commented 6 years ago

@mjacoby The question is: if this is a common issue when using the model, wouldn't it be worth addressing it in the model itself instead of letting users develop their own solutions?

liangsteve commented 6 years ago

@abourdon Thanks for providing the background of the potential issue. Can you please write a precise description or a definition of this attribute you propose? It will help the SWG to understand the proposed attribute better. Thanks.

abourdon commented 6 years ago

@liangsteve To sum up my proposal I would say:

Create a new Observation attribute, named createdAt, that corresponds to the time this Observation was created on the SensorThings server side. This value must be set by the SensorThings server itself, using the current server time.

Thinking about this further, I don't think it is necessary for a client to be able to override it. This way, the value is never provided by the client, but always generated by the SensorThings server and available when requesting Observations.

This way, it could solve my issue without being too specific about it (and be more in line with the standard's vision, @mjacoby).

What do you think?
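A minimal sketch of this proposal (hypothetical server-side helper; createdAt is not part of the current specification): the server stamps createdAt itself and discards any client-supplied value.

```python
from datetime import datetime, timezone

def create_observation(payload: dict) -> dict:
    """Stamp the proposed createdAt attribute with server time,
    ignoring any value the client may have sent."""
    obs = {k: v for k, v in payload.items() if k != "createdAt"}
    obs["createdAt"] = datetime.now(timezone.utc).isoformat()
    return obs

# The client's createdAt is discarded and replaced by the server's clock.
obs = create_observation({"result": 42, "createdAt": "1999-01-01T00:00:00Z"})
```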

hylkevds commented 6 years ago

What would happen to this time instant when the entity is updated?

abourdon commented 6 years ago

@hylkevds, in this case the createdAt attribute would remain unchanged. We could think about a secondary attribute, updatedAt or lastUpdatedAt (for instance), that would contain the last time the entity was updated.

And since we are talking about entities, we could also generalize these createdAt and lastUpdatedAt attributes to all entities, as they are not restricted to Observations.

mjacoby commented 6 years ago

@abourdon I get your point, and without deeper analysis adding the properties createdAt and updatedAt looks like a viable option. The only real question to me is whether this is such essential functionality that it should be included in the standard, or whether we should try to keep the standard as simple and minimalistic as possible. There are lots of viable application scenarios where these properties are not needed, and in these cases we would "pollute" every GET response for an entity with two additional fields. If we keep adding properties to entities in the future, this will become more and more of a problem, i.e. slowing down performance and making the standard complex.

Therefore, I would propose changing the standard to explicitly focus on extensibility. This could be done either by allowing additional fields in JSON instead of failing (which is the currently defined behaviour), or by adding a properties attribute holding a list of key-value pairs to all entities (like our FROST-Server implementation does).

This way it would be possible to easily adjust an existing server implementation, for example to provide the additional properties you suggested.
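The "ignore unknown properties instead of failing" behaviour could look roughly like this on the server side (Python sketch; the set of known fields is an illustrative subset, not the full Observation model):

```python
import json

# Illustrative subset of the standard Observation fields.
KNOWN_FIELDS = {"result", "phenomenonTime", "resultTime", "parameters"}

def parse_entity(raw: str) -> dict:
    """Tolerant parsing: unknown JSON members are silently dropped
    instead of causing the request to be rejected."""
    data = json.loads(raw)
    return {k: v for k, v in data.items() if k in KNOWN_FIELDS}

entity = parse_entity('{"result": 3, "vendorExtension": "xyz"}')
print(entity)  # {'result': 3}
```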

abourdon commented 6 years ago

@mjacoby I totally agree to keep the core model as simple as possible.

And to go further with your idea of extension, we could define a set of optional model properties that a server implementation could enable via a specific request option or a specific server configuration. If an optional model property makes sense regardless of the server implementation, we could include it in the SensorThings API specification so that every server implementation follows the same naming convention.

As a direct application, we could treat our two createdAt and updatedAt properties as optional model properties that are activated when a specific request option or server configuration is set. As these optional model properties are generic enough, and very close to the definition of a persisted entity, we could imagine adding them to the SensorThings API specification.

For instance, based on FROST-Server configuration, it could be defined like this:

<context-param>
  <description>Display additional entity's timestamps as creation and update time instants. </description>
  <param-name>entity.EnableTimestamps</param-name>
  <param-value>true</param-value>
</context-param>

This way, any time we get a SensorThings entity, we also get its associated createdAt and updatedAt attributes (exposed, for instance, as @iot.createdAt and @iot.updatedAt properties):

{
  "@iot.id": 1,
  "@iot.selfLink": "http://example.org/v1.0/Things(1)",
  "@iot.createdAt": "2018-03-17T07:00:00+01:00",
  "@iot.updatedAt": "2018-04-19T08:00:00+01:00",
  "Locations@iot.navigationLink": "Things(1)/Locations",
  "Datastreams@iot.navigationLink": "Things(1)/Datastreams",
  "HistoricalLocations@iot.navigationLink": "Things(1)/HistoricalLocations",
  "name": "Oven",
  "description": "This thing is an oven.",
  "properties": {
    "owner": "Noah Liang",
    "color": "Black"
  }
}
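Assuming such a response, a client could consume the two hypothetical timestamps like this, e.g. to compute the time between creation and last update:

```python
from datetime import datetime

# Subset of the example response above (hypothetical @iot.createdAt
# and @iot.updatedAt annotations).
thing = {
    "@iot.id": 1,
    "@iot.createdAt": "2018-03-17T07:00:00+01:00",
    "@iot.updatedAt": "2018-04-19T08:00:00+01:00",
}

created = datetime.fromisoformat(thing["@iot.createdAt"])
updated = datetime.fromisoformat(thing["@iot.updatedAt"])
age = updated - created
print(age.days)  # → 33
```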

What do you think?

mjacoby commented 6 years ago

That would be one possible way of doing it. However, it would be quite inflexible, as it would only allow adding properties that are explicitly mentioned in the standard. I'm rather envisioning something like the OData metadata document, i.e. a file describing the information model used in a concrete server instance. This way it would be more open. Additionally, there could be something like well-known models, which could either be part of the standard or of some best-practice document that is easier to extend.

However, these are quite ground-breaking ideas that would introduce lots of other problems. Therefore, my current proposal for the standard would be to simply allow additional properties in JSON for any entity, and to state that a server should ignore unknown properties rather than fail. This is very straightforward to implement and opens the standard up for easy extension to application needs.

liangsteve commented 6 years ago

One approach is to write an OGC Engineering Report or even better, a Best Practice (with conformance classes), to define a standard way to use it.

abourdon commented 6 years ago

@mjacoby maybe I was not clear about my thoughts. I propose to have 2 types of optional properties:

All of these properties could be activated through a dedicated configuration (a configuration file, a request option, or similar).

mjacoby commented 6 years ago

Ok, so we are basically on the same page about the functionality. Open questions are

KathiSchleidt commented 5 years ago

Short question for clarification - how does createdAt differ from the resultTime?

liangsteve commented 5 years ago

@KathiSchleidt Here is the definition of resultTime according to 7.2.2.3 of OGC 10-004r3 (Observations and Measurements): the attribute resultTime:TM_Instant shall describe the time when the result became available, typically when the procedure (7.2.2.10) associated with the observation was completed. For some observations this is identical to the phenomenonTime; however, there are important cases where they differ.

KathiSchleidt commented 5 years ago

@liangsteve This is why I was thinking in this direction: the act of the sensor providing the measurement to STA sounds a lot like "the time when the result became available". While the result was formally, physically "available" on the sensor earlier, that availability is too constrained to be useful. As we are working with abstractions at various levels (i.e. the Sensors do not report the number of mV measured, but the ObservedProperty that this value acts as a proxy for on a specific sensor), providing the time the result is made publicly available as resultTime would make sense to me (with SOS, this is often the time of publication).

liangsteve commented 5 years ago

@KathiSchleidt STA and SOS follow O&M. As a result, resultTime SHALL follow the definition mentioned above.

The time an Observation is made publicly available (via STA, I assume?) means something different from resultTime as defined in OGC 10-004r3, and should therefore use a different attribute.

hylkevds commented 5 years ago

There are also use cases where there is a significant difference between the resultTime and the createdAt time. Essentially anything that involves laboratory work.

  1. The phenomenonTime is when the water sample was taken, using a boat in the middle of the lake.
  2. The resultTime is when the lab finished the specific analysis. The lab worker then gathers all the results for the analysis order and posts them, in a big envelope, to the Umweltbundesamt.
  3. There, an overworked clerk leaves it on his or her desk for two weeks before finally typing it in, resulting in the createdAt time.

With a bit of luck we might even see this in the BRGM data set :)
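The lab scenario above implies a strict ordering of the three times on a single Observation; a small sketch with invented values:

```python
from datetime import datetime

# Hypothetical lab-analysis observation with all three times distinct.
observation = {
    "phenomenonTime": "2019-05-02T10:15:00Z",  # water sample taken on the lake
    "resultTime": "2019-05-10T16:30:00Z",      # lab analysis completed
    "createdAt": "2019-05-24T09:00:00Z",       # finally entered into the system
}

def t(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp with a Zulu suffix."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# phenomenonTime precedes resultTime, which precedes createdAt.
assert t(observation["phenomenonTime"]) < t(observation["resultTime"]) \
    < t(observation["createdAt"])
```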

KathiSchleidt commented 5 years ago

so createdAt could be understood as the start of the digital object's lifespan. That would make sense to add. While we're on the topic of observation times, what about the times of validation steps? I know that values go through different validation levels at the Umweltbundesamt before they are finally accepted as the governmental truth. However, each of these changes the digital object, so we would need something for the entire set of data curation times; maybe look at ISO 19115?

liangsteve commented 5 years ago

We use different Datastreams for data that goes through different validation levels. Raw data should always be preserved. Sensors (Procedures in O&M) are used to describe the "validation" procedure. We believe this approach not only makes sense, but is also compliant with STA and O&M.