Azure / opendigitaltwins-dtdl

Digital Twins Definition Language
Creative Commons Attribution 4.0 International

[DTDLv3] Restructure Telemetry as a Semantic Type and introduce a generic message/event as a basic type #155

Open thern743 opened 1 year ago

thern743 commented 1 year ago

With the release of extensions in DTDLv3, I'd like to start a discussion about something in DTDL that has bothered me, and never made sense to me, from the start:

Telemetry should not be a basic element type. Telemetry should be a Semantic Type that extends a broader expression around device messaging.

Commands are an abstraction that expresses something sending specific messages to a device (C2D messaging in Azure IoT parlance). Likewise, Telemetry is an abstraction expressing devices sending messages to something else (D2C messaging in Azure IoT parlance). Properties are an abstraction for bi-directional messaging that encapsulates state.

The concept of Telemetry has implied semantics: it's generally numeric time-series data that has no QoS guarantees (delivery or durability).

This is as opposed to other types of messages like a notification or an event.

We can see these abstractions leaking through with the introduction of the Historization and Streaming extensions which, btw, feel somewhat forced and unnatural due to this fundamental "flaw" in the core specification.

It makes much more sense for the specification to define a basic element (capability) type around some kind of "message". From this, multiple Semantic Types can be defined, one of them being Telemetry. I propose Event or Notification as the basic type: not all events/notifications are telemetry, but all telemetry could be considered events/notifications. Message could be considered, but it may be too generic and confusing as to the direction of the messaging.
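As a rough sketch of what this could look like, here is today's DTDL v3 form next to one possible shape of the proposal, written as Python dicts that mirror the JSON-LD. The `Event` type is hypothetical and not part of any DTDL version; the `dtmi:com:example:Thermostat` identifiers and the helpers `types_of` and `telemetry_names` are made up for illustration.

```python
# Sketch only: "Event" is a hypothetical base type, not part of any DTDL version.

# How a Telemetry element is declared today (DTDL v3).
current = {
    "@context": "dtmi:dtdl:context;3",
    "@id": "dtmi:com:example:Thermostat;1",
    "@type": "Interface",
    "contents": [
        {"@type": "Telemetry", "name": "temperature", "schema": "double"},
    ],
}

# How the same interface might look under the proposal: Event is the basic
# element type, and Telemetry is layered on as a semantic (adjunct) type.
proposed = {
    "@context": "dtmi:dtdl:context;3",  # a real revision would need its own context
    "@id": "dtmi:com:example:Thermostat;2",
    "@type": "Interface",
    "contents": [
        {"@type": ["Event", "Telemetry"], "name": "temperature", "schema": "double"},
        {"@type": "Event", "name": "doorOpened", "schema": "string"},
    ],
}


def types_of(element):
    """Return the element's @type as a list, whether it is a string or a list."""
    t = element["@type"]
    return [t] if isinstance(t, str) else list(t)


def telemetry_names(interface):
    """Names of all contents elements that carry the Telemetry type."""
    return [c["name"] for c in interface["contents"] if "Telemetry" in types_of(c)]
```

In this shape, a consumer that only cares about telemetry still finds `temperature` in both models, while `doorOpened` is an event that makes no claim to being telemetry.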

Another discussion could be had about where this puts Commands; however, I do think Commands are more fundamental to device behavior.

I understand that DTDLv3 might not be the revision to consider making these changes, likely because it affects backwards compatibility. I do think it's worth a conversation with the restructuring of Semantic Types and extensions being considered.

rido-min commented 1 year ago

@briancr-ms might have some insights

jrdouceur commented 1 year ago

Is this just an issue of terminology? In other words, if Telemetry had been called something else, would this issue have been raised? My reaction to the issue text above is that the concern stems from an inference about what was intended by the DTDL term Telemetry: "The concept of Telemetry has implied semantics: it's generally numeric time-series data that has no QoS guarantees". FWIW, this is not what I had inferred from the term. I had inferred only that it meant dynamic data sent autonomously from a device, which would include notifications, events, alerts, or various other data including but not limited to time series.

thern743 commented 1 year ago

When you say "terminology" that is exactly what is meant by "semantics", as in a semantic type definition. So yes, it is a matter of "terminology" (semantics).

I cannot find any definition of telemetry that doesn't carry some semantics with it around streaming, measurements, instrumentation, values, etc. A generic message, which may be text or any object, by definition may not be telemetry. Event data is not necessarily telemetry either. Nor can I find any other modeling spec or language for devices that doesn't differentiate between messages, events, and telemetry.

Regardless, there should be a way to differentiate messages that can and cannot be dropped or considered approximations due to timeliness. Ex: temperature telemetry data streamed at X interval that may drop some values without side effects, vs a single, critical lifecycle event that must have some kind of delivery guarantee to prevent side effects. I.e., something that occurred vs a measurement at a point in time.
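To make the distinction concrete, here is a minimal sketch of a bounded send queue that treats the two kinds of message differently. Nothing here comes from DTDL or any Azure SDK: `Delivery`, `Message`, and `Outbox` are hypothetical names, and the eviction policy is just one assumption about what "can be dropped" might mean in practice.

```python
from dataclasses import dataclass
from enum import Enum


class Delivery(Enum):
    BEST_EFFORT = "bestEffort"  # may be dropped, e.g. periodic temperature samples
    GUARANTEED = "guaranteed"   # must not be dropped, e.g. a lifecycle event


@dataclass
class Message:
    name: str
    payload: object
    delivery: Delivery


class Outbox:
    """Bounded send queue that sheds only best-effort messages when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []

    def enqueue(self, msg):
        """Queue msg; return False only if a best-effort message was dropped."""
        if len(self.queue) < self.capacity:
            self.queue.append(msg)
            return True
        # Full: evict the oldest best-effort message to make room.
        for i, queued in enumerate(self.queue):
            if queued.delivery is Delivery.BEST_EFFORT:
                del self.queue[i]
                self.queue.append(msg)
                return True
        # Only guaranteed messages are queued. A guaranteed message is admitted
        # over capacity rather than lost; a best-effort one is dropped.
        if msg.delivery is Delivery.GUARANTEED:
            self.queue.append(msg)
            return True
        return False
```

With a capacity of two, queuing two temperature samples and then a shutdown event drops the older sample and keeps the event. A model-level annotation along these lines is what would let a runtime make that choice without guessing.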

I suspect most people would agree with me here. Whether they care enough to update the modeling spec is a different story.

jrdouceur commented 1 year ago

I suppose we could argue about the choice of term, although one of the first hits I found from a quick search seems to suggest a broader meaning:

Definition of Telemetry

Telemetry is the automatic recording and transmission of data from remote or inaccessible sources to an IT system in a different location for monitoring and analysis.

However, since changing the term at this point would be at least a bit problematic from a backward-compatibility perspective, I think the more critical aspect of this issue is what semantics are implied and how we could offer ways to configure narrower semantics for specific applications, as you suggested WRT reliability, timeliness, and ordering. I think this is certainly an issue worth deeper consideration.

thern743 commented 1 year ago

That definition you linked speaks of telemetry in terms of recording, sensors, measurements, etc.

I'm suggesting a forward-compatible synonym for the existing Telemetry element type. I.e., DTDLv3 models may use Telemetry or Event/Message/Notification/whatever we choose. It would come with a caveat that future versions of DTDL would deprecate Telemetry as an element type and move it to an adjunct type. From there we can discuss the semantics around QoS, quantitative/unit requirements, etc. for different types.
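One way a parser could treat the synonym, sketched under the assumption that the new basic type ends up being called `Event` (a placeholder, since no DTDL version defines it): a bare `Telemetry` is read as shorthand for the base type plus the Telemetry adjunct type, so existing models keep working unchanged. The function name `normalize_types` is also made up for illustration.

```python
# Sketch only: "Event" is a placeholder name; no DTDL version defines it.
BASE_TYPE = "Event"
LEGACY_TYPE = "Telemetry"


def normalize_types(element):
    """Return a copy of a contents element with the legacy form expanded.

    A bare "Telemetry" (alone or alongside other co-types) is rewritten so the
    hypothetical base type comes first. Other elements keep their types and
    only have @type coerced to a list.
    """
    raw = element["@type"]
    types = [raw] if isinstance(raw, str) else list(raw)
    if LEGACY_TYPE in types and BASE_TYPE not in types:
        types.insert(0, BASE_TYPE)
    return {**element, "@type": types}
```

Existing models that say `"@type": "Telemetry"` and new models that say `"@type": ["Event", "Telemetry"]` would then normalize to the same thing.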

thern743 commented 1 year ago

Referencing #43