eclipse-sparkplug / sparkplug


timestamp accuracy #232

Open pietercardoen opened 2 years ago

pietercardoen commented 2 years ago

The current Sparkplug specification requires timestamps to represent time since the epoch in milliseconds. In our application we are running into accuracy issues with these timestamps because we have high-rate data. In my opinion, the timestamp format could be improved.

It would be great to have unlimited accuracy for metric timestamps. I suggest improving this by letting each node define a scale factor and an offset (for example in the BIRTH message) and reporting raw counter values as metric timestamps.

A timestamp could then be calculated as timestamp_ms = (timestamp / scale_factor) + offset. As a result of this scheme, messages can be slightly shorter and time accuracy is virtually unlimited.
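For illustration, a host application could reconstruct the millisecond timestamp roughly as follows (a minimal sketch; the names and values are placeholders, not proposed field names):

```java
// Illustrative sketch only: reconstruct a millisecond timestamp from a raw
// device counter value using a per-node scale factor and offset (for example
// published in the BIRTH message). All names and values are placeholders.
double scaleFactor = 1_000.0;        // hypothetical: counter ticks per millisecond
long offsetMs = 1_700_000_000_000L;  // hypothetical: absolute offset, ms since the epoch
long rawCounter = 42_000L;           // counter value carried as the metric timestamp

long timestampMs = (long) (rawCounter / scaleFactor) + offsetMs; // offsetMs + 42 ms
```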

What is your opinion on this matter?

I-Campbell commented 2 years ago

I am not sure I like the scale factor. Some implementations might not scale correctly. For example, if the scale factor is 0.1, is a value of 1 billion converted to 100,000,000, or to 100,000,001.49 due to 32-bit float quantization?
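A quick Java illustration of the quantization concern (values are just for demonstration):

```java
// 0.1 cannot be represented exactly as a 32-bit float; the nearest value is
// 0.100000001490116119384765625, so scaling 1 billion by it does not give
// exactly 100,000,000.
float scale = 0.1f;
double scaled = 1_000_000_000L * (double) scale;
System.out.println(scaled); // prints 1.0000000149011612E8, i.e. ~100,000,001.49
```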

I think timestamps should be in nanoseconds since 1970, similar to the definition of LDATE_AND_TIME in IEC 61131-3.
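For reference, the proposed unit can be obtained in Java along these lines (just an illustration of nanoseconds since the epoch, not spec text):

```java
// Nanoseconds since 1970-01-01T00:00:00Z, the unit used by LDATE_AND_TIME in
// IEC 61131-3. A signed 64-bit nanosecond counter overflows in the year 2262.
java.time.Instant now = java.time.Instant.now();
long nanosSinceEpoch = now.getEpochSecond() * 1_000_000_000L + now.getNano();
```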

I agree on the offset to cut down on the number of bytes needed to encode. So:

- Payload.timestamp_offset becomes the absolute date and time at which the message first started to be generated (typically, the time the first metric value change was recorded).
- Payload.timestamp_delta is the difference, in nanoseconds, between the time the message was sent and Payload.timestamp_offset.
- Payload.Metric.timestamp_delta is the difference, in nanoseconds, between the time the value was recorded and Payload.timestamp_offset.

The timestamp_deltas are optional. If they are not included, the calculated timestamp is equal to Payload.timestamp_offset.
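For illustration, a consuming application might resolve the absolute timestamps under this scheme roughly as follows (a sketch using the proposed names, not the current spec):

```java
// Illustrative only: resolving absolute metric timestamps under the proposed
// offset/delta scheme. Names follow the proposal above, not the current spec.
long payloadTimestampOffsetNs = 1_700_000_000_000_000_000L; // absolute, ns since the epoch
Long metricTimestampDeltaNs = 1_500_000L;                   // optional delta, ns after the offset

long metricTimestampNs = (metricTimestampDeltaNs != null)
        ? payloadTimestampOffsetNs + metricTimestampDeltaNs
        : payloadTimestampOffsetNs; // no delta => timestamp equals the offset
```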

As this is a breaking change, I would save it for a future release.

pietercardoen commented 2 years ago

Thanks for receiving the idea so positively. This is indeed a major change, and I understand it could only be included in a future release.

I must say I'm not completely convinced by my own proposal either. The reason I came up with a scale factor is that it would be beneficial to use CPU counters instead of absolute timestamps. This is a major benefit for tiny embedded devices, as they don't always have a true time source.

I agree that a floating-point scale factor is bad practice due to the limited floating-point precision. An alternative could be an integer divider and multiplier.
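As a sketch of that alternative (the multiplier, divider, and counter values here are made up):

```java
// Hypothetical sketch: convert raw CPU counter ticks to elapsed nanoseconds
// with an integer multiplier and divider instead of a float scale factor.
long ticks = 123_456_789L;         // raw counter value from the device
long multiplier = 1_000_000_000L;  // nanoseconds per second
long divider = 25_000_000L;        // counter frequency in Hz (25 MHz, made up)

// Multiply before dividing to keep precision; use BigInteger or a wider
// intermediate if ticks * multiplier could overflow a signed 64-bit long.
long elapsedNs = ticks * multiplier / divider;
```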

I do note a major difference between your proposal and mine: I would use the BIRTH message to define a fixed offset for a specific node and leave that offset unchanged for the entire session, whereas in your proposal the offset might be specified in every message. A good compromise might be to make the offset optional; if it is not specified, the last specified offset must be used, as sketched below.
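In code, the receiving side would then simply carry the offset forward, for example (illustrative names only):

```java
// Illustrative only: if a payload carries no offset, fall back to the last
// offset seen for that node (typically the one from the BIRTH message).
Long payloadOffsetNs = null;                         // offset field of this payload, may be absent
long lastKnownOffsetNs = 1_700_000_000_000_000_000L; // remembered from BIRTH / earlier payloads

long effectiveOffsetNs = (payloadOffsetNs != null) ? payloadOffsetNs : lastKnownOffsetNs;
```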

I-Campbell commented 2 years ago

If the Sparkplug protobuf definition switched to the well-known types Duration and Timestamp, then the built-in functions in the Java protobuf library for converting to JSON would be a lot neater.

Well-known types: https://developers.google.com/protocol-buffers/docs/reference/google.protobuf

com.google.protobuf.util.JsonFormat.Printer.print(): https://developers.google.com/protocol-buffers/docs/reference/java
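For illustration, a minimal sketch of what the well-known Timestamp type plus the Java JsonFormat printer provide (assuming protobuf-java and protobuf-java-util are on the classpath; the message here is just a bare Timestamp, not a Sparkplug payload):

```java
import com.google.protobuf.Timestamp;
import com.google.protobuf.util.JsonFormat;
import com.google.protobuf.util.Timestamps;

public class TimestampJsonDemo {
    public static void main(String[] args) throws Exception {
        // Build a well-known Timestamp from milliseconds since the epoch.
        Timestamp ts = Timestamps.fromMillis(System.currentTimeMillis());

        // JsonFormat maps well-known types to their canonical JSON form, so a
        // Timestamp prints as an RFC 3339 string such as "2024-01-01T12:00:00Z".
        System.out.println(JsonFormat.printer().print(ts));
    }
}
```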