msgpack / msgpack

MessagePack is an extremely efficient object serialization library. It's like JSON, but very fast and small.
http://msgpack.org/

Timekeeping is hard, part I: MessagePack does not understand leap seconds #240

Open ghost opened 6 years ago

ghost commented 6 years ago

The MessagePack specification currently mentions the following:

  • Timestamp 32 format can represent a timestamp in [1970-01-01 00:00:00 UTC, 2106-02-07 06:28:16 UTC) range. Nanoseconds part is 0.
  • Timestamp 64 format can represent a timestamp in [1970-01-01 00:00:00.000000000 UTC, 2514-05-30 01:53:04.000000000 UTC) range.
  • Timestamp 96 format can represent a timestamp in [-584554047284-02-23 16:59:44 UTC, 584554051223-11-09 07:00:16.000000000 UTC) range.
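The first two upper bounds can be reproduced by adding 2^32 and 2^34 seconds to the epoch with a calendar that ignores leap seconds. A minimal Python sketch, assuming the fixed 86,400-second days that the standard `datetime` module uses; the names `EPOCH` and `DAY` are illustrative only, and the Timestamp 96 bounds are left out because they fall outside what `datetime` can represent:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
DAY = 86_400  # seconds per day when leap seconds are ignored

# 2^32 seconds split into calendar pieces: 33 ordinary 4-year cycles
# (1461 days each), one 4-year cycle without a leap day (1460 days,
# because 2100 is not a leap year), plus 37 days 06:28:16.
days = 33 * 1461 + 1 * 1460 + 37
remainder = 6 * 3600 + 28 * 60 + 16
total = days * DAY + remainder
print(total == 2**32)

# Exclusive upper bounds of Timestamp 32 and Timestamp 64 (34-bit seconds).
bound32 = EPOCH + timedelta(seconds=2**32)
bound64 = EPOCH + timedelta(seconds=2**34)
print(bound32.isoformat())
print(bound64.isoformat())
```

Because `datetime` has no notion of leap seconds, agreement with the quoted bounds only shows that they were computed the same way.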

From the time range given for the Timestamp 32 format, it is obvious that leap seconds were not considered at all (2^32 seconds = 33 normal 4-year cycles + 1 leap-day-free 4-year cycle + P37DT06H28M16S). I expect that the other ranges are equally inaccurate. This can't be fixed by simply changing the timestamp range. For instance, what are the correct Timestamp 32 values for the following instants (denoted according to RFC 3339)?

  1. 1972-06-30T23:59:59Z
  2. 1972-06-30T23:59:60Z
  3. 1972-07-01T00:00:00Z

From this example, which is the first occurrence of a leap second in UTC, it should be obvious that every instant after 1972-06-30T23:59:59Z is ambiguous, with two possible interpretations that are (for recent timestamps) almost thirty seconds apart.
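To make the collision concrete, here is a small Python sketch that applies the "Seconds Since the Epoch" formula from POSIX, which treats every day as exactly 86,400 seconds, to the three instants listed. The helper `posix_seconds` is hypothetical and written only for this illustration; it is not part of MessagePack or of any library.

```python
from datetime import date

def posix_seconds(year, month, day, hour, minute, second):
    """Seconds since the Epoch per the POSIX formula; `second` may be 60."""
    tm_year = year - 1900
    # POSIX tm_yday is the 0-based day of the year.
    tm_yday = date(year, month, day).timetuple().tm_yday - 1
    # Python's // matches C integer division here as long as year >= 1970.
    return (second + minute * 60 + hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)

before = posix_seconds(1972, 6, 30, 23, 59, 59)
leap = posix_seconds(1972, 6, 30, 23, 59, 60)
after = posix_seconds(1972, 7, 1, 0, 0, 0)
print(before, leap, after)
```

Under this formula the leap second and the following midnight receive the same value, so the encoding cannot tell instants 2 and 3 apart.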

As a resolution, I suggest specifying timestamps in reference to a simpler timescale, such as TAI (International Atomic Time), which is what UTC is based on. This should also solve the related problem that timestamps prior to 1972-01-01 UTC are ambiguous due to common ignorance of the definition of UTC.

dchenk commented 6 years ago

@rhymoid do you think this ambiguity would be solved by specifying that seconds and nanoseconds count from the Epoch, as defined in Unix (POSIX) time?

I think actually your concern can be solved two different ways, without complicating much; either

Edit: The second bullet point above suggests that a historical leap seconds table would be necessary for converting a Unix timestamp to a timestamp in a civil calendar; that would be a sad design. My PR #245 clarifies this point.

methane commented 6 years ago

I think leap smear is a common strategy for handling leap seconds.
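For readers unfamiliar with the term: a leap smear spreads the inserted second over a longer window by running the clock slightly slow, so no timestamp ever reads 23:59:60. A rough Python sketch of a linear smear, assuming a 24-hour window; the window length and the name `smeared_elapsed` are assumptions made for illustration, and real deployments differ in both window and shape.

```python
def smeared_elapsed(real_elapsed: float, window: float = 86_400.0) -> float:
    """Seconds shown by a smeared clock after `real_elapsed` SI seconds.

    The smear window lasts `window + 1` SI seconds (it contains one
    inserted leap second), but the smeared clock advances by only
    `window` seconds over that span.
    """
    real_window = window + 1.0
    # Clamp so the function only describes the inside of the smear window.
    clamped = min(max(real_elapsed, 0.0), real_window)
    return clamped * window / real_window

start = smeared_elapsed(0.0)
halfway = smeared_elapsed(43_200.5)   # half of the real window has passed
end = smeared_elapsed(86_401.0)       # the whole leap second is absorbed
print(start, halfway, end)
```

With a smear, a plain Unix-style counter stays monotonic, at the cost of seconds inside the window being slightly longer than SI seconds.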

TAI is not as common as Unix timestamps based on UTC. Please note that msgpack is a cross-language format, like JSON. Accuracy that is only achievable in a limited set of languages or environments is not our goal. You can use a more specialized format for that.

So I prefer keeping the current Unix-time-based spec.