justintv / Twitch-API

JSON timestamps returned by Twitch API sometimes contain fractional seconds #627

Closed jfietkau closed 4 years ago

jfietkau commented 7 years ago

Description

Timestamps issued by the Twitch API are used for events such as account registration dates, stream starting times, or follow dates. Within the JSON format returned by API calls, they are ISO 8601 compliant.

Example:

"created_at":"2016-10-18T10:56:44Z"

On most API calls, timestamps are returned without fractional seconds, like in the example above. It makes intuitive sense to me that this level of precision should be suitable for Twitch API clients.

On a small subset of API calls, the timestamps will, however, include microseconds. Example:

"created_at":"2016-10-18T10:56:44.448537Z"

As most client-side libraries will read missing microseconds as zero, these two different representations of the same internal timestamp are highly likely to be interpreted as distinct timestamps on the client side.
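To illustrate the mismatch, here is a minimal sketch using only the Python standard library. The helper name parse_twitch_timestamp is hypothetical and not part of any Twitch client library; it assumes every timestamp ends in "Z".

```python
from datetime import datetime

def parse_twitch_timestamp(value):
    """Parse a UTC timestamp with or without fractional seconds (hypothetical helper)."""
    # Pick the format string depending on whether a fractional part is present.
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt)

# The two representations from the examples above.
coarse = parse_twitch_timestamp("2016-10-18T10:56:44Z")
fine = parse_twitch_timestamp("2016-10-18T10:56:44.448537Z")

print(coarse.microsecond)  # 0: the missing fraction is read as zero
print(coarse == fine)      # False: the client sees two different timestamps
```

Any client logic that compares, deduplicates, or keys on these values would treat the two results as different events.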

It appears that the timestamp format is consistent per API call, i.e. for any given JSON document returned by the API, either all timestamps will contain fractional seconds or none of them will.

Comparing API call results containing microseconds to ones for the same event containing no microseconds, it can be observed that the timestamps are cropped, not rounded, e.g. "2016-10-14T20:05:09.884523Z" turns into "2016-10-14T20:05:09Z".
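The truncation claim can be checked against the example values with a short sketch like the following. It uses only the standard library, and the variable names are my own.

```python
from datetime import datetime, timedelta

precise = datetime.strptime("2016-10-14T20:05:09.884523Z", "%Y-%m-%dT%H:%M:%S.%fZ")
coarse = datetime.strptime("2016-10-14T20:05:09Z", "%Y-%m-%dT%H:%M:%SZ")

# Truncation simply drops the fractional part.
truncated = precise.replace(microsecond=0)
# Rounding to the nearest second: add half a second, then drop the fraction.
rounded = (precise + timedelta(microseconds=500000)).replace(microsecond=0)

print(truncated == coarse)  # True: matches the coarse value returned by the API
print(rounded == coarse)    # False: rounding would have given 20:05:10
```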

Only a small fraction of API call results have timestamps containing microseconds. In my test environment (see "How to reproduce") it took until the 50th API call in a row before I got a result that exhibited the issue. The actual rate of occurrence may be even lower than one in 50.

The first documented instance of microsecond timestamps returned to my client software happened on 2016-10-13 at 23:29 UTC. My client immediately started showing issues resulting from the inconsistent timestamps, thus I strongly believe that the issue only started occurring at around that time and not much earlier.

Expected behavior

Timestamps for the same event retrieved via the Twitch API should be identical between independent API calls.

More broadly speaking, any deterministic API call result, with the underlying data being unchanged, should be identical between independent API calls.

Actual behavior

Timestamps are returned by the API at two different levels of precision.

Because this can lead to differing timestamps for the same event, client applications may exhibit erratic behavior if they assume that a timestamp for the same event never changes between API calls.

How to reproduce

The easiest way to see the problem is to set up a simple API call in a loop and store the results for later comparison. Here's how I did it for the purpose of this issue report:

#!/bin/bash
while true; do
  # Quote the URL so the shell does not treat "?" as a glob character.
  curl -H "Client-ID: [client ID here]" "https://api.twitch.tv/kraken/channels/twitch/follows?limit=1" > "$(date -Iseconds).txt"
  sleep 300
done

The above shell script retrieves the newest follower to the "Twitch" channel every five minutes and saves the API result to a text file named with the current system time. These files can then be examined by hand. After some time, a result containing microsecond timestamps should be among them.
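Instead of examining the files by hand, the saved results could be scanned with a small script. This is only a sketch: the function names are hypothetical, and it assumes the .txt files written by the loop above are in the given directory and that affected timestamps appear as quoted strings ending in "Z".

```python
import re
from pathlib import Path

# Matches a quoted timestamp that has a fractional-seconds part, e.g.
# "2016-10-18T10:56:44.448537Z"
FRACTIONAL_TS = re.compile(r'"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z"')

def has_fractional_timestamps(text):
    """Return True if the text contains at least one fractional-second timestamp."""
    return FRACTIONAL_TS.search(text) is not None

def find_affected_files(directory="."):
    """Return the names of saved .txt results that contain fractional timestamps."""
    return sorted(
        path.name
        for path in Path(directory).glob("*.txt")
        if has_fractional_timestamps(path.read_text(errors="replace"))
    )
```

Calling find_affected_files() in the directory where the results were saved returns the names of the files that exhibit the issue.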

Mitigation of client-side issues

Even though it took me several days to diagnose the root cause, the client-side fix for this issue was very simple. Here is the one line of Python code that I added to my client (the line marked with "+"):

    def json_datetime_to_naive_utc(self, timestamp):
        ts = dateutil.parser.parse(timestamp)
        ts = ts.astimezone(pytz.timezone('UTC'))
        ts = ts.replace(tzinfo = None)
+       ts = ts.replace(microsecond = 0)
        return ts

This has resolved all issues on my end. However, I believe that this is a problem with the API that, in the long term, should be fixed server side.
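For clients that do not use dateutil or pytz, the same normalization could be sketched with the standard library alone. This is not the code from my client; it assumes that every timestamp is in UTC and ends in "Z", as in all examples in this report.

```python
from datetime import datetime

def json_datetime_to_naive_utc(timestamp):
    """Parse a 'Z'-suffixed UTC timestamp and drop any fractional seconds."""
    # Assumption: the input is always UTC with a trailing "Z".
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in timestamp else "%Y-%m-%dT%H:%M:%SZ"
    ts = datetime.strptime(timestamp, fmt)
    # Truncate, matching how the API itself crops the fraction.
    return ts.replace(microsecond=0)

a = json_datetime_to_naive_utc("2016-10-14T20:05:09.884523Z")
b = json_datetime_to_naive_utc("2016-10-14T20:05:09Z")
print(a == b)  # True: both representations now map to the same value
```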