selectel / tempo

NIF-based date and time parsing and formatting for Erlang
MIT License
67 stars 10 forks source link

Tempo ignores UTC offset #12

Open alexdruzhilov opened 8 years ago

alexdruzhilov commented 8 years ago

strptime and strftime supports '%z' field descriptor, but:

tempo:strptime(<<"%z">>, <<"+03:00">>). {ok, 0.0}

tempo:strptime(<<"%H:%M:%S%z">>, <<"11:22:33+03:00">>). {ok,40953.0} tempo:strptime(<<"%H:%M:%S">>, <<"11:22:33">>). {ok,40953.0}

Is it possible to expose timezone/UTC offset data?

superbobry commented 8 years ago

IIRC there were some cross-platform compatibility issues related to %z. I'll have a look into this over the weekend.

alexdruzhilov commented 8 years ago

I use Ubuntu and this code works for me:

#define _XOPEN_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
   struct tm tm;
   char buf[255];

   memset(&tm, 0, sizeof(struct tm));

   strptime("-03:00", "%z", &tm);
   strftime(buf, sizeof(buf), "%z", &tm);

   puts(buf);
   exit(EXIT_SUCCESS);
}

-0300

superbobry commented 8 years ago

tl;dr I think ":" is the problem.

Fact 1: ISO8601 allows UTC offset to be formatted with ":".

Fact 2: strptime Linux man page claims to implement %z in a way compatible with ISO8601. However this is different on OS X

 %z    is replaced by the time zone offset from UTC; [...] hours and minutes follow 
       with two digits each **and no delimiter between them**.

Fact 3: on OS X the call to strptime in your C snippet returns NULL, which indicates an error.

If strptime() fails to match all of the format string and therefore an error occurred, the function returns NULL.

Otherwise, the %z in tempo seems to work fine.

1> tempo:strptime(<<"%z">>, <<"+0300">>).  % My TZ.
{ok,0.0}
2> tempo:strptime(<<"%z">>, <<"+0400">>).
{ok,-3.6e3}

My guess is that +0300 is your local time zone, that's why parsing <<"+03:00">> yields 0.

alexdruzhilov commented 8 years ago

+03:00 is not my timezone.

tempo:strptime(<<"%z">>, <<"+0300">>).
{ok,0.0}

tempo:strptime(<<"%z">>, <<"+0400">>).
{ok,0.0}

Also I tried timezones with and without ":" and result is the same: strptime shows me proper timezone, but tempo is not. I tried to find a root of the problem and rewrite code on c. My thoughts: 1) In linux timezone stored in structure tm inside __tm_gmtoff field 2) tempo.c tries to convert tm structure in unix timestamp with function timegm https://github.com/selectel/tempo/blob/master/c_src/tempo.c#L157 3) timegm ignores timezone and moreover sets tm.__tm_gmtoff to zero

From my point of view, it's worth to save all data from tm structure and expose it in erlang instead of convertion to unix timestamp (because unix timestamp not worried about timezones). Any way in erlang it's easy to convert tm to any possible timestamp.

superbobry commented 8 years ago

The libc version used in BSD and OS X doesn't work with the nonstandard tm_gmtoff field and instead directly subtracts the offset from tm_hour and tm_min. Linux does exactly the opposite. And I haven't even looked at Windows (which we don't target currently anyway). With this in mind, I'd rather not expose tm struct to the user and leave it internal to tempo.

As for the original issue, I'm not yet sure how to best address it. If you have any ideas, please feel free to share them.