mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License
1.5k stars 65 forks source link

DATETIME reports incorrect time zone when casting string to datetime. #16

Closed derekmahar closed 4 years ago

derekmahar commented 4 years ago

csvq 1.11.6 date function DATETIME reports the incorrect time zone "-04:00" instead of "-05:00" when casting string "2016-04-14 09:03:00" to a datetime:

$ date --iso-8601=seconds
2019-11-13T00:02:35-05:00
$ date '+%z'
-0500
$ csvq "SELECT DATETIME('2016-04-14 09:03:00') AS DateTime"
+----------------------------+
|          DateTime          |
+----------------------------+
| 2016-04-14T09:03:00-04:00  |
+----------------------------+
$ csvq --datetime-format "%Y-%m-%d %H:%i" "SELECT DATETIME('2016-04-14 09:03') AS DateTime"
+----------------------------+
|          DateTime          |
+----------------------------+
| 2016-04-14T09:03:00-04:00  |
+----------------------------+ 

However, date function NOW reports the correct time zone:

$ csvq "SELECT NOW() AS Now"
+-----------------------------------+
|                Now                |
+-----------------------------------+
| 2019-11-13T00:09:01.273649-05:00  |
+-----------------------------------+

DATETIME does not seem to recognize that during Daylight Saving Time (DST), the time zone in Montreal is Eastern Daylight Time (EDT) or "-04:00", but outside DST, the time zone is Eastern Standard Time (EST) or "-05:00".

Note that I tested this on Ubuntu 18.04.3 LTS in Windows Subsystem for Linux on Windows 10 and Ubuntu Server 18.04.3 LTS in a virtual machine in VirtualBox:

$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
mithrandie commented 4 years ago

Maybe the behavior is a general implementation of date and time. The date command on Ubuntu will give you the following results.

$ date "+%Z"
EST
$ date "+%z"
-0500
$ date --iso-8601=seconds
2019-11-13T00:47:41-05:00
$ date --iso-8601=seconds --date "1 month ago"
2019-10-13T00:48:18-04:00

2016-04-14T09:03:00-04:00 and 2016-04-14T09:03:00-05:00 represent different times on the system, so it is difficult to implement general-purpose conversion logic.

If you are using Ubuntu, you may also be able to use timezone abbreviations.

$ csvq
/home/mithrandie > SET @@TIMEZONE TO "America/New_York"
/home/mithrandie > SELECT DATETIME_FORMAT('2016-04-14 09:03:00', '%Y-%m-%d %H:%i:%s %z') AS Datetime
+--------------------------+
|         Datetime         |
+--------------------------+
| 2016-04-14 09:03:00 EDT  |
+--------------------------+
/home/mithrandie > SELECT DATETIME_FORMAT('2019-11-14 09:03:00', '%Y-%m-%d %H:%i:%s %z') AS Datetime
+--------------------------+
|         Datetime         |
+--------------------------+
| 2019-11-14 09:03:00 EST  |
+--------------------------+
derekmahar commented 4 years ago

I apologize. I was mistaken. DATETIME is behaving correctly. I should not have expected the time zone in Montreal (same as New York) to be the same in April as in November because in April, Daylight Saving Time is active and so the time zone is Eastern Daylight Time (EDT) or "-04:00", but in November, DST is not active and so the time zone is Eastern Standard Time (EST) or "-05:00".

Note that the GNU date command does generate the same result as DATETIME:

$ date --date="2016-04-14 09:03:00" --iso-8601=seconds
2016-04-14T09:03:00-04:00
$ csvq "SELECT DATETIME('2016-04-14 09:03:00') AS DateTime"
+----------------------------+
|          DateTime          |
+----------------------------+
| 2016-04-14T09:03:00-04:00  |
+----------------------------+
$ date --iso-8601=seconds
2019-11-13T12:15:55-05:00
derek@DESKTOP-2F2F59O:~$ csvq "SELECT DATETIME('2019-11-13 12:15:55') AS DateTime"
+----------------------------+
|          DateTime          |
+----------------------------+
| 2019-11-13T12:15:55-05:00  |
+----------------------------+