mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License
1.5k stars 65 forks source link

Datetime function LOCALTIME #43

Closed derekmahar closed 3 years ago

derekmahar commented 3 years ago

Please implement datetime function LOCALTIME that accepts a DATETIME and an optional time zone and translates the DATETIME (which should include a time zone suffix) to the corresponding datetime in the given time zone or in the default time zone if the time zone argument is absent. If the DATETIME does not include a time zone suffix, LOCALTIME would assume the default time zone.

For example, given the table of DATETIME values

+---------------------------+
|       date_and_time       |
+---------------------------+
| 2019-01-17T16:00:00Z      |
| 2019-01-29T03:57:00-05:00 |
+---------------------------+

the query

SELECT LOCALTIME(date_and_time, 'Asia/Tokyo')

would produce the result

+---------------------------+
|       date_and_time       |
+---------------------------+
| 2019-01-18T01:00:00+09:00 |
| 2019-01-29T17:57:00+09:00 |
+---------------------------+

SELECT LOCALTIME(date_and_time) with command line option --timezone Asia/Tokyo or statement SET @@TIMEZONE TO "Asia/Tokyo"; would produce the same result.

Note that SELECT LOCALTIME(date_and_time, 'UTC') would be equivalent to SELECT UTC(date_and_time).

You can verify these datetime conversions using the Unix date command:

$ TZ=Asia/Tokyo date -f %Y-%m-%dT%H:%M:%S%z -j +%Y-%m-%dT%H:%M:%S%z 2019-01-17T16:00:00+0000
2019-01-18T01:00:00+0900
$ TZ=Asia/Tokyo date -f %Y-%m-%dT%H:%M:%S%z -j +%Y-%m-%dT%H:%M:%S%z 2019-01-29T03:57:00-0500
2019-01-29T17:57:00+0900
derekmahar commented 3 years ago

On second thought, the name LOCAL_DATETIME might be more consistent with the names of the other datetime functions.

mithrandie commented 3 years ago

In most cases, such as when you want to compare times in different time zones, you can simply compare in UTC. If you want to get the time in local time, you can get it by converting from Unix time.

$ csvq -z "Asia/Tokyo" "SELECT DATETIME(UNIX_TIME('2019-01-29T03:57:00-05:00'))"
+--------------------------------------------------+
| DATETIME(UNIX_TIME('2019-01-29T03:57:00-05:00')) |
+--------------------------------------------------+
| 2019-01-29T17:57:00+09:00                        |
+--------------------------------------------------+

I can't imagine what situation it would be needed. Can you to tell me specifically what you need to do and for what purpose?

derekmahar commented 3 years ago

Thank you. This solved my time zone conversion problem.

I needed this function in order to convert a column of DATETIME values in UTC to another time zone. I had initially copied these DATETIME values from source data, appended the local time zone, converted them to UTC, and then later decided that I wanted to restore the DATETIME values to their original time zone so that they matched those in the source data for easier comparison. It was easier to run a single conversion query than to copy the original source data.

mithrandie commented 3 years ago

Thank you, maybe I kind of get it.

If implemented, I would prefer to add a second argument to the DATETIME function instead of adding a new function.

DATETIME(value [, timezone])
derekmahar commented 3 years ago

Yes, this is a good idea. I was going to suggest it, actually.

mithrandie commented 3 years ago

This change is included in the version 1.14.0.

$ csvq "SELECT id, date_and_time FROM dt"
+----+---------------------------+
| id |       date_and_time       |
+----+---------------------------+
| 1  | 2021-03-09T12:00:00       |
| 2  | 2021-03-09T12:00:00Z      |
| 3  | 2021-03-09T12:00:00-07:00 |
+----+---------------------------+

$ csvq -timezone "Asia/Tokyo" "SELECT id, DATETIME(date_and_time) FROM dt"
+----+---------------------------+
| id |  DATETIME(date_and_time)  |
+----+---------------------------+
| 1  | 2021-03-09T12:00:00+09:00 |
| 2  | 2021-03-09T12:00:00Z      |
| 3  | 2021-03-09T12:00:00-07:00 |
+----+---------------------------+

$ csvq -timezone "Asia/Tokyo" "SELECT id, DATETIME(date_and_time, 'UTC') FROM dt"
+----+--------------------------------+
| id | DATETIME(date_and_time, 'UTC') |
+----+--------------------------------+
| 1  | 2021-03-09T12:00:00Z           |
| 2  | 2021-03-09T12:00:00Z           |
| 3  | 2021-03-09T19:00:00Z           |
+----+--------------------------------+
derekmahar commented 3 years ago

Excellent! Thank you!