pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.4k stars 5.85k forks source link

Support TIMESTAMP WITH TIMEZONE and without Y2K38 limit #20522

Open ghost opened 4 years ago

ghost commented 4 years ago

Feature Request

Is your feature request related to a problem? Please describe:

This is forked from https://github.com/pingcap/tidb/issues/20348

Describe the feature you'd like:

MySQL has both TIMESTAMP and DATETIME datatypes, which have different behaviors regarding how to handle timezones. I prefer TIMESTAMP, but it has a limitation that the range of values is limited to 32-bit (i.e. it will not be able to handle dates past 2038).

The SQL standard has a type TIMESTAMP WITH TIMEZONE. My request is to create a new data type which:

Describe alternatives you've considered:

The alternative proposal would be to also fix TIMESTAMP to not have a Y2K38 limit, but this may cause compatibility issues with MySQL such as migrations back (although it could be protected via a flag).

Teachability, Documentation, Adoption, Migration Strategy:

It is fairly easy to teach because it is part of the SQL standard. Eventually TIMESTAMP will need to be improved, deprecated, fixed etc. As we get closer to Y2K38 storing as TIMESTAMP will become a similar issue to storing currency in FLOAT columns.

zz-jason commented 4 years ago

The feature request LGTM. But we need to discuss more the priority of this feature request and the improving methods.

dveeden commented 3 years ago

I think TIMESTAMP WITH TIMEZONE would be useful. However as MySQL doesn't have this yet it is difficult to do this in a compatible way. We could look at timestamptz in PostgreSQL and as mentioned before the SQL standard.

However there is an interesting note on this in the PostgreSQL docs: "The SQL standard requires that writing just timestamp be equivalent to timestamp without time zone, and PostgreSQL honors that behavior. timestamptz is accepted as an abbreviation for timestamp with time zone; this is a PostgreSQL extension."

And looks like MySQL and PostgreSQL have different behavior, not sure which follows the standard:

PostgreSQL 14.0:

postgres=# SELECT TIMESTAMP '2021-10-29 10:00:00-07:00';
      timestamp      
---------------------
 2021-10-29 10:00:00
(1 row)

postgres=# SELECT TIMESTAMP WITH TIME ZONE '2021-10-29 10:00:00-07:00';
      timestamptz       
------------------------
 2021-10-29 17:00:00+00
(1 row)

MySQL 8.0.26:

mysql> SELECT TIMESTAMP '2021-10-29 10:00:00-07:00';
+---------------------------------------+
| TIMESTAMP '2021-10-29 10:00:00-07:00' |
+---------------------------------------+
| 2021-10-29 19:00:00                   |
+---------------------------------------+
1 row in set (0.0011 sec)
mysql> SET time_zone='+00:00';
Query OK, 0 rows affected (0.0009 sec)
mysql> SELECT TIMESTAMP '2021-10-29 10:00:00-07:00';
+---------------------------------------+
| TIMESTAMP '2021-10-29 10:00:00-07:00' |
+---------------------------------------+
| 2021-10-29 17:00:00                   |
+---------------------------------------+
1 row in set (0.0008 sec)

And fixing Y2K38 things could be done separately:

diff --git a/expression/builtin_time.go b/expression/builtin_time.go
index dcaac7b85..0c9d7c885 100644
--- a/expression/builtin_time.go
+++ b/expression/builtin_time.go
@@ -1719,9 +1719,6 @@ func evalFromUnixTime(ctx sessionctx.Context, fsp int8, unixTimeStamp *types.MyD
        if err != nil && !terror.ErrorEqual(err, types.ErrTruncated) {
                return res, true, err
        }
-       if integralPart > int64(math.MaxInt32) {
-               return res, true, nil
-       }
        // Split the integral part and fractional part of a decimal timestamp.
        // e.g. for timestamp 12345.678,
        // first get the integral part 12345,

Without this patch:

TiDB> SELECT FROM_UNIXTIME(2147483647);
+---------------------------+
| FROM_UNIXTIME(2147483647) |
+---------------------------+
| 2038-01-19 04:14:07       |
+---------------------------+
1 row in set (0.0124 sec)
TiDB> SELECT FROM_UNIXTIME(2147483648);
+---------------------------+
| FROM_UNIXTIME(2147483648) |
+---------------------------+
| NULL                      |
+---------------------------+
1 row in set (0.0012 sec)

With this patch:

TiDB> SELECT FROM_UNIXTIME(2147483647);
+---------------------------+
| FROM_UNIXTIME(2147483647) |
+---------------------------+
| 2038-01-19 04:14:07       |
+---------------------------+
1 row in set (0.0010 sec)
TiDB> SELECT FROM_UNIXTIME(2147483648);
+---------------------------+
| FROM_UNIXTIME(2147483648) |
+---------------------------+
| 2038-01-19 04:14:08       |
+---------------------------+
1 row in set (0.0013 sec)
kennytm commented 3 years ago

In the SQL standard (ISO 9075-2) the grammar just allows TIMESTAMP 'X'. Both TIMESTAMP WITH TIME ZONE '…' and TIMESTAMPTZ '…' are PostgreSQL extensions.

<timestamp literal> ::=
    TIMESTAMP <timestamp string>
```abnf ::= ::= ::=

In §5.3 <literals>/27 it says

The declared type of a <timestamp literal> that does not specify <time zone interval> is TIMESTAMP(P) WITHOUT TIME ZONE, where P is the number of digits in <seconds fraction>, if specified, and 0 (zero) otherwise. The declared type of a <timestamp literal> that specifies <time zone interval> is TIMESTAMP(P) WITH TIME ZONE, where P is the number of digits in <seconds fraction>, if specified, and 0 (zero) otherwise

Therefore, PostgreSQL is violating the standard here by making TIMESTAMP '2021-10-29 10:00:00-07:00' return a TIMESTAMP WITHOUT TIME ZONE.

postgres> select pg_typeof(TIMESTAMP '2021-10-29 10:00:00-07:00');
timestamp without time zone
dveeden commented 1 month ago

Looks like the Y2K38 issue might have been fixed in #22616 :

sql> SELECT FROM_UNIXTIME(2147483647);
+---------------------------+
| FROM_UNIXTIME(2147483647) |
+---------------------------+
| 2038-01-19 04:14:07       |
+---------------------------+
1 row in set (0.0006 sec)

sql> SELECT FROM_UNIXTIME(2147483648);
+---------------------------+
| FROM_UNIXTIME(2147483648) |
+---------------------------+
| 2038-01-19 04:14:08       |
+---------------------------+
1 row in set (0.0004 sec)

sql> SELECT TIDB_VERSION()\G
*************************** 1. row ***************************
TIDB_VERSION(): Release Version: v8.4.0-alpha-338-g640dc8844c
Edition: Community
Git Commit Hash: 640dc8844cbd0d3ff1fafa64ce213afcc7dd25a4
Git Branch: time_truncate_fractional-sqlmode
UTC Build Time: 2024-10-04 15:21:10
GoVersion: devel go1.24-065c1359 Fri Oct 4 14:49:31 2024 +0000
Race Enabled: false
Check Table Before Drop: false
Store: unistore
1 row in set (0.0006 sec)