apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.48k stars 1.01k forks source link

The return type of EXTRACT #1992

Open liukun4515 opened 2 years ago

liukun4515 commented 2 years ago

I think the result type of u32 is not enough. From the spark,

spark-sql> SELECT EXTRACT(YEAR FROM TIMESTAMP '-2000-12-16 12:21:13');
-2000

But this is not related to this issue, I will open a follow up pull request and discuss the returned type.

Originally posted by @liukun4515 in https://github.com/apache/arrow-datafusion/pull/1991#r824743599

Ted-Jiang commented 2 years ago

Do we need support negative timestamp or date?

Ted-Jiang commented 2 years ago

PG not support negative timestamp

postgres@33a4a5d969c0:/$ psql
psql (13.5 (Debian 13.5-1.pgdg110+1))
Type "help" for help.

postgres=# SELECT EXTRACT(YEAR  FROM TIMESTAMP '-2001-02-16 20:38:40');
ERROR:  time zone displacement out of range: "-2001-02-16 20:38:40"
LINE 1: SELECT EXTRACT(YEAR  FROM TIMESTAMP '-2001-02-16 20:38:40');
xudong963 commented 2 years ago

I don't think we need to support negative timestamp.

liukun4515 commented 2 years ago

because i hit this document from PG https://www.postgresql.org/docs/14/functions-datetime.html and it points out that "The extract function returns values of type numeric. The following are valid field names"

other point from the definition

extract ( field from timestamp ) → numeric

Get timestamp subfield; see [Section 9.9.1](https://www.postgresql.org/docs/14/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT)

extract(hour from timestamp '2001-02-16 20:38:40') → 20
liukun4515 commented 2 years ago

now we can leave this issue and keep discussion it.

waitingkuo commented 1 year ago

@liukun4515 should we change the return to decimal/numeric type?

waitingkuo commented 1 year ago

just found that date_part has different return type in pg

date_part ( text, timestamp ) → double precision

    Get timestamp subfield (equivalent to extract); see [Section 9.9.1](https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT)
    date_part('hour', timestamp '2001-02-16 20:38:40') → 20
waitingkuo commented 1 year ago

from postgresql's doc https://www.postgresql.org/docs/current/functions-datetime.html

The extract function is primarily intended for computational processing. For formatting date/time values for display, see Section 9.8.

The date_part function is modeled on the traditional Ingres equivalent to the SQL-standard function extract:

date_part('field', source) Note that here the field parameter needs to be a string value, not a name. The valid field names for date_part are the same as for extract. For historical reasons, the date_part function returns values of type double precision. This can result in a loss of precision in certain uses. Using extract is recommended instead.

jhorstmann commented 1 year ago

PG not support negative timestamp

This sounded surprising, it seems what is not supported is the input syntax starting with a minus sign. Negative timestamps are supported, just with a different syntax:

SELECT '2022-10-01 BC'::timestamp;
-002021-10-01T00:00:00.000Z
SELECT EXTRACT(YEAR FROM '2022-10-01 BC'::timestamp);
-2022