Open gumdropsteve opened 4 years ago
This issue comes down to TIMESTAMP expecting values that include fractional seconds (aka "millisecond resolution").
I was passing in values that specify the times only down to the second:

| tpep_pickup_datetime |
| -- |
| 2015-01-15 19:05:39 |
| 2015-01-10 20:33:38 |
| 2015-01-10 20:33:39 |

If I instead pass in values that include a fractional-seconds part:

| tpep_pickup_datetime |
| -- |
| 2015-01-15 19:05:39.0 |
| 2015-01-10 20:33:38.0 |
| 2015-01-10 20:33:39.0 |

then CASTing to TIMESTAMP works as expected.
This can be achieved either by having values (in your file / table) that already specify down to the millisecond via a decimal, or by concatenating (`||`) a decimal to your values before CASTing. Here's an example concatenating a simple decimal (`|| '.'`) and a decimal with minimal value impact (`|| '.0001'`) before CASTing.
NOTE: https://gist.github.com/gumdropsteve/48af7ec54c9e7cf52ded504e2f2a2dfb#gistcomment-3152873
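The concatenation workaround can be sketched as a small query builder. This is only an illustration under stated assumptions: `build_cast_query` and the table name `taxi` are hypothetical and not part of any library API; only the `|| '.'` and `|| '.0001'` suffixes and the CAST to TIMESTAMP come from the comment above.

```python
def build_cast_query(column: str, table: str, suffix: str = ".") -> str:
    """Return SQL that appends `suffix` to a string column before casting.

    Hypothetical helper for illustration; it only builds a query string
    and does not talk to any SQL engine.
    """
    # Double any single quotes so the suffix stays a valid SQL string literal.
    literal = suffix.replace("'", "''")
    return (
        f"SELECT CAST({column} || '{literal}' AS TIMESTAMP) AS {column} "
        f"FROM {table}"
    )


if __name__ == "__main__":
    # The two suffixes mentioned in the comment above.
    print(build_cast_query("tpep_pickup_datetime", "taxi"))
    print(build_cast_query("tpep_pickup_datetime", "taxi", suffix=".0001"))
```

The resulting string would then be passed to whatever method your SQL context uses to run queries.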
A related issue is that if you try to cast a date-only string such as 2020-02-18 to TIMESTAMP, it will also return 1970-01-01. To get around that you can do
CAST(CAST(column AS DATE) AS TIMESTAMP)
This will be addressed once the cudf timestamp and duration datatypes are more thoroughly implemented and TO_DATE is implemented in SQL.
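Until then, the two workarounds in this thread could be selected automatically by inspecting a sample value. The sketch below is a rough illustration, not an official API: `timestamp_cast_expr` is a hypothetical helper, the three accepted string formats are assumptions about the data, and the `'.0'` suffix is chosen because it yields the `...:39.0` shape reported to work above.

```python
from datetime import datetime


def timestamp_cast_expr(column: str, sample: str) -> str:
    """Pick a CAST expression for `column` based on one sample string value.

    Hypothetical helper: it returns a SQL expression as text and assumes
    every value in the column has the same shape as `sample`.
    """

    def matches(fmt: str) -> bool:
        # strptime raises ValueError when the whole string does not fit fmt.
        try:
            datetime.strptime(sample, fmt)
            return True
        except ValueError:
            return False

    if matches("%Y-%m-%d"):
        # Date-only string: cast to DATE first, then to TIMESTAMP.
        return f"CAST(CAST({column} AS DATE) AS TIMESTAMP)"
    if matches("%Y-%m-%d %H:%M:%S"):
        # Second resolution: append a fractional part before casting.
        return f"CAST({column} || '.0' AS TIMESTAMP)"
    if matches("%Y-%m-%d %H:%M:%S.%f"):
        # Already has fractional seconds: a plain cast is reported to work.
        return f"CAST({column} AS TIMESTAMP)"
    raise ValueError(f"unrecognized timestamp format: {sample!r}")


if __name__ == "__main__":
    for value in ("2020-02-18", "2015-01-15 19:05:39", "2015-01-15 19:05:39.0"):
        print(value, "->", timestamp_cast_expr("tpep_pickup_datetime", value))
```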
Describe the bug
All values CAST to TIMESTAMP return as 1 January 1970.

Steps/Code to reproduce bug
Here's a Jupyter Notebook that reproduces the issue: https://gist.github.com/gumdropsteve/48af7ec54c9e7cf52ded504e2f2a2dfb

Expected behavior
Values to return accurately.
Environment overview (please complete the following information)
Environment details
Additional context
The notebook that reproduces the issue tests columns of the following dtypes: str, int, float, varchar.