cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0
730 stars 248 forks source link

Impala DATE is retrieved as str and not datetime.date #410

Closed neilgluent closed 4 years ago

neilgluent commented 4 years ago

I have a table with both TIMESTAMP and DATE columns, Impala 3.4 which has DATE available:

describe test.impala_types;
+-----------+---------------+---------+
| name      | type          | comment |
+-----------+---------------+---------+
...
| column_6  | timestamp     |         |
...
| column_11 | date          |         |
+-----------+---------------+---------+

select column_6, column_11 from test.impala_types limit 1;
+---------------------+------------+
| column_6            | column_11  |
+---------------------+------------+
| 2020-11-09 01:40:06 | 2000-01-30 |
+---------------------+------------+

When I query the TIMESTAMP column via impyla I get a datetime:

cursor = conn.cursor()
print 'column_6'
sql = 'select column_6 from test.impala_types limit 1'
cursor.execute(sql)
row = cursor.fetchone()
print 'row:', row[0], type(row[0])
cursor.close()

...

column_6
row: 2020-11-09 01:40:06 <type 'datetime.datetime'>

When I query the DATE column I get a string:

cursor = conn.cursor()
print 'column_11'
sql = 'select column_11 from test.impala_types limit 1'
cursor.execute(sql)
row = cursor.fetchone()
print 'row:', row[0], type(row[0])
cursor.close()

...

column_11
row: 2000-01-30 <type 'str'>

Ideally we should get either datetime.date or datetime.datetime.

Impyla version:

Name: impyla
Version: 0.16.2
Summary: Python client for the Impala distributed query engine
Home-page: https://github.com/cloudera/impyla
Author: Uri Laserson
timarmstrong commented 4 years ago

Thanks for the bug report! Agree this is an issue - impyla is falling back to string as a default data type but that's not quite right for DATE.