dacjames closed 9 years ago
Thanks for reporting this. I can reproduce the issue as well. I'll update soon.
The Impala shell uses beeswax, while the default for impyla is HiveServer2. This is probably a problem with the server-side implementation of HiveServer2, which I'll look into further and/or file a Jira for.
As a temporary workaround, if you're willing to use impyla's latest git master, then you can choose whether to use HiveServer2 or beeswax, e.g.,
conn = connect(host='impalad.host', port=21000, protocol='beeswax')
Note that the port must be set to the beeswax service (same port as the impala shell). Also note that beeswax is (currently) considerably faster than HiveServer2. However, it hasn't been tested as much, and beeswax is not as feature-rich as HS2, so it's possible that some things might break. Either way, that may be a suitable solution for the moment.
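For anyone trying the workaround, here is a minimal sketch of running the query over beeswax. It assumes the `protocol` keyword shown above (available on git master at the time of this comment, possibly not in other impyla versions) and a standard DB-API cursor. The helper name `show_partitions_via_beeswax` and the choice to pass `connect` in as an argument are illustrative assumptions, not part of impyla.

```python
def show_partitions_via_beeswax(connect, host, table, port=21000):
    """Run SHOW PARTITIONS over beeswax and return all result rows.

    `connect` is expected to behave like impyla's connect function.
    The port must be the beeswax service port (same as impala-shell).
    """
    # protocol='beeswax' is taken from the comment above; treat it as an
    # assumption for any impyla version other than that git master.
    conn = connect(host=host, port=port, protocol='beeswax')
    try:
        cursor = conn.cursor()
        try:
            cursor.execute('SHOW PARTITIONS ' + table)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        conn.close()
```

With impyla installed, `connect` would be the function imported from `impala.dbapi`, and `table` a name you trust, since it is concatenated directly into the statement.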
Tracking this here: https://issues.cloudera.org/browse/IMPALA-1330
Thank you, @laserson. The workaround you suggested works for now and we'll follow the impala jira for a more permanent fix.
I fixed this for Impala 2.0, which will be available pretty soon.
Thanks, @henryr! I'll close this issue.
My team has noticed a bug in processing SHOW PARTITIONS queries. When executing the query in impala-shell, these queries work as expected:
However, when executing the same query through the impyla API, the partition columns (year, month, day) all come back as None. From digging through the code, it appears that the data is already broken when it comes back in the Thrift response, so I have been unable to fix this issue myself. This is a problem for our team because we have not been able to figure out how to reliably determine which partitions have been added to Impala.
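A rough way to detect the symptom in fetched rows is sketched below. It assumes the partition columns come first in each row, as in the (year, month, day) case described here. The function name and the sample rows are hypothetical and are not actual output from our tables.

```python
def rows_missing_partition_values(rows, num_partition_cols):
    """Return the rows whose leading partition columns are all None.

    Assumes the first `num_partition_cols` entries of each row are the
    partition columns (e.g. year, month, day).
    """
    if num_partition_cols < 1:
        raise ValueError('num_partition_cols must be at least 1')
    return [
        row for row in rows
        if all(value is None for value in row[:num_partition_cols])
    ]


# Hypothetical rows for illustration only: the first shows the broken
# behaviour, the second is what a healthy result would look like.
sample_rows = [
    (None, None, None, 10),
    (2014, 9, 1, 5),
]
broken = rows_missing_partition_values(sample_rows, 3)
```

If `broken` is non-empty for a table that is known to be partitioned, the result was most likely affected by this bug.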
We see the same behavior running SHOW PARTITIONS on any of our tables, but for reference the Impala table in this example looks like this (sensitive names removed):
EDIT: I have tried both the release 0.8.1 version and the latest git master.