getredash / redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
http://redash.io/
BSD 2-Clause "Simplified" License

Impala and Hive connection issues #2986

Open sabmalik opened 6 years ago

sabmalik commented 6 years ago

Issue Summary

With the varying library versions in Redash 5 on Ubuntu, I could only get one of the two connections to work at a time. With my limited Python knowledge, I debugged the code and worked out the combination that works.

Scenarios

thrift = 0.11.0, pyhive = 0.3.0
Hive: Connection option available and connects.
Impala: Connection option available, but the connection test says "TypeError: expecting list of size 2 for struct args".

thrift = 0.9.3, pyhive = 0.3.0
Hive: Connection option disappears, and trying "from pyhive import hive" gives "ImportError: cannot import name TFrozenDict".
Impala: Connection option available and connects.

thrift = 0.9.3, pyhive = 0.2.1
Hive: Connection option available and connects.
Impala: Connection option available and connects.

Note that the requirements_all_ds.txt file pins pyhive to 0.3.0, so I am not sure whether downgrading it affects some other connection type.
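The three scenarios above can be written down as a small lookup table, which makes it easier to see which pair of versions satisfies both data sources. This is only a sketch that restates the reported observations; the names `REPORTED_RESULTS` and `working_combinations` are made up for illustration and are not part of Redash.

```python
# Compatibility results copied from the report above (Redash 5 on Ubuntu).
# Keys are (thrift version, pyhive version); values record whether each
# connection test succeeded, according to the reporter's observations.
REPORTED_RESULTS = {
    ("0.11.0", "0.3.0"): {"hive": True, "impala": False},
    ("0.9.3", "0.3.0"): {"hive": False, "impala": True},
    ("0.9.3", "0.2.1"): {"hive": True, "impala": True},
}


def working_combinations(results):
    """Return the version pairs for which every data source connected."""
    return [versions for versions, status in results.items() if all(status.values())]


print(working_combinations(REPORTED_RESULTS))  # [('0.9.3', '0.2.1')]
```

Only the thrift 0.9.3 / pyhive 0.2.1 pair passes for both Hive and Impala in this table.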

Steps to Reproduce

  1. Create a connection for Impala/hive
  2. Test connection

Technical details:

BigFF commented 6 years ago

I am using Impala and see the same problem. There are some similar earlier issues: https://github.com/getredash/redash/pull/2410 https://github.com/getredash/redash/issues/1969

To fix Impala, we should change thrift>=0.8.0 to thrift==0.9.3, or maybe move to pipenv to manage the environment.

I have no idea about pyhive, but I think it is the same problem. Should we pin the package version?
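To illustrate why the loose thrift>=0.8.0 specifier lets the breaking release through while thrift==0.9.3 does not, here is a minimal sketch. The helpers `parse_version` and `satisfies` are hypothetical, handle only plain dotted versions with the `>=` and `==` operators, and are not how pip itself resolves requirements.

```python
def parse_version(text):
    """Turn a dotted version such as '0.9.3' into a comparable tuple (0, 9, 3)."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, specifier):
    """Check a version against a simple '>=X' or '==X' specifier.

    Only these two operators are handled; this is not a full PEP 440 parser.
    """
    for operator in (">=", "=="):
        if specifier.startswith(operator):
            required = parse_version(specifier[len(operator):])
            actual = parse_version(version)
            return actual >= required if operator == ">=" else actual == required
    raise ValueError("unsupported specifier: " + specifier)


# Compare the two thrift versions from this thread against both specifiers.
for version in ("0.9.3", "0.11.0"):
    print(version, satisfies(version, ">=0.8.0"), satisfies(version, "==0.9.3"))
```

Both versions satisfy the loose specifier, so a fresh install can pick up 0.11.0; only 0.9.3 satisfies the exact pin.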

sabmalik commented 6 years ago

Yes, I found that only that combination of library versions works for both Impala and Hive. If someone else can confirm this with the latest version of Redash, it should be locked down in the requirements.

justmiles commented 5 years ago

Just tested this successfully, setting thrift==0.9.3 and pyhive==0.2.1 on top of v7.0.0 {4a978ba}. I can query CDH 5.11.0 Hive, CDH 5.11.0 Impala, and EMR 5.21.0 Hive.
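For anyone who wants to confirm that their environment actually has the pair of versions reported above, a small check along these lines may help. It is a sketch, not part of Redash: `PINS`, `installed_versions`, and `find_mismatches` are illustrative names, and it assumes Python 3.8+ for `importlib.metadata`, whereas the Redash releases discussed here ran on Python 2, so it would need adapting there.

```python
from importlib import metadata

# Pins reported as working in this thread; not an official Redash requirement.
PINS = {"thrift": "0.9.3", "pyhive": "0.2.1"}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_mismatches(pins, installed):
    """Return {name: (expected, actual)} for every package that differs from its pin."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pins.items()
        if installed.get(name) != expected
    }


if __name__ == "__main__":
    # Report every pinned package whose installed version differs (or is missing).
    mismatches = find_mismatches(PINS, installed_versions(PINS))
    for name, (expected, actual) in mismatches.items():
        print(f"{name}: expected {expected}, found {actual}")
```

The script prints nothing when both packages match the pins.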

funyeah commented 5 years ago

+1