cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0
725 stars, 247 forks

Hiveserver2 + Kerberos + HTTP transport #365

Open dignajar opened 4 years ago

dignajar commented 4 years ago

I'm trying to connect from Python 3 to my HiveServer2 instance, which runs in HTTP transport mode (port 10001) with Kerberos enabled.

impyla==0.16a2
sasl==0.2.1
thrift_sasl==0.2.1
thriftpy==0.3.9
thriftpy2==0.4.0

from impala.dbapi import connect
conn = connect(host='<server>', port=10001,auth_mechanism='GSSAPI',kerberos_service_name='hive', use_http_transport=True)
cursor = conn.cursor()
cursor.execute('SHOW DATABASES;')
results = cursor.fetchall()
print(results)

Traceback (most recent call last):
  File "test.py", line 2, in <module>
    conn = connect(host='<server>', port=10001,auth_mechanism='GSSAPI',kerberos_service_name='hive', use_http_transport=True)
  File "/usr/local/lib/python3.6/site-packages/impala/dbapi.py", line 150, in connect
    http_path=http_path)
  File "/usr/local/lib/python3.6/site-packages/impala/hiveserver2.py", line 820, in connect
    transport.open()
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/THttpClient.py", line 73, in open
    self.__http = http.client.HTTP(self.host, self.port)
AttributeError: module 'http.client' has no attribute 'HTTP'

It looks like the version on GitHub supports HTTP as a transport layer.

cravani commented 4 years ago

@dignajar Looking at the code, Kerberos authentication in HTTP transport mode is not yet supported.

Ref: https://github.com/cloudera/impyla/blob/v0.16a2/impala/hiveserver2.py#L791
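
For reference, the AttributeError at the bottom of the traceback can be reproduced without impyla or a Hive server. The sketch below only checks which names Python 3's standard `http.client` module exposes. It assumes, based on the traceback, that the installed `THttpClient.py` looks up a class named `HTTP`, which was a legacy class in Python 2's `httplib` and has no counterpart in Python 3. It does not address the separate Kerberos limitation mentioned above.

```python
import http.client

# The traceback fails on `http.client.HTTP(self.host, self.port)`.
# Python 3's http.client does not define a class named HTTP.
has_legacy_http = hasattr(http.client, "HTTP")

# The class Python 3 does provide for plain HTTP connections.
has_http_connection = hasattr(http.client, "HTTPConnection")

print("http.client.HTTP present:", has_legacy_http)
print("http.client.HTTPConnection present:", has_http_connection)
```

If the first check reports that the attribute is missing, the failure comes from the transport code itself rather than from the connection parameters passed to `connect`.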