cloudera / impyla

Python DB API 2.0 client for Impala and Hive (HiveServer2 protocol)
Apache License 2.0

Impersonation in kerberos session #281

Open venkattaku opened 6 years ago

venkattaku commented 6 years ago

I'm interfacing with a Hive cluster that is secured with Kerberos. I'm able to query the cluster using the Kerberos authentication mechanism. Now I want to impersonate a proxy user for my queries using impyla. I went through the source code and found no support for impersonation.

puzon0 commented 6 years ago

To impersonate another user, pass a configuration parameter to the HiveServer2Connection.cursor() method:

from impala.dbapi import connect

impersonation_configuration = {
    'hive.server2.proxy.user': 'impersonated_user',  # for HiveServer2
    'impala.doas.user': 'impersonated_user'  # for Impala
}

conn = connect(
    host='hive.server.host.fqdn',
    port=10000,  # HiveServer2 default port; replace with your server's port
    auth_mechanism='GSSAPI',
    kerberos_service_name='hive',
    database='db_name'
)
cursor = conn.cursor(configuration=impersonation_configuration)
cursor.execute('query to execute')

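The user connecting to the cluster (the one authenticated with Kerberos) must be allowed to impersonate other users in the cluster configuration, for example through the Hadoop hadoop.proxyuser.* settings for HiveServer2 or the authorized_proxy_user_config startup flag for Impala.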
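If you only talk to one backend, it may be cleaner to send just the key that backend understands. The helper below is a minimal sketch of that idea; `build_impersonation_config` and its `backend` argument are hypothetical names for illustration, not part of impyla. Only the two configuration keys come from the snippet above.

```python
# Hypothetical helper (not part of impyla): build the per-cursor
# configuration dict for impersonation on a single backend.

# Configuration keys taken from the example above.
_PROXY_USER_KEYS = {
    'hive': 'hive.server2.proxy.user',  # HiveServer2
    'impala': 'impala.doas.user',       # Impala
}


def build_impersonation_config(proxy_user, backend='hive'):
    """Return a configuration dict that asks the server to run as proxy_user."""
    if not proxy_user:
        raise ValueError('proxy_user must be a non-empty string')
    try:
        key = _PROXY_USER_KEYS[backend]
    except KeyError:
        raise ValueError('unknown backend: %r' % (backend,))
    return {key: proxy_user}
```

The result is passed to the cursor the same way as before, e.g. `conn.cursor(configuration=build_impersonation_config('impersonated_user'))`.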