cloudera / hue

Open source SQL Query Assistant service for Databases/Warehouses
https://cloudera.com
Apache License 2.0

Batch mode query hitting LDAP login error #1955

Closed rwu-wish closed 3 years ago

rwu-wish commented 3 years ago

Describe the bug:

Environment: Hue 4.8 (LDAP enabled), Hive 3.1.2 (HiveServer2 LDAP enabled).

In LDAP-authenticated Hue, submitting a Hive query from the Hive editor in "Batch" mode fails with:

Error: Could not open client transport with JDBC Uri: jdbc:hive2://<ip-address>:10000/default: Peer indicated failure: Error validating the login (state=08S01,code=0)

I'm successfully logged in to Hue with LDAP credentials and can run Hive queries in "Execute" mode: Hue is able to impersonate my user and run Hive queries against the LDAP-authenticated HiveServer2. However, when I use "Batch" mode, the beeline connection from the Oozie job fails to log in. Is there any additional configuration needed for batch mode with LDAP?
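One way to narrow this down outside Oozie is to run beeline by hand against the same JDBC URL, once without credentials and once with them. The sketch below only builds the two command lines; it does not run anything. `HS2_HOST`, `my_ldap_user` and `my_ldap_password` are placeholders, not values from this setup, and the query is arbitrary. It relies only on the standard beeline options `-u`, `-n`, `-p` and `-e`.

```python
import shlex


def beeline_command(jdbc_url, username=None, password=None, query="select 1"):
    """Build a beeline argv list; credentials are added only when given."""
    cmd = ["beeline", "-u", jdbc_url]
    if username is not None:
        cmd += ["-n", username]
    if password is not None:
        cmd += ["-p", password]
    cmd += ["-e", query]
    return cmd


# Placeholder URL: HS2_HOST stands in for the real HiveServer2 address.
URL = "jdbc:hive2://HS2_HOST:10000/default"

# Without a password, which is what the batch job appears to be doing.
print(shlex.join(beeline_command(URL, username="my_ldap_user")))

# With LDAP credentials (hypothetical values).
print(shlex.join(beeline_command(URL, "my_ldap_user", "my_ldap_password")))
```

If the first command fails with the same "Error validating the login" message and the second succeeds, that would suggest the batch job's connection is missing credentials rather than the LDAP setup itself being wrong.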

Steps to reproduce it?

1) Enable Hue with LDAP authentication: https://docs.cloudera.com/documentation/enterprise/latest/topics/hue_sec_ldap_auth.html
2) Enable HiveServer2 with LDAP authentication: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.0/securing-hive/content/hive_secure_hiveserver_using_ldap.html
3) Run a simple query in the Hive editor using batch mode. I ran: select * from my_table limit 10

Hue version or source? (e.g. open source 4.5, CDH 5.16, CDP 1.0...). System info (e.g. OS, Browser...).

Open source 4.8

Further investigation

Looking at the workflow.xml that Hue created for the Oozie job, it seems the HS2 connection has no password set, and the user is my logged-in user rather than the LDAP auth_username used for LDAP pass-through to Hive.

In hue.ini following https://gethue.com/ldap-or-pam-pass-through-authentication-with-hive-or-impala/

[beeswax]
auth_username=hue_hive
auth_password=hue_hive_pwd

When running in "Execute" query mode, Hue uses this username and password to query HS2, but in Batch mode they don't seem to be used.

Workflow.xml:

$ hdfs dfs -cat /user/hue/oozie/deployments/workflow.xml

<workflow-app name="Batch job for query-hive" xmlns="uri:oozie:workflow:0.5">
    <start to="hive-e22a"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="hive-e22a">
        <hive2 xmlns="uri:oozie:hive2-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <jdbc-url>jdbc:hive2://<ip-address>:10000/default</jdbc-url>
            <script>${wf:appPath()}/hive-e22a.sql</script>
        </hive2>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
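
To make the observation about the missing password checkable, here is a minimal sketch that parses a workflow definition and lists the hive2 actions with no `<password>` child. As far as I can tell from the Oozie Hive 2 action documentation, the action accepts an optional `<password>` element after `<jdbc-url>`; treat that as an assumption to confirm against your Oozie version. The sample is a trimmed copy of the workflow above, with `HS2_HOST` as a placeholder for the redacted address, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the generated workflow; HS2_HOST is a placeholder for the
# redacted HiveServer2 address.
WORKFLOW = """<workflow-app name="Batch job for query-hive" xmlns="uri:oozie:workflow:0.5">
    <start to="hive-e22a"/>
    <action name="hive-e22a">
        <hive2 xmlns="uri:oozie:hive2-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <jdbc-url>jdbc:hive2://HS2_HOST:10000/default</jdbc-url>
            <script>${wf:appPath()}/hive-e22a.sql</script>
        </hive2>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def hive2_actions_missing_password(xml_text):
    """Return names of actions whose hive2 element has no password child."""
    root = ET.fromstring(xml_text)
    missing = []
    for action in root.iter():
        if local_name(action.tag) != "action":
            continue
        for child in action:
            if local_name(child.tag) != "hive2":
                continue
            has_password = any(local_name(el.tag) == "password" for el in child)
            if not has_password:
                missing.append(action.get("name"))
    return missing


print(hive2_actions_missing_password(WORKFLOW))
```

For the workflow above this reports the `hive-e22a` action, which matches what I see by eye: the generated action carries only the JDBC URL and the script.
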
github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity and is not "roadmap" labeled or part of any milestone. Remove stale label or comment or this will be closed in 5 days.