dask / hdfs3

A wrapper for libhdfs3 to interact with HDFS from Python
http://hdfs3.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Kerberos support on hdfs3 0.1.4, libgsasl 1.8.0 #167

Open KimiRaikking opened 6 years ago

KimiRaikking commented 6 years ago

I installed hdfs3=0.1.4 with conda, which depends on libgsasl=1.8.0. I ran ldd on libgsasl.so, and it appears to be linked with GSSAPI support, but I cannot connect to an HDFS cluster that uses Kerberos authentication. My code looks like this:

import subprocess

from hdfs3 import HDFileSystem

# Load the client environment, obtain a Kerberos ticket from the keytab,
# then list the cached tickets.
result = subprocess.getoutput(
    'source /opt/client/bigdata_env; '
    'kinit -kt /opt/client/AIFlow.keytab AIFlow; klist'
)
print(result)

# Ticket cache that kinit writes for this user.
ticket_path = "/tmp/krb5cc_1002"

# libhdfs3 settings: Kerberos authentication plus the HA namenode addresses.
conf = {
    "hadoop.security.authentication": "kerberos",
    "dfs.nameservices": "hacluster,haclusterX,haclusterX1,haclusterX2,haclusterX3,haclusterX4",
    "dfs.ha.namenodes.hacluster": "96,97",
    "dfs.namenode.rpc-address.hacluster.96": "szvphicpra49709:25000",
    "dfs.namenode.rpc-address.hacluster.97": "szvphicpra49710:25000",
}
hdfs1 = HDFileSystem(host="szvphicpra49709", port=25000, pars=conf,
                     user="AIFlow", ticket_cache=ticket_path)

The ldd output for my libgsasl.so:

#:/opt/notebook/anaconda3/lib # ldd libgsasl.so
        linux-vdso.so.1 (0x00007ffcf73fa000)
        libachk.so => /lib64/libachk.so (0x00007f0ab9cbc000)
        libgcrypt.so.11 => /usr/lib64/libgcrypt.so.11 (0x00007f0ab9a3c000)
        libgpg-error.so.0 => /opt/notebook/anaconda3/lib/././libgpg-error.so.0 (0x00007f0ab981c000)
        libgssapi_krb5.so.2 => /opt/notebook/anaconda3/lib/././libgssapi_krb5.so.2 (0x00007f0ab95d1000)
        libkrb5.so.3 => /opt/notebook/anaconda3/lib/././libkrb5.so.3 (0x00007f0ab92fb000)
        libk5crypto.so.3 => /opt/notebook/anaconda3/lib/././libk5crypto.so.3 (0x00007f0ab90cc000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f0ab8ec8000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0ab8b24000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f0ab8920000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0ab8703000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f0ab84fa000)
        libcom_err.so.3 => /opt/notebook/anaconda3/lib/./././libcom_err.so.3 (0x00007f0ab82f6000)
        libkrb5support.so.0 => /opt/notebook/anaconda3/lib/./././libkrb5support.so.0 (0x00007f0ab80e9000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f0ab7ed2000)
        /lib64/ld-linux-x86-64.so.2 (0x000055bf094f9000)

My client log is:

ERROR Failed to setup RPC connection to "szvphicpra49709:25000" caused by:
RpcChannel.cpp: 743: Problem with callback handler
        @       Hdfs::Internal::UnWrapper<Hdfs::SafeModeException, Hdfs::SaslException, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing>::unwrap(char const*, int) 

And my namenode log is:

2018-07-21 14:52:52,819 | WARN  | Socket Reader #1 for port 25000 | Auth failed for 10.186.67.15:54074:null (Problem with callback handler) with true cause: (Problem with callback handler) | Server.java:1467
2018-07-21 14:52:52,819 | INFO  | Socket Reader #1 for port 25000 | Socket Reader #1 for port 25000: readAndProcess from client ******* threw exception [javax.security.sasl.SaslException: Problem with callback handler [Caused by javax.security.sasl.SaslException: Client selected unsupported protection: 1]] | Server.java:928
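
One way to read the namenode message `Client selected unsupported protection: 1` is as a SASL GSSAPI security-layer bitmask (RFC 4752), where 1 means no security layer, 2 means integrity and 4 means confidentiality. Under that reading, the client asked for plain authentication and the namenode did not accept it, which would be consistent with the cluster's `hadoop.rpc.protection` being set to `integrity` or `privacy`. This is an interpretation of the log, not something confirmed in this thread, and whether this libhdfs3 build can negotiate those levels is not established here. The helper name `decode_client_qop` is hypothetical and only illustrates the decoding:

```python
import re

# SASL GSSAPI security-layer bits (RFC 4752), paired with the SASL QOP name
# and the corresponding hadoop.rpc.protection value.
QOP_BITS = {
    1: ("auth", "authentication"),
    2: ("auth-int", "integrity"),
    4: ("auth-conf", "privacy"),
}


def decode_client_qop(log_line):
    """Return the QOP levels named by the bitmask in a namenode log line.

    Returns None when the line does not contain the expected message.
    """
    match = re.search(r"Client selected unsupported protection: (\d+)", log_line)
    if match is None:
        return None
    mask = int(match.group(1))
    return [names for bit, names in sorted(QOP_BITS.items()) if mask & bit]


line = ("javax.security.sasl.SaslException: Problem with callback handler "
        "[Caused by javax.security.sasl.SaslException: "
        "Client selected unsupported protection: 1]")
print(decode_client_qop(line))
```

If the decoded level is `authentication`, comparing it with the `hadoop.rpc.protection` value in the cluster's core-site.xml would show whether the two sides disagree.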

xmm1989218 commented 6 years ago

I ran into this problem as well.

martindurant commented 6 years ago

I advise you to try Arrow's HDFS interface instead.

KimiRaikking commented 6 years ago

Yes, I prefer to use pyarrow, and it works with Kerberos.
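
The pyarrow code is not shown in this thread, so the following is only a minimal sketch of what such a connection might look like. It assumes a current pyarrow with `pyarrow.fs.HadoopFileSystem` (older releases exposed `pyarrow.hdfs.connect` instead), a local Hadoop installation with libhdfs available, and the same host, port, user and ticket cache as in the hdfs3 attempt above. The helper names `hdfs_connect_kwargs` and `connect` are hypothetical.

```python
def hdfs_connect_kwargs(host, port, user, ticket_cache, extra_conf=None):
    """Collect the arguments for pyarrow.fs.HadoopFileSystem in one place."""
    return {
        "host": host,
        "port": port,
        "user": user,
        # Path to the Kerberos ticket cache written by kinit.
        "kerb_ticket": ticket_cache,
        # Optional extra Hadoop settings passed through to libhdfs.
        "extra_conf": dict(extra_conf or {}),
    }


def connect(**kwargs):
    """Open the connection; requires pyarrow, libhdfs and a Hadoop install."""
    from pyarrow import fs  # imported lazily so the helper above works without it
    return fs.HadoopFileSystem(**kwargs)


kwargs = hdfs_connect_kwargs(
    host="szvphicpra49709",
    port=25000,
    user="AIFlow",
    ticket_cache="/tmp/krb5cc_1002",
)
# hdfs = connect(**kwargs)  # uncomment on a machine that can reach the cluster
```

Because pyarrow goes through the JVM-based libhdfs rather than libhdfs3, it picks up the Hadoop client configuration from the local installation, which may be why it negotiates with the Kerberized cluster where hdfs3 did not.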