dask / hdfs3

A wrapper for libhdfs3 to interact with HDFS from Python
http://hdfs3.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Mysterious error on a kerberos-enabled cluster #151

Closed by superbobry 6 years ago

superbobry commented 6 years ago

Hello,

I'm getting the following error with hdfs3 0.3.0 on CDH5:

HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop python -c "import hdfs3; hdfs3.HDFileSystem()"
018-02-13 22:05:25.435199, p47628, th139670155556672, ERROR Failed to invoke RPC call "getFsStats" on server "9c-b6-54-7e-03-2c.hpc.criteo.preprod:8020":
RpcChannel.cpp: 483: HdfsRpcException: Failed to invoke RPC call "getFsStats" on server "9c-b6-54-7e-03-2c.hpc.criteo.preprod:8020"
    @   Hdfs::Internal::RpcChannelImpl::invokeInternal(std::shared_ptr<Hdfs::Internal::RpcRemoteCall>)
    @   Hdfs::Internal::RpcChannelImpl::invoke(Hdfs::Internal::RpcCall const&)
    @   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
    @   Hdfs::Internal::NamenodeImpl::getFsStats()
    @   Hdfs::Internal::NamenodeProxy::getFsStats()
    @   Hdfs::Internal::FileSystemImpl::getFsStats()
    @   Hdfs::Internal::FileSystemImpl::connect()
    @   Hdfs::FileSystem::connect(char const*, char const*, char const*)
    ...
ConnectionError: Connection Failed: HdfsRpcException: Failed to invoke RPC call "getFsStats" on server "9c-b6-54-7e-03-2c.hpc.criteo.preprod:8020"  Caused by: HdfsRpcServerException: org.apache.hadoop.security.authorize.AuthorizationException: User: s.lebedev@CRITEOIS.LAN is not allowed to impersonate H����H����ty��t1H�H��H��H�t>H��H��H�59�

The binary gibberish in the error is surprisingly stable. Could you give any pointers on how I can debug the root cause of this?

martindurant commented 6 years ago

I can tell you what it means: the user indicated (I suppose your Kerberos principal) is not a proxy user. This is not a surprise, since usually only cluster services such as HDFS and YARN are proxy users. I cannot tell you why this is happening, though. A workaround would be to make that user a proxy user (by adding *.proxyuser.* entries to your core-site.xml), but I very much doubt you want to do that.
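
For reference, the proxy-user entries mentioned above are Hadoop properties named hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups. Below is a minimal sketch, not part of hdfs3, that lists which such entries a core-site.xml defines. The helper name proxyuser_properties, the user "someuser", and the sample XML are all hypothetical, and the settings that matter are presumably the ones the NameNode itself reads, not only the client-side copy.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(core_site_xml: str) -> dict:
    """Return {name: value} for every hadoop.proxyuser.* property in the XML text."""
    root = ET.fromstring(core_site_xml)
    found = {}
    for prop in root.iter("property"):
        # Each <property> holds a <name> and a <value> child element.
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith("hadoop.proxyuser."):
            found[name] = value
    return found


# Hypothetical core-site.xml content; "someuser" and all values are illustrative only.
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example:8020</value>
  </property>
  <property>
    <name>hadoop.proxyuser.someuser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.someuser.groups</name>
    <value>*</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in proxyuser_properties(SAMPLE).items():
        print(f"{name} = {value}")
```

To inspect a real file, read it first (for example from $HADOOP_CONF_DIR/core-site.xml) and pass its text to the function. An empty result for your principal would be consistent with the "not allowed to impersonate" message, although it does not explain why impersonation was requested in the first place.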

martindurant commented 6 years ago

For fun, you could try the following package: https://anaconda.org/mdurant/libhdfs3/2.3/download/linux-64/libhdfs3-2.3-0.tar.bz2 Download it, run conda install libhdfs3-2.3-0.tar.bz2, and try again.

superbobry commented 6 years ago

I've tried the version above and it seems to fail with the same error message. Looking for possible causes...

superbobry commented 6 years ago

I've reinstalled libhdfs3/hdfs3 from conda-forge into a fresh conda environment and everything just works now. Closing the issue.

martindurant commented 6 years ago

Hurray! How odd...