Closed pkasinathan closed 8 years ago
For Kerberos integration, the following should be done:
1) compile libgsasl with the flag --with-gssapi-impl=mit, as described here (https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3/issues/18)
2) compile libhdfs3
3) in HDFileSystem, also pass pars = {"hadoop.security.authentication": "kerberos"} and the appropriate ticket_cache (you can take it from the klist command)
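As a rough illustration of step 3, here is a minimal sketch of how the connection could be put together. The helper names (parse_ticket_cache, kerberos_kwargs, connect) and the default port 8020 are assumptions made for this example, not part of hdfs3. It also assumes klist prints a "Ticket cache: FILE:/path" line, as MIT Kerberos does, and that ticket_cache should be the bare file path.

```python
import subprocess


def parse_ticket_cache(klist_output):
    """Return the ticket cache path from klist output, or None if not found."""
    for line in klist_output.splitlines():
        if line.startswith("Ticket cache:"):
            location = line.split(":", 1)[1].strip()
            # MIT klist prints e.g. "FILE:/tmp/krb5cc_1000"; drop the type
            # prefix (assumption: hdfs3 wants the bare path).
            if location.startswith("FILE:"):
                location = location[len("FILE:"):]
            return location
    return None


def kerberos_kwargs(host, port, ticket_cache):
    """Build the keyword arguments for HDFileSystem on a kerberized cluster."""
    return {
        "host": host,
        "port": port,
        "ticket_cache": ticket_cache,
        "pars": {"hadoop.security.authentication": "kerberos"},
    }


def connect(host, port=8020):
    """Connect using the ticket cache reported by klist (run kinit first)."""
    # Imported lazily so the helpers above work without hdfs3 installed.
    from hdfs3 import HDFileSystem

    result = subprocess.run(
        ["klist"], capture_output=True, text=True, check=True
    )
    cache = parse_ticket_cache(result.stdout)
    return HDFileSystem(**kerberos_kwargs(host, port, cache))
```

Calling connect("namenode.example.com") (a placeholder hostname) would then require a valid ticket obtained with kinit, plus libgsasl and libhdfs3 built as in steps 1 and 2.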
We've used hdfs3 with Kerberos. The testing setup can be found on the wiki: https://github.com/dask/hdfs3/wiki/Kerberos-Testing . However, we have had some issues, as noted here: https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3/issues/53. This happens when hadoop.rpc.protection is set to privacy.
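To check whether a cluster is in that situation, one option is to read the property out of the Hadoop configuration. The sketch below assumes the standard Hadoop XML layout (property elements with name and value children). The function name read_hadoop_property is made up for this example, and the location of the configuration file (typically core-site.xml for this property) depends on the installation.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of a Hadoop configuration property, or None if unset."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Illustrative configuration fragment, not taken from a real cluster.
sample_config = """
<configuration>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
</configuration>
"""

protection = read_hadoop_property(sample_config, "hadoop.rpc.protection")
if protection == "privacy":
    print("hadoop.rpc.protection is 'privacy'; the linked libhdfs3 issue may apply")
```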
Thanks for the quick reply.
As per your suggestion, I installed libgsasl 1.8.0 using the conda install -c anaconda libgsasl=1.8.0 command, and it resolved my problem. I can now successfully access the kerberized cluster using hdfs3.
You rock!
Thanks @jettify for chiming in! Closing.
@prabhu1984 I know it's been a while since this issue was raised, but I thought I'd take a shot. I'm seeking clarification on exactly which settings worked for you. Are these the only things you did?
-HDFileSystem(host=None, port=None, user=None, ticket_cache=None, token=None, pars=None, connect=True)
-conda install -c anaconda libgsasl=1.8.0
Or did you also do the other things suggested here, such as compiling libgsasl with the flag --with-gssapi-impl=mit?
I'm also facing issues connecting to a kerberized cluster using hdfs3. When I did exactly the two things shown above, my Dask job got killed with the error: distributed.scheduler.KilledWorker: ('__call__-6af7aa29-2a09-45f3-a5e2-207c06562672', <Worker 'tcp://10.194.211.132:11927', memory: 0, processing: 1>)
Thanks for getting back to me. It’s an old issue. We are able to use hdfs3 to connect to the kerberized cluster. This issue can be closed.
@prabhu1984 I'm not a Dask developer, I'm just seeking your help on how you fixed the issue. Could you please share exactly which settings worked for you?
Hi Team,
Does hdfs3 support Kerberos? I tried to follow this signature, HDFileSystem(host=None, port=None, user=None, ticket_cache=None, token=None, pars=None, connect=True), to connect to a kerberized HDFS name node, but it's not working. Can you please give me an example or reference showing how to use hdfs3 to connect to a kerberized cluster?
Appreciate your support!
Thanks!