apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0
959 stars 302 forks

[#4000] improvement(client-python): Support simple auth for PyGVFS #4001

Closed xloya closed 2 months ago

xloya commented 3 months ago

What changes were proposed in this pull request?

Support simple auth for the Gravitino client in PyGVFS. The integration tests depend on two other PRs: #3876 and #3931. Once both are merged, I will add integration tests and docs for this PR.

Why are the changes needed?

Fix: #4000

How was this patch tested?

Add UTs and ITs.

jerryshao commented 3 months ago

Unrelated to the PR here, I'm curious how you support Kerberos HDFS for Python gvfs. Have you tried it?

xloya commented 3 months ago

> Unrelated to the PR here, I'm curious how you support Kerberos HDFS for Python gvfs. Have you tried it?

When I tested pygvfs locally, I connected directly to an online HDFS cluster that has Kerberos authentication enabled. There are two ways I know of:

  1. Add some configurations in the core-site.xml of the Hadoop environment:

    <property>
      <name>hadoop.security.authentication</name>
      <value>kerberos</value>
    </property>
    <property>
      <name>hadoop.client.kerberos.principal</name>
      <value>xxx@HADOOP.COM</value>
    </property>
    <property>
      <name>hadoop.client.keytab.file</name>
      <value>/tmp/xxx.keytab</value>
    </property>

  2. Execute `kinit` locally and log in to Kerberos.
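
For the second option, here is a minimal sketch of scripting the login from Python before opening the filesystem. It assumes an MIT Kerberos `kinit` on the PATH and a keytab-based login (`kinit -k -t`). The helper names `build_kinit_command` and `kerberos_login` are hypothetical and not part of PyGVFS. The principal and keytab path are the placeholders from the config above.

```python
import shutil
import subprocess
from typing import List


def build_kinit_command(principal: str, keytab_path: str) -> List[str]:
    """Return the argv for a keytab-based kinit (hypothetical helper)."""
    # -k asks kinit to authenticate from a keytab instead of prompting
    # for a password; -t names the keytab file to read.
    return ["kinit", "-k", "-t", keytab_path, principal]


def kerberos_login(principal: str, keytab_path: str) -> None:
    """Obtain a Kerberos ticket for the current OS user (hypothetical helper)."""
    if shutil.which("kinit") is None:
        raise RuntimeError("kinit not found on PATH; install a Kerberos client")
    # check=True raises CalledProcessError when kinit exits non-zero,
    # for example when the keytab does not match the principal.
    subprocess.run(build_kinit_command(principal, keytab_path), check=True)


# Example call, using the placeholder values from the config above:
# kerberos_login("xxx@HADOOP.COM", "/tmp/xxx.keytab")
```

Once the ticket is in the local credential cache, a Hadoop client running as the same OS user should be able to use it, provided `hadoop.security.authentication` is set to `kerberos` on the client side.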

xloya commented 2 months ago

@jerryshao This PR is ready, please take a look, thanks.