Closed shidayang closed 1 year ago
@shidayang Can you describe in more detail what goal you would like to achieve? I cannot understand this target:
We need to support using the superuser to proxy Trino users when accessing HDFS, so that the original permission system can be used
Because the catalog user in AMS is usually a superuser in HDFS, it has full authority over the HDFS paths that correspond to Arctic tables. Without this feature, every Trino query, regardless of which user submitted it, accesses HDFS as this superuser and succeeds, which makes the Ranger-HDFS policies ineffective for Trino users.
For example, the Arctic catalog catalog_A has a Kerberos principal configured in AMS, and the Arctic table catalog_A.DBA.test_tbl has the corresponding HDFS path /user/useA/hive_db/DBA.db/test_tbl/. This HDFS path is not accessible to userB under Ranger-HDFS.
Currently, when userB queries the Arctic table in Trino, Trino accesses HDFS as userA and the query succeeds.
With this feature, userA will proxy userB when accessing HDFS, and Ranger-HDFS will intercept the request from userB, as shown below, where da_market is userB and analysis_test is userA:
Caused by: org.apache.hadoop.ipc.RemoteException: Permission denied: user=da_market, access=EXECUTE, inode="/user/analysis_test/hive_db/****.db/****"
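The difference between the two behaviours described above can be illustrated with a small toy model. This is only a sketch of the access decision, not the Hadoop, Ranger, or Arctic API: the class `ProxyAccessSketch`, the `POLICY` table, and both method names are hypothetical and exist only for illustration. In a real deployment the equivalent step would presumably rely on Hadoop's proxy-user mechanism (`UserGroupInformation.createProxyUser` together with `doAs`, plus the `hadoop.proxyuser.*` settings on the cluster), which is an assumption about the eventual implementation rather than something this thread confirms.

```java
import java.util.Map;
import java.util.Set;

/** Toy model of the access decision; NOT the Hadoop or Ranger API. */
final class ProxyAccessSketch {
    // Hypothetical policy table: HDFS path -> users allowed to read it.
    static final Map<String, Set<String>> POLICY = Map.of(
            "/user/useA/hive_db/DBA.db/test_tbl/", Set.of("userA"));

    /**
     * The identity the storage layer sees for a query.
     * Without impersonation it is always the catalog (super) user;
     * with impersonation it is the user who submitted the query.
     */
    static String effectiveUser(String catalogUser, String queryUser, boolean impersonate) {
        return impersonate ? queryUser : catalogUser;
    }

    /** The policy check only ever looks at the effective user. */
    static boolean canRead(String path, String effectiveUser) {
        return POLICY.getOrDefault(path, Set.of()).contains(effectiveUser);
    }

    public static void main(String[] args) {
        String path = "/user/useA/hive_db/DBA.db/test_tbl/";
        // Current behaviour: userB's query reaches HDFS as userA.
        System.out.println(canRead(path, effectiveUser("userA", "userB", false)));
        // Requested behaviour: userA proxies userB, so the policy sees userB.
        System.out.println(canRead(path, effectiveUser("userA", "userB", true)));
    }
}
```

In this model the first check passes because the policy only sees userA, while the second is denied because the policy sees userB, which corresponds to the `Permission denied: user=da_market` error in the log above.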
As far as I can see, you want to be able to use the user's account rather than the one in the catalog configuration when doing permission verification in Trino.
Both Spark and Flink have the same requirement. The difference is that they usually use the user they configured when launching.
However, in an MPP system such as Trino, each query may use a different user account. So proxy users may be more appropriate.
Do I understand you correctly?
In addition, Flink and Spark do this by passing the user they wish to use into the catalog properties when creating the catalog. I wonder whether Trino reloads the catalog or tables when executing SQL for different users. Can we pass the authenticated user through the catalog properties too?
Yes, each Trino query may come from a different user. We want the configured superuser to proxy these Trino users, so that HDFS can validate permissions for each user under the existing Ranger system.
Which user to use is determined by the Trino account system.
What would you like to be improved?
The user originally had a permission system based on Ranger. The Kerberos user currently configured for Trino is a superuser with all permissions, which bypasses that permission system. We need to support using the superuser to proxy Trino users when accessing HDFS, so that the original permission system can be used.