AuthurWang2009 opened this issue 1 year ago (status: Open)
Hello @AuthurWang2009, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.
The error message simply shows that the user actually running the Spark application is not allowed to read the target file. This is HDFS's own behaviour, and the action was rejected by HDFS itself. Neither the Kyuubi Server nor Kyuubi's Authz plugin for Ranger ever checks file permissions.
Does the Spark application have any way to bypass HDFS permissions? In my opinion, if we execute SQL with the Authz plugin, the plugin should allow or disallow the statement at the parse stage. At the running stage, it should then use the Kyuubi server's superuser to overcome the lack of file permission. Otherwise, the Authz check is undermined by unrelated permission failures.
Again, this has nothing to do with the Authz plugin, which is only responsible for checking the session user's privileges on targeted privilege objects (e.g. tables, columns) with Ranger. It never inspects or intercepts file operations. Maybe a more general share level should be considered in your case: with the server share level, an engine is submitted once with a specific proxy user and shared by all sessions.
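As a sketch, the share level suggested above is set in `kyuubi-defaults.conf`; `SERVER` is one of Kyuubi's documented share levels (CONNECTION, USER, GROUP, SERVER):

```properties
# Share one engine across all sessions: the engine is launched once,
# by the server-side proxy user, instead of per session user.
kyuubi.engine.share.level=SERVER
```

Note this trades per-user isolation for shared credentials, so all queries then read HDFS as the proxy user.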
The execution workflow of Kyuubi looks like this:
1. The server pulls the policies of the Hive service in Ranger with the server principal and keytab.
2. The server parses the SQL and may access the Hive metastore, as the real user, for more information about the tables behind a view.
3. The server submits the SQL and launches a Spark app as the real user to do the job.
In this situation, steps 2 and 3 can run into exceptions:
1. Table access permission is not configured in the Hive service, and the Hive service only configures the view access condition, so the real user's access to the Hive metastore is disallowed.
2. The Spark app translates the logical plan into a physical plan and reads HDFS files accordingly, and the HDFS service in Ranger denies the Spark app access to them, because the real user has no permission to access the HDFS files directly.
So, how can we work around this without changing the security policy?
By the way, the HDFS permissions are also governed by Ranger. Should we configure a policy for every table? That sounds unreasonable for the table-and-view situation. We only configure HDFS policies for MapReduce or other apps that have no table or view and use HDFS directly.
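The two-layer behaviour described above can be sketched as follows. This is illustrative Python, not Kyuubi code: the Authz plugin and HDFS enforce independent checks, so a query can pass table-level authorization at the parse stage and still be denied on file access at run time. The policy store and function names here are hypothetical.

```python
# Hypothetical Ranger policy store: session user -> allowed privilege objects.
RANGER_TABLE_POLICIES = {("bdhcwbkj", "bs_cwbdb.viw1"): True}

def authz_plugin_check(user, table):
    """Parse-stage check: Ranger privileges on tables/columns only."""
    return RANGER_TABLE_POLICIES.get((user, table), False)

def hdfs_check(user, owner, mode_other_read):
    """Run-stage check: plain HDFS permission bits, enforced by the NameNode."""
    return user == owner or mode_other_read

# bdhcwbkj may SELECT the view, but the underlying dir is mode 711
# (no read bit for others), so the file read is rejected independently.
allowed_sql = authz_plugin_check("bdhcwbkj", "bs_cwbdb.viw1")
allowed_file = hdfs_check("bdhcwbkj", owner="hdfs", mode_other_read=False)
print(allowed_sql, allowed_file)  # True False
```

Nothing the Authz plugin decides in the first check is consulted in the second, which is why the query fails even though the Ranger table policy allows it.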
Describe the bug
1. First, we create a user bdhmgmas, grant it the database permission on database bs_comdb, and then create a table bs_comdb.test1 in that database. The DDL of the table is:

```sql
use bs_comdb;
create table test1(a string) location '/user/bdhmgmas/db/bs_comdb/test1';
insert into bs_comdb.test1 values '1','2','3';
```
2. Second, we create another user bdhcwbkj, grant it the database permission on database bs_cwbdb, and then create a view bs_cwbdb.viw1 that refers to the table bs_comdb.test1. The DDL of the view is:

```sql
use bs_cwbdb;
create view viw1 as select * from bs_comdb.test1;
```
3. Third, we change the owner of the HDFS path '/user/bdhmgmas/db/bs_comdb/test1' to hdfs:supergroup, and its permission to 711, with the following commands:

```shell
hadoop fs -chown -R hdfs:supergroup /user/bdhmgmas/db/bs_comdb/test1
hadoop fs -chmod -R 711 /user/bdhmgmas/db/bs_comdb/test1
```
4. Finally, we connect to the Kyuubi JDBC server and query the view as user bdhcwbkj with the following commands:

```shell
kinit -p bdhcwbkj -kt ~/keytab/apps.keytab -c /tmp/bdhcwbkj_ccc
export KRB5CCNAME=/tmp/bdhcwbkj_ccc
$HOME/bss_home/kyuubi/bin/beeline -u "jdbc:kyuubi://172.21.21.129:10009/default;kyuubiServerPrincipal=hive/_HOST@BG.COM" --hiveconf spark.yarn.queue=root.000kjb.bdhmgmas_bas
```
The query throws an exception:

```
Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Error operating ExecuteStatement: org.apache.hadoop.security.AccessControlException: Permission denied: user=bdhcwbkj, access=READ_EXECUTE, inode="/user/bdhmgmas/db/bs_comdb/test1":hdfs:supergroup:drwx--x--x
```
5. My environment is: Spark 3.3.1, Kyuubi 1.8.0; HDFS, Hive and Ranger are on the CDP 7.1.7 SP1 platform, with both Kerberos and Ranger enabled for HDFS and Hive.
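The denial in step 4 follows directly from the mode 711 set in step 3. A minimal sketch of how POSIX-style permission bits (which HDFS also uses) are decoded, showing why a user who is neither the owner nor in the group gets `READ_EXECUTE` denied:

```python
# Decode mode 711 (rwx--x--x) into per-class permission bits.
mode = 0o711

def perms(bits):
    """Interpret a 3-bit rwx group as read/write/execute flags."""
    return {"read": bool(bits & 4), "write": bool(bits & 2), "execute": bool(bits & 1)}

# bdhcwbkj is neither owner (hdfs) nor in group (supergroup), so the
# "other" bits apply.
other = perms(mode & 0o7)
print(other)  # {'read': False, 'write': False, 'execute': True}
# Listing a directory's contents needs the read bit; 711 grants only
# traverse (x) to non-owners, so READ_EXECUTE access is denied.
```

This is why only the owner `hdfs` (or an HDFS superuser, or a matching Ranger HDFS policy) can read the table directory after step 3.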
Affects Version(s)
master
Kyuubi Server Log Output
Kyuubi Engine Log Output
Kyuubi Server Configurations
Kyuubi Engine Configurations
Additional context
No response
Are you willing to submit PR?