Open ShawHee opened 6 months ago
Any update on this issue? @ShawHee, could you check whether this bug still exists in the master branch?
@zhoujinsong Yes, it still exists in the master branch. In my environment I added a doAs call in SparkBatchScan, and the change took effect.
What happened?
While using Amoro, I found that RPC response times were unstable when connecting through the router, which made queries on Amoro tables slow. To speed up queries, I want Amoro to connect to the NameNode directly. During testing, Spark queries failed with exceptions. The cause: when Spark creates a task, the client first registers the token with the router. Later, when YARN starts the container, the container uses this token to authenticate with HDFS. However, the token was registered with the router, while the Amoro file reads at execution time go directly to the NameNode. The NameNode cannot find the token, so authentication fails. In theory, Amoro should read data using the keytab and principal configured in the Amoro catalog, and should not use the YARN token.
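To make the mismatch concrete, here is a toy Python model of the behaviour described above. It is only a sketch and does not use any Hadoop, Spark, or Amoro API: `Service`, `TokenNotFound`, `read_with_job_token`, and `read_with_catalog_login` are hypothetical names, and the keytab login is simplified to "obtain a credential from the service that is actually being read".

```python
class TokenNotFound(Exception):
    """Raised when a service has no record of the presented token."""


class Service:
    """Toy stand-in for an HDFS endpoint (router or NameNode)."""

    def __init__(self, name):
        self.name = name
        self._issued = set()  # tokens this endpoint knows about

    def issue_token(self, user):
        # Register a token with this endpoint only.
        token = (self.name, user)
        self._issued.add(token)
        return token

    def authenticate(self, token):
        # A token is only valid on the endpoint that issued it.
        if token not in self._issued:
            raise TokenNotFound(f"token {token} not found on {self.name}")
        return True


router = Service("router")
namenode = Service("namenode")

# At submission time the client registers its token with the router.
job_token = router.issue_token("spark_user")


def read_with_job_token(service):
    # The container reuses the token obtained at submission time.
    return service.authenticate(job_token)


def read_with_catalog_login(service, principal):
    # Simplified model of reading under the catalog's own identity:
    # the credential comes from the endpoint that is being read.
    own_token = service.issue_token(principal)
    return service.authenticate(own_token)
```

In this model, `read_with_job_token(router)` succeeds, `read_with_job_token(namenode)` raises `TokenNotFound`, and `read_with_catalog_login(namenode, "amoro")` succeeds. That matches the expectation that the read should run under the catalog's keytab identity rather than the job's token.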
Affects Versions
master
What engines are you seeing the problem on?
No response
How to reproduce
No response
Relevant log output
Anything else
No response
Are you willing to submit a PR?
Code of Conduct