Open GaryShen2008 opened 1 year ago
Looks like we have a lambda inside of GpuProjectExec that is pulling in the ProjectExec itself. We likely have this problem in a lot of other places too.
They have shown that withResource was the culprit in this particular case, since they removed withResource and the project exec didn't need serializing. But I agree that there may be other things we are doing that cause serialization to trigger in other places. Kyuubi did fix a similar issue, but it is in an unreleased version (https://github.com/apache/kyuubi/issues/4617).
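To illustrate the capture being described, here is a minimal sketch, not the plugin's actual code. It assumes Scala 2.12 or later, where function literals are serializable. All names in it (ResourceHelpersTrait, ResourceHelpers, Scratch, FakeProjectExec, ClosureCaptureDemo) are hypothetical stand-ins. The idea is that when withResource is an instance method inherited from a mixed-in trait, calling it inside a lambda is really a call on this, so the lambda drags the enclosing node along when it gets serialized. Calling the same helper from a standalone object avoids the capture.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper trait: withResource becomes an instance method
// of whatever class mixes it in.
trait ResourceHelpersTrait {
  def withResource[T <: AutoCloseable, V](r: T)(block: T => V): V =
    try block(r) finally r.close()
}

// The same helper exposed from a standalone object, so callers need no `this`.
object ResourceHelpers {
  def withResource[T <: AutoCloseable, V](r: T)(block: T => V): V =
    try block(r) finally r.close()
}

// Hypothetical closeable resource used only for the demo.
class Scratch(val value: Int) extends AutoCloseable {
  override def close(): Unit = ()
}

// Stand-in for a plan node; deliberately NOT Serializable.
class FakeProjectExec extends ResourceHelpersTrait {
  // `withResource` here means `this.withResource`, so the lambda captures `this`.
  def capturingFn: Int => Int =
    (x: Int) => withResource(new Scratch(x))(s => s.value + 1)

  // Calls the object's method; nothing from the enclosing instance is referenced.
  def safeFn: Int => Int =
    (x: Int) => ResourceHelpers.withResource(new Scratch(x))(s => s.value + 1)
}

object ClosureCaptureDemo {
  // Returns true if Java serialization of `obj` succeeds,
  // false if it hits a non-serializable object in the graph.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(obj) finally out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

Under these assumptions, ClosureCaptureDemo.isSerializable(new FakeProjectExec().capturingFn) should report false, while the safeFn variant should report true. Both functions compute the same result; only what they capture differs.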
I can take this on and have it done by next sprint. Virtually all files need to change, and it doesn't look like a trivial script could handle it, so it is just going to take time.
If it is for 23.04 and is really important it would be great to know. @sameerz @GaryShen2008
This issue happened when using Kyuubi 1.7.0 with Ranger authorization. Kyuubi has fixed the serialization issue in their latest code, but it has not been released yet. If possible, I hope we can make a simple fix in GpuProjectExec to unblock our use of Kyuubi 1.7.0. Otherwise we'll need to use their master-snapshot version to bypass this issue with our upcoming 23.04 release. I don't know when Apache Kyuubi will release the next version, and I hope we don't need to depend on that.
One update: I re-tested with Kyuubi's master-snapshot image and the issue still occurred. In that case, it seems this MUST be fixed on our plugin side.
@GaryShen2008 can you confirm that the merged PR resolves the issue? If so, please close this issue.
Describe the bug
When testing Kyuubi Spark authorization with Ranger, I got the exception below.
Steps/Code to reproduce bug
A local reproduction is as follows. Launch a Spark shell with spark-rapids and run the code below.
Seq(1,2,3,4,5).toDF("a").repartition(1).show
Expected behavior
Even if the child plan isn't serializable, the job shouldn't fail.
Environment details (please complete the following information)
Any