Open nartal1 opened 2 years ago
I can work on it
Should we put the changes in Shim layers, or just modify the code in place?
If I understand this correctly, I don't think it makes sense to make this change unless we really need the performance gain from it. The original PR modified Expression and relies on a new method to be implemented by expressions. Until Spark 3.3 is the minimum Spark version we support, we cannot rely on Expression objects having this method, because they derive from Spark's Expression, and old Spark versions don't have the required method definition. We could override it in our ShimExpression wrappers, but that doesn't handle Expression objects coming from the original CPU plan.
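To make the shim limitation concrete, here is a minimal, self-contained sketch. It does not use Spark at all: `OldExpression` is a stand-in for Spark's Expression as an older Spark version defines it, and `canonicalizeHook`, `GpuAdd`, `CpuLiteral`, and `ShimDemo` are hypothetical names made up for illustration, not the method or classes from the Spark commit.

```scala
// Stand-in for Spark's Expression in an older Spark version:
// it has no canonicalization hook. (Hypothetical, not Spark's real class.)
trait OldExpression {
  def children: Seq[OldExpression]
}

// Hypothetical shim wrapper that adds the hook for plugin-owned expressions.
trait ShimExpression extends OldExpression {
  def canonicalizeHook: String // hypothetical method name
}

// A plugin-owned expression: it can mix in the shim and provide the hook.
case class GpuAdd(children: Seq[OldExpression]) extends ShimExpression {
  def canonicalizeHook: String = "gpu-add"
}

// An expression coming from the original CPU plan: it only knows
// the old base type, so the hook is simply not there.
case class CpuLiteral(value: Int) extends OldExpression {
  def children: Seq[OldExpression] = Nil
}

object ShimDemo {
  // The hook can be reached only when the expression mixes in the shim.
  def hookOf(e: OldExpression): Option[String] = e match {
    case s: ShimExpression => Some(s.canonicalizeHook)
    case _                 => None
  }

  def main(args: Array[String]): Unit = {
    val plan = GpuAdd(Seq(CpuLiteral(1), CpuLiteral(2)))
    println(hookOf(plan))                            // Some(gpu-add)
    plan.children.foreach(c => println(hookOf(c)))   // None, None
  }
}
```

Any code path that walks a mixed tree would need a fallback for the `None` case, which is the part a ShimExpression override alone does not cover.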
I think we would need to see a demonstrated need for this (i.e.: some performance metrics showing how much can be gained on real-world queries) before prioritizing this work.
Is your feature request related to a problem? Please describe. We have copied some of the code from Object Canonicalize. Need to evaluate if this change should be pulled in: https://github.com/apache/spark/commit/0a6be8cf59