felipepessoto opened 12 months ago
For the Velox backend, the HashPartitioning partition id is computed by a pre-projection operator inserted before the shuffle operator, so the native side expects the first column to be of integer type.
We should always rely on Gluten's ColumnarRules to make sure the SQL is planned correctly.
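For illustration only, a minimal sketch (not Gluten's actual rule) of what such a pre-projection looks like in plain Spark terms, assuming Murmur3Hash/Pmod as the partition-id computation and a hypothetical withHashPartitionId helper:

import org.apache.spark.sql.catalyst.expressions.{Alias, Expression, Literal, Murmur3Hash, Pmod}
import org.apache.spark.sql.execution.{ProjectExec, SparkPlan}

// Prepend the partition id as the first column. Murmur3Hash returns Int32, and
// Pmod keeps the value in [0, numPartitions), mirroring HashPartitioning.partitionIdExpression.
def withHashPartitionId(keys: Seq[Expression], numPartitions: Int, child: SparkPlan): SparkPlan = {
  val pid = Pmod(new Murmur3Hash(keys), Literal(numPartitions))
  ProjectExec(Alias(pid, "hash_partition_id")() +: child.output, child)
}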
I tried to add a project similar to the getProjectWithHash method, but then started to receive this error. I'll try again to double-check.
[info] Error Source: RUNTIME
[info] Error Code: INVALIDSTATE
[info] Reason: (16384 vs. 32768)
[info] Retriable: False
[info] Expression: values->capacity() >= byteSize
[info] Function: FlatVector
[info] File: /__w/1/s/Velox/velox/vector/FlatVector.h
It seems it is trying to read the first column, which should be the result of the hash (Int32), but it is actually the first column from the table (which is an Int64).
Could you share the code for the implemented getProjectWithHash function? The murmurhash3 in Velox has the return type of int32, so the expected data type of the "hash partition id" column is also int32.
Additionally, I'm curious about the approach you're using to invoke the API, as it appears to differ from the typical usage of Gluten.
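As a quick, standalone check against Spark's catalyst API (not Gluten code), the Murmur3Hash expression is indeed typed int32:

import org.apache.spark.sql.catalyst.expressions.{Literal, Murmur3Hash}
import org.apache.spark.sql.types.IntegerType

assert(new Murmur3Hash(Seq(Literal(42L))).dataType == IntegerType)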
This is the version with the project.
test("test spark gluten code only - smaller code") {
val sourcePath = "/tmp/test/source"
val outputPath = "/tmp/test/target"
val pColumn = "colC"
val isTablePartitioned = true
val targetFileSize = 134217728
val numShufflePartitions = spark.sessionState.conf.numShufflePartitions
val df = createDf(sourcePath, pColumn)
val physicalPlan = df.queryExecution.executedPlan
println("\n\n physicalPlan: " + physicalPlan)
val originalPlan = physicalPlan
//Remove top C2R
val planWithoutC2R = physicalPlan.children(0)
val partitionColumnsExpr = Array(planWithoutC2R.output.find(c => c.name.equals(pColumn)).get)
val partitioning = HashPartitioning(partitionColumnsExpr, numShufflePartitions)
// These two works
// SinglePartition
// RoundRobinPartitioning(numShufflePartitions)
val planWithProject = getProjectWithHash(partitionColumnsExpr, planWithoutC2R) // Added this
val shuffleExec = ShuffleExchangeExec(partitioning, planWithProject)
val transformedShuffleExec =
ColumnarShuffleExchangeExec(shuffleExec, planWithProject, planWithProject.output.drop(1)) // planWithProject.output)
// TransformHints.tag(shuffleExec, transformedShuffleExec.doValidate().toTransformHint)
val addC2R = false // true
val finalPlan = if (addC2R) {
VeloxColumnarToRowExec(transformedShuffleExec)
} else {
transformedShuffleExec
}
println("\n\n finalPlan: " + finalPlan)
println("\n\n finalPlan.supportsColumnar: " + finalPlan.supportsColumnar)
if(finalPlan.supportsColumnar) {
println("executeColumnar: " + finalPlan.executeColumnar().first().getRow(0))
} else {
println("execute: " + finalPlan.execute().first().getLong(0))
}
}
def createDf(sourcePath: String, pColumn: String): DataFrame = {
  val dfSource = spark
    .range(5000)
    .map { _ =>
      (10L,
        11,
        scala.util.Random.nextInt(2))
    }
    .repartition(100)
    .toDF("colA", "colB", "colC")
  dfSource.write.partitionBy(pColumn).format("parquet").mode("overwrite").save(sourcePath)
  val df = spark.read.format("parquet").load(sourcePath)
  df
}
private def getProjectWithHash(exprs: Seq[Expression], child: SparkPlan): SparkPlan = {
  val hashExpression = new Murmur3Hash(exprs)
  hashExpression.withNewChildren(exprs)
  val project = ProjectExec( // Also tried ProjectExec / ProjectExecTransformer
    Seq(Alias(hashExpression, "hash_partition_key")()) ++ child.output, child)
  AddTransformHintRule().apply(project)
  val projectWithHint = TransformHints.getHint(project) match {
    case _: TRANSFORM_SUPPORTED =>
      // Tested with TransformPreOverrides(true/false)
      println("TRANSFORM_SUPPORTED")
      TransformPreOverrides(true).replaceWithTransformerPlan(project)
    case _: TRANSFORM_UNSUPPORTED =>
      println("TRANSFORM_UNSUPPORTED")
      project
  }
  val transformStageCounter = ColumnarCollapseTransformStages.transformStageCounter
  WholeStageTransformer(projectWithHint)(transformStageCounter.incrementAndGet())
}
And this is the error:
[info] - test spark gluten code only - smaller code *** FAILED *** (6 seconds, 88 milliseconds)
[info] java.lang.UnsupportedOperationException: This operator doesn't support doTransform with SubstraitContext.
[info] at io.glutenproject.execution.TransformSupport.doTransform(WholeStageTransformer.scala:71)
[info] at io.glutenproject.execution.TransformSupport.doTransform$(WholeStageTransformer.scala:69)
[info] at io.glutenproject.execution.WholeStageTransformer.doTransform(WholeStageTransformer.scala:96)
[info] at io.glutenproject.execution.ProjectExecTransformer.doTransform(BasicPhysicalOperatorTransformer.scala:221)
[info] at io.glutenproject.execution.WholeStageTransformer.generateWholeStageTransformContext(WholeStageTransformer.scala:189)
[info] at io.glutenproject.execution.WholeStageTransformer.doWholeStageTransform(WholeStageTransformer.scala:222)
[info] at io.glutenproject.execution.WholeStageTransformer.$anonfun$doExecuteColumnar$1(WholeStageTransformer.scala:275)
[info] at io.glutenproject.metrics.GlutenTimeMetric$.withNanoTime(GlutenTimeMetric.scala:41)
[info] at io.glutenproject.metrics.GlutenTimeMetric$.withMillisTime(GlutenTimeMetric.scala:46)
[info] at io.glutenproject.execution.WholeStageTransformer.doExecuteColumnar(WholeStageTransformer.scala:290)
[info] at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeColumnar$1(SparkPlan.scala:257)
@marin-ma I tried changing it to rely on the Gluten API to add the projection. I'm still seeing some errors; am I using the API incorrectly?
In summary, I'm adding a ShuffleExchangeExec and calling replaceWithTransformerPlan, which automatically adds the project. It fails because the project doesn't support columnar execution, so I need to wrap it with WholeStage, and then it throws another error:
Without WholeStage:
[info] - test spark gluten code only - smaller code FAILED (5 seconds, 920 milliseconds)
[info] java.lang.UnsupportedOperationException: This operator doesn't support doExecuteColumnar().
[info] at io.glutenproject.execution.ProjectExecTransformer.doExecuteColumnar(BasicPhysicalOperatorTransformer.scala:297)
[info] at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeColumnar$1(SparkPlan.scala:257)
After adding WholeStage:
[info] java.lang.UnsupportedOperationException: This operator doesn't support doTransform with SubstraitContext.
[info] at io.glutenproject.execution.TransformSupport.doTransform(WholeStageTransformer.scala:71)
[info] at io.glutenproject.execution.TransformSupport.doTransform$(WholeStageTransformer.scala:69)
[info] at io.glutenproject.execution.WholeStageTransformer.doTransform(WholeStageTransformer.scala:96)
[info] at io.glutenproject.execution.ProjectExecTransformer.doTransform(BasicPhysicalOperatorTransformer.scala:221)
[info] at io.glutenproject.execution.WholeStageTransformer.generateWholeStageTransformContext(WholeStageTransformer.scala:189)
val df = createDf(sourcePath, pColumn)
val physicalPlan = df.queryExecution.executedPlan
println("\n\n physicalPlan: " + physicalPlan)

// Remove top C2R
val planWithoutC2R = physicalPlan.children(0)
println("\n\n planWithoutC2R: " + planWithoutC2R)

// Add ShuffleExchangeExec
val partitionColumnsExpr = Array(planWithoutC2R.output.find(c => c.name.equals(pColumn)).get)
val partitioning: Partitioning = HashPartitioning(partitionColumnsExpr, numShufflePartitions)
val shuffleExec = ShuffleExchangeExec(partitioning, planWithoutC2R, REPARTITION_BY_COL)
println("\n\n shuffleExec: " + shuffleExec)

val finalPlan = {
  // Add C2R
  val planWithC2R = VeloxColumnarToRowExec(shuffleExec)
  AddTransformHintRule().apply(planWithC2R)
  val afterTransform = TransformPreOverrides(false).replaceWithTransformerPlan(planWithC2R).asInstanceOf[VeloxColumnarToRowExec]
  println("\n\n afterTransform: " + afterTransform)

  val wrapProjectWithWholeStage = true
  if (wrapProjectWithWholeStage) {
    // If I don't wrap the project, it throws an error: This operator doesn't support doExecuteColumnar().
    val transformStageCounter = ColumnarCollapseTransformStages.transformStageCounter
    val projectWrapped = WholeStageTransformer(afterTransform.child.asInstanceOf[ColumnarShuffleExchangeExec].child)(transformStageCounter.incrementAndGet())
    val afterSurgery = afterTransform.makeCopy(Array(afterTransform.child.asInstanceOf[ColumnarShuffleExchangeExec].copy(child = projectWrapped)))
    AddTransformHintRule().apply(afterSurgery)
    TransformPreOverrides(false).replaceWithTransformerPlan(afterSurgery)
  } else {
    afterTransform
  }
}

println("\n\n finalPlan: " + finalPlan)
println("\n\n finalPlan.supportsColumnar: " + finalPlan.supportsColumnar)
println("execute: " + finalPlan.execute().first().getLong(0))
Forgot to mention: another test I did was wrapping the ShuffleExchangeExec with ColumnarShuffleExchangeExec:
val transformedShuffleExec = // shuffleExec
ColumnarShuffleExchangeExec(shuffleExec, planWithProject, planWithoutC2R.output)
In this case I don't receive the java.lang.UnsupportedOperationException: This operator doesn't support doExecuteColumnar() error, but then I'm back to the RecordBatch field 0 should be integer error.
test("test spark gluten code only - smaller code") {
val sourcePath = "/tmp/test/source"
val outputPath = "/tmp/test/target"
val pColumn = "colC"
val numShufflePartitions = spark.sessionState.conf.numShufflePartitions
val df = createDf(sourcePath, pColumn)
val physicalPlan = df.queryExecution.executedPlan
println("\n\n physicalPlan: " + physicalPlan)
val originalPlan = physicalPlan
//Remove top C2R
val planWithoutC2R = physicalPlan.children(0)
println("\n\n planWithoutC2R: " + planWithoutC2R)
val partitionColumnsExpr = Array(planWithoutC2R.output.find(c => c.name.equals(pColumn)).get)
val partitioning: Partitioning = HashPartitioning(partitionColumnsExpr, numShufflePartitions)
// These two works
// SinglePartition
// RoundRobinPartitioning(numShufflePartitions)
val shuffleExec = ShuffleExchangeExec(partitioning, planWithoutC2R, REPARTITION_BY_COL)
val transformedShuffleExec = ColumnarShuffleExchangeExec(shuffleExec, planWithoutC2R, planWithoutC2R.output)
AddTransformHintRule().apply(transformedShuffleExec)
val finalPlan = TransformPreOverrides(false).replaceWithTransformerPlan(transformedShuffleExec)
println("\n\n finalPlan: " + finalPlan)
println("execute: " + VeloxColumnarToRowExec(finalPlan).execute.first().getLong(0))
}
def createDf(sourcePath: String, pColumn: String): DataFrame = {
  val dfSource = spark
    .range(5000)
    .map { _ =>
      (10L,
        11,
        scala.util.Random.nextInt(2))
    }
    .repartition(100)
    .toDF("colA", "colB", "colC")
  dfSource.write.partitionBy(pColumn).format("parquet").mode("overwrite").save(sourcePath)
  val df = spark.read.format("parquet").load(sourcePath)
  df
}
I partially found the problem and fixed it with a hack, explained inline below. Chances are I'm doing something wrong, as this is my first time using Gluten/Velox.
Assuming planWithoutC2R is a simple scan:
+- ^(11) NativeFileScan parquet [colA#552L,colB#553,colC#554] Batched: true, DataFilters: [], Format: Parquet, Location: PreparedDeltaFileIndex(1 paths)[file:/tmp/spark-8db878d9-c7e8-4f16-8c53-ebfe2e28f35b], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<colA:bigint,colB:int>
val partitionColumnsExpr = Array(planWithoutC2R.output.find(c => c.name.equals(pColumn)).get)
val partitioning: Partitioning = HashPartitioning(partitionColumnsExpr, numShufflePartitions)
val shuffleExec = ShuffleExchangeExec(partitioning, planWithoutC2R, REPARTITION_BY_COL)
AddTransformHintRule().apply(shuffleExec)
val finalPlanP = TransformPreOverrides(true).replaceWithTransformerPlan(shuffleExec)
  .asInstanceOf[ColumnarShuffleExchangeExec]
println("\n\n finalPlanP: " + finalPlanP.supportsColumnar + " - " + finalPlanP)

// VeloxColumnarToRowExec(finalPlanP).execute.count
// I expect VeloxColumnarToRowExec(finalPlan).execute.count to work,
// but it throws an error saying ProjectExecTransformer doesn't support columnar.
// ProjectExecTransformer needs to be wrapped with WholeStageTransformer.
// For some reason replaceWithTransformerPlan doesn't wrap it, and if
// we wrap it manually it doesn't work, failing with:
// [info] java.lang.UnsupportedOperationException: This operator doesn't support doTransform with SubstraitContext.
// [info] at io.glutenproject.execution.TransformSupport.doTransform(WholeStageTransformer.scala:71)
// [info] at io.glutenproject.execution.TransformSupport.doTransform$(WholeStageTransformer.scala:69)
// [info] at io.glutenproject.execution.WholeStageTransformer.doTransform(WholeStageTransformer.scala:96)
// [info] at io.glutenproject.execution.ProjectExecTransformer.doTransform(BasicPhysicalOperatorTransformer.scala:221)

// This first hack didn't work:
// val transformStageCounter = ColumnarCollapseTransformStages.transformStageCounter
// val projectWrapped = WholeStageTransformer(project)(transformStageCounter.incrementAndGet())
// val finalPlanP2 = finalPlanP.makeCopy(Array(WholeStageTransformer(
//   finalPlanP.children(0))(transformStageCounter.incrementAndGet())))
// println("\n\n finalPlanP2: " + finalPlanP2.supportsColumnar + " - " + finalPlanP2)

// I worked around it by using TakeOrderedAndProjectExecTransformer, which supports columnar without WholeStageTransformer.
val project = finalPlanP.child.asInstanceOf[ProjectExecTransformer]
val projectWrapped = TakeOrderedAndProjectExecTransformer(Int.MaxValue, Seq.empty, project.projectList, project.child)
val finalPlanP2 = finalPlanP.copy(outputPartitioning = finalPlanP.outputPartitioning, child = projectWrapped)
AddTransformHintRule().apply(finalPlanP2)

val finalPlan = finalPlanP2
println("\n\n finalPlan: " + finalPlan)
println("execute: " + VeloxColumnarToRowExec(finalPlan).execute.count)
Another thing I may need to work around: I was planning to inherit from Exchange or ShuffleExchangeLike, but Gluten matches on ShuffleExchangeExec (which is a case class): https://github.com/oap-project/gluten/blob/18b33d0cb4dfcd54edbfebeb13f0d2d7eb3dcaf6/gluten-core/src/main/scala/io/glutenproject/extension/ColumnarOverrides.scala#L381
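To illustrate the constraint (this is just a sketch of the pattern-matching difference, not Gluten's code):

import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.execution.exchange.{ShuffleExchangeExec, ShuffleExchangeLike}

// Matching on the concrete case class only recognizes Spark's own operator...
def matchesConcreteExchange(plan: SparkPlan): Boolean = plan match {
  case _: ShuffleExchangeExec => true
  case _ => false
}

// ...while matching on the trait would also admit a custom exchange, e.g. an
// OptimizeWriteExchangeExec that extends ShuffleExchangeLike.
def matchesShuffleLikeExchange(plan: SparkPlan): Boolean = plan match {
  case _: ShuffleExchangeLike => true
  case _ => false
}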
@felipepessoto Could you let us know the reason you are using the API in this manner? Gluten provides lots of columnar rules to ensure the correct formulation of the final physical plan. It appears, however, that you might not be fully utilizing these columnar rules of Gluten to guarantee the optimization of the final physical plan. Instead, it seems you are attempting to rewrite some logic within the columnar rules. Understanding your rationale or method here would be helpful.
@marin-ma I'm writing a Columnar version of Optimized Write: https://github.com/delta-io/delta/pull/1198/files#diff-5648029472acd991211c6216d31879fe87f33e6aba46653e7a10ccd3bcff8389 (OptimizeWriteExchangeExec).
My initial expectation was that I'd only need to add a doExecuteColumnar(), replace ShuffledRowRDD with ShuffledColumnarBatchRDD, and build a shuffleDependency that uses ColumnarShuffleExchangeExec.prepareShuffleDependency instead of ShuffleExchangeExec.prepareShuffleDependency. That works, as long as I don't use HashPartitioning.
To use HashPartitioning I had to add the call to replaceWithTransformerPlan, plus the hack to fix the plan it generates.
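For context, a structural sketch of where those hooks live on the Spark side; the body of doExecuteColumnar is deliberately elided, and the Gluten classes mentioned above (ShuffledColumnarBatchRDD, ColumnarShuffleExchangeExec.prepareShuffleDependency) are only referenced in comments, since their exact signatures are assumptions here:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.Attribute
import org.apache.spark.sql.execution.{SparkPlan, UnaryExecNode}
import org.apache.spark.sql.vectorized.ColumnarBatch

// Hypothetical name; not the class from the Delta PR.
case class ColumnarOptimizeWriteSketch(child: SparkPlan) extends UnaryExecNode {
  override def output: Seq[Attribute] = child.output
  override def supportsColumnar: Boolean = true

  override protected def doExecute(): RDD[InternalRow] =
    throw new UnsupportedOperationException("columnar only")

  override protected def doExecuteColumnar(): RDD[ColumnarBatch] = {
    // A real implementation would build a columnar ShuffleDependency via
    // ColumnarShuffleExchangeExec.prepareShuffleDependency and wrap it in
    // ShuffledColumnarBatchRDD (both Gluten classes; signatures assumed).
    throw new UnsupportedOperationException("sketch only: dependency construction elided")
  }

  override protected def withNewChildInternal(newChild: SparkPlan): SparkPlan =
    copy(child = newChild)
}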
Thanks.
@marin-ma, with https://github.com/oap-project/gluten/pull/4167 and by changing OptimizeWriteExchangeExec to inherit from ShuffleExchangeLike, I can make it work (at least it doesn't throw errors; I'm still validating the behavior). But it still requires workarounds.
Could you help me understand what I'm doing wrong? I have some questions about how Gluten works:
1. When do we need to call
AddTransformHintRule().apply(owPlanWraped)
TransformPreOverrides(false).replaceWithTransformerPlan(owPlanWraped)
In some cases we don't call them.
2. When should we call TransformPreOverrides with the false parameter and when with true?
3. I had to hack the ProjectExecTransformer (that was created by replaceWithTransformerPlan) by replacing it with TakeOrderedAndProjectExecTransformer. Do you know what I'm doing wrong?
3.1. Why does ProjectExecTransformer need a WholeStage wrapper while TakeOrderedAndProjectExecTransformer doesn't?
3.2. Could the missing WholeStage be a bug on the Gluten side, given that the ProjectExecTransformer is created by replaceWithTransformerPlan?
3.3. Why do I receive a This operator doesn't support doTransform with SubstraitContext error if I try to manually wrap the ProjectExecTransformer?
@felipepessoto Whether hash partitioning is used or not, Gluten's rules are what ensure the final physical plan is generated correctly. Therefore, I still suggest using GlutenPlugin rather than rewriting that logic.
https://github.com/delta-io/delta/pull/1198/files#diff-da2c9be25dd00a5a2abbdabc45387415eb511c7dba019cfbb175c222286fc5f5R306-R309
I have a question here. It looks like it adds OptimizedWriteExec by directly modifying QueryExecution.executedPlan. Does this modification take effect with AQE both on and off?
Yes, it should work even when AQE is enabled.
The reason I need to rewrite the physical plan is that the existing repartition/rebalance generates a very large partition and writes it as a single file. Delta Optimized Write improves on that by generating files close to the target size, which is only possible by changing the physical plan.
Backend
VL (Velox)
Bug description
When I use ColumnarShuffleExchangeExec and ShuffleExchangeExec with HashPartitioning, I receive the error below. I tried to isolate the issue and created this repro code. It only fails when the partitioning is HashPartitioning; SinglePartition and RoundRobinPartitioning work fine.
I created it based on examples I found in the Gluten source, like https://github.com/oap-project/gluten/blob/81bb6c9b0652ec4df39e6f50c0405756b59d5a3d/gluten-core/src/main/scala/io/glutenproject/execution/TakeOrderedAndProjectExecTransformer.scala#L103
Repro code:
Spark version
Spark-3.3.x
Spark configurations
Running using unit tests
System information
No response
Relevant logs