Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
[GLUTEN-7028][CH][Part-8] Support one pipeline write for partition mergetree #7924
Closed
baibaichen closed 1 week ago
What changes were proposed in this pull request?
(Fixes: #7028) The following diagram shows the current class hierarchy:
SparkPartitionedBaseSink inherits from ClickHouse's DB::PartitionedSink.
The partitioned MergeTree pipeline write looks like this; it squashes blocks before partitioning, across the whole input:
This differs from Spark 3.3, which squashes blocks after partitioning, separately for each partition, since partitioning is triggered by the JVM.
The new implementation is the same as ClickHouse's.
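To make the ordering difference concrete, here is a toy sketch in Python. It is not Gluten or ClickHouse code: the helper names (squash, partition, squash_then_partition, partition_then_squash) and the min_rows threshold are hypothetical, and a block is modeled as a plain list of rows. It only illustrates how the same input produces different per-partition blocks depending on whether squashing happens before or after partitioning.

```python
from collections import defaultdict


def squash(rows, min_rows):
    """Group rows into blocks of min_rows rows; the last block may be smaller."""
    blocks, current = [], []
    for row in rows:
        current.append(row)
        if len(current) >= min_rows:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def partition(rows, key):
    """Split rows by partition key, keeping first-seen key order."""
    parts = defaultdict(list)
    for row in rows:
        parts[key(row)].append(row)
    return dict(parts)


def squash_then_partition(rows, key, min_rows):
    """Toy model of the one-pipeline write: squash the whole input, then partition each block."""
    out = []
    for block in squash(rows, min_rows):
        for part_key, part_rows in partition(block, key).items():
            out.append((part_key, part_rows))
    return out


def partition_then_squash(rows, key, min_rows):
    """Toy model of the Spark 3.3 path: partition first, then squash within each partition."""
    out = []
    for part_key, part_rows in partition(rows, key).items():
        for block in squash(part_rows, min_rows):
            out.append((part_key, block))
    return out


# Hypothetical input: (partition_key, value) rows that alternate between partitions.
rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
by_key = lambda row: row[0]

before = squash_then_partition(rows, by_key, min_rows=2)
after = partition_then_squash(rows, by_key, min_rows=2)
print(len(before), before)  # 4 single-row blocks
print(len(after), after)    # 2 two-row blocks
```

In this toy model, squashing before partitioning can hand each partition smaller blocks when keys are interleaved, while squashing after partitioning yields fuller blocks per partition. Whether that matters in practice depends on the real sinks, which this sketch does not model.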
How was this patch tested?
Using existing UTs.