apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[CH] Got Exception: The order of aggregation result columns is invalid #8142

Open lgbo-ustc opened 11 hours ago

lgbo-ustc commented 11 hours ago

Backend

CH (ClickHouse)

Bug description

Job aborted due to stage failure: Task 0 in stage 323.0 failed 2 times, most recent failure: Lost task 0.1 in stage 323.0 (TID 46601) (sg-dn3538.bigdata.bigo.inner executor 2054): org.apache.gluten.exception.GlutenException: The order of aggregation result columns is invalid
0. ../contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x0000000014a82559
1. ./build_new/../src/Common/Exception.cpp:109: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x00000000069dfc39
2. ../src/Common/Exception.h:111: DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000689598c
3. ../src/Common/Exception.h:129: DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x00000000068879eb
4. ./build_new/../utils/extern-local-engine/Parser/RelParsers/AggregateRelParser.cpp:98: local_engine::AggregateRelParser::parse(std::unique_ptr<DB::QueryPlan, std::default_delete<DB::QueryPlan>>, substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x0000000006df0284
5. ./build_new/../utils/extern-local-engine/Parser/SerializedPlanParser.cpp:277: local_engine::SerializedPlanParser::parseOp(substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x0000000006dac92e
6. ./build_new/../utils/extern-local-engine/Parser/SerializedPlanParser.cpp:212: local_engine::SerializedPlanParser::parse(substrait::Plan const&) @ 0x0000000006dabe6f
7. ./build_new/../utils/extern-local-engine/Parser/SerializedPlanParser.cpp:226: local_engine::SerializedPlanParser::createExecutor(substrait::Plan const&) @ 0x0000000006dad30f
8. ./build_new/../utils/extern-local-engine/local_engine_jni.cpp:270: 

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

lgbo-ustc commented 11 hours ago

The grouping keys are

coalesce(col_0,col_15,right_3032.col_0),
coalesce(col_1,col_16,right_3032.col_1),
coalesce(col_2,col_17,right_3032.col_2),
coalesce(col_3,col_18,all_3033),
coalesce(col_4,col_19,right_3032.col_3),
coalesce(col_5,right_3032.col_4,all_3034),
coalesce(col_20,0_3035),
coalesce(col_21,0_3044),
coalesce(col_6,0_3036),
coalesce(col_7,0_3037),
coalesce(col_22,0_3045),
coalesce(col_24,0_3038),
coalesce(col_13,0_3039),
coalesce(col_23,0_3046),
coalesce(sparkDivide(CAST(coalesce(col_13,0_3051),Float64_3052),CAST(coalesce(col_6,0_3053),Float64_3054)),0_3055),
coalesce(col_8,0_3047),
coalesce(col_9,0_3048),
coalesce(col_10,0_3040),
coalesce(col_14,0_3041),
coalesce(sparkDivide(CAST(coalesce(col_14,0_3056),Float64_3057),CAST(coalesce(col_10,0_3058),Float64_3059)),0_3060),
coalesce(col_11,0_3049),
coalesce(col_12,0_3050),
coalesce(right_3032.col_5,0_3042),
coalesce(right_3032.col_6,0_3043),
coalesce(sparkDivide(CAST(coalesce(right_3032.col_6,0_3061),Float64_3062),CAST(coalesce(right_3032.col_5,0_3063),Float64_3064)),0_3065),
coalesce(col_0,col_15,right_3032.col_0)

The expression coalesce(col_0,col_15,right_3032.col_0) appears twice in the grouping keys (as the first and the last entry), and it is duplicated in the header as well.
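To make the duplication concrete, here is a minimal sketch of how duplicated grouping key names could be detected, or collapsed while keeping the original order. It is not the code in AggregateRelParser.cpp and not a proposed fix: `findDuplicateKeys` and `dedupKeys` are hypothetical helpers, and the keys are treated as plain strings, which is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not part of Gluten or ClickHouse):
// returns every key name that occurs more than once, each reported once.
std::vector<std::string> findDuplicateKeys(const std::vector<std::string> & keys)
{
    std::unordered_map<std::string, std::size_t> counts;
    std::vector<std::string> duplicates;
    for (const auto & key : keys)
    {
        // Record the key the moment its second occurrence is seen.
        if (++counts[key] == 2)
            duplicates.push_back(key);
    }
    return duplicates;
}

// Hypothetical helper: keeps the first occurrence of each key
// and preserves the relative order of the remaining keys.
std::vector<std::string> dedupKeys(const std::vector<std::string> & keys)
{
    std::unordered_set<std::string> seen;
    std::vector<std::string> result;
    for (const auto & key : keys)
    {
        // insert().second is true only when the key was not seen before.
        if (seen.insert(key).second)
            result.push_back(key);
    }
    return result;
}
```

Applied to the list above, `findDuplicateKeys` would report only coalesce(col_0,col_15,right_3032.col_0). Whether dropping the second occurrence is actually safe depends on how the aggregate's result columns refer to the grouping keys, which this sketch does not model.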

lgbo-ustc commented 10 hours ago

Doesn't the following distinct work? https://github.com/apache/incubator-gluten/blob/ff945f99452cf471122255fe5f549e836a86aee4/backends-clickhouse/src/main/scala/org/apache/gluten/backendsapi/clickhouse/CHSparkPlanExecApi.scala#L165-L172

lgbo-ustc commented 9 hours ago

Some related PRs: #7368, #7101