apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

[CH] Refactor JIT settings in ClickHouse #7534

Open taiyang-li opened 1 month ago

taiyang-li commented 1 month ago

Description

In the old version, min_count_to_compile_expression was 3, which meant that each executor had to encounter the same CompileDAG more than three times before it was compiled. That condition is hard to meet in Spark execution, so the JIT in the old version was almost ineffective.
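
To make the threshold effect concrete, here is a minimal sketch of a per-executor compile gate. It is a toy model built only from the description above, not ClickHouse's actual implementation; the `CompileGate` class, the `compiled_runs` helper, and the `"filter_dag"` key are hypothetical names used for illustration.

```python
from collections import Counter


class CompileGate:
    """Toy model of a per-executor compile threshold (not ClickHouse's code)."""

    def __init__(self, min_count_to_compile_expression: int):
        self.min_count = min_count_to_compile_expression
        self.seen = Counter()   # uncompiled encounters per DAG hash
        self.compiled = set()   # DAG hashes that already have compiled code

    def encounter(self, dag_hash: str) -> bool:
        """Return True if this encounter runs compiled code."""
        if dag_hash in self.compiled:
            return True
        if self.seen[dag_hash] >= self.min_count:
            # Threshold reached: compile now and use the result.
            self.compiled.add(dag_hash)
            return True
        self.seen[dag_hash] += 1
        return False


def compiled_runs(min_count: int, encounters: int) -> int:
    """Count how many of `encounters` runs of one DAG use compiled code."""
    gate = CompileGate(min_count)
    return sum(gate.encounter("filter_dag") for _ in range(encounters))
```

In this model, an executor that sees the same DAG three times with a threshold of 3 never runs compiled code, while a threshold of 0 compiles on the first encounter.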

Users can control whether JIT is enabled through Spark conf. By default, compile_expressions is true and min_count_to_compile_expression is 0.
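
As a rough illustration of passing these two settings, the sketch below builds `--conf key=value` arguments for spark-submit. The setting names come from this issue, but the key prefix is an assumption that is not stated here; check it against the PR and your Gluten version before using it. `jit_conf` and `as_submit_args` are hypothetical helpers.

```python
# Assumption: Gluten's ClickHouse backend forwards settings carrying this
# prefix to the native engine. This prefix is NOT confirmed by the issue text.
ASSUMED_PREFIX = "spark.gluten.sql.columnar.backend.ch.runtime_settings."


def jit_conf(enabled: bool = True, min_count: int = 0,
             prefix: str = ASSUMED_PREFIX) -> dict:
    """Build Spark conf entries for the two JIT settings named in this issue."""
    return {
        prefix + "compile_expressions": str(enabled).lower(),
        prefix + "min_count_to_compile_expression": str(min_count),
    }


def as_submit_args(conf: dict) -> list:
    """Turn a conf dict into spark-submit style `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

Calling `as_submit_args(jit_conf(enabled=False))` would produce the arguments for a run without JIT, which is one way to set up the comparison in the next comment.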

taiyang-li commented 1 month ago

Compare expression execution performance with and without JIT

ClickHouse is compiled with patches from https://github.com/ClickHouse/ClickHouse/pull/70598. Gluten is compiled from the branch in PR https://github.com/apache/incubator-gluten/pull/7536

Query

subquery extracted from d_5231_0.sql

select day, avg(if(totaltime = 60 and frame_sent>0 and frame_sent/60<60,frame_sent/60,
                      if(totaltime <> 60 and frame_sent>0 and frame_sent/totaltime< 60,frame_sent/totaltime,null))) as avg_enc_frame_rate
            from
            (
                select day, uid 
                ,cast(other_unfixed_para['1018'] as bigint)-cast(other_unfixed_para['1019'] as bigint)-cast(other_unfixed_para['1020'] as bigint) as frame_sent
                ,if((other_unfixed_para['65'] = 4294967295 OR other_unfixed_para['65'] is null OR other_unfixed_para['65']%60=0),60,other_unfixed_para['65']%60) as totaltime
                from bigolive.live_sdk_video_stats_event_all
                where day='${day}'
                and cast ((cast(other_unfixed_para['1036'] as bigint) & 2) as bigint) =2
                and (other_unfixed_para['65']>0 or other_unfixed_para['65'] is null)
                and if(((cast(other_unfixed_para['1036'] as bigint) & 524288) /524288)=1 and other_unfixed_para['65'] = 4294967295,0,1)=1
                and cast(other_unfixed_para['92'] as bigint)>0
                and cast(other_unfixed_para['1018'] as bigint)>= 0 and cast(other_unfixed_para['1019'] as bigint)>= 0 and cast(other_unfixed_para['1020'] as bigint)>= 0
            ) tt
            group by day 

Metrics without JIT

image

image

Metrics with JIT

image

image

taiyang-li commented 1 month ago

Some useful logs during JIT execution

Expressions before compilation

2024-10-14 16:01:27.676 <Debug> ExpressionActions: Actions before compilation: 0 : INPUT () (no column) Nullable(String) uid
1 : INPUT () (no column) Nullable(Map(String, Nullable(String))) other_unfixed_para
2 : INPUT () (no column) Nullable(String) day
3 : FUNCTION (1) (no column) UInt8 isNotNull(other_unfixed_para) [isNotNull]
4 : COLUMN () Const(String) String 1036_0
5 : FUNCTION (1, 4) (no column) Nullable(String) arrayElement(other_unfixed_para,1036_0) [arrayElement]
6 : FUNCTION (5) (no column) Nullable(String) trim(arrayElement(other_unfixed_para,1036_0)) [trimBoth]
7 : COLUMN () Const(String) String Nullable(I_1
8 : FUNCTION (6, 7) (no column) Nullable(Int64) CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1) [CAST]
9 : COLUMN () Const(Int64) Int64 2_2
10 : FUNCTION (8, 9) (no column) Nullable(Int64) bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2) [bitAnd]
11 : COLUMN () Const(Int64) Int64 2_3
12 : FUNCTION (10, 11) (no column) Nullable(UInt8) equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3) [equals]
13 : FUNCTION (3, 12) (no column) Nullable(UInt8) and(isNotNull(other_unfixed_para),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3)) [and]
14 : COLUMN () Const(String) String 65_4
15 : FUNCTION (1, 14) (no column) Nullable(String) arrayElement(other_unfixed_para,65_4) [arrayElement]
16 : FUNCTION (15) (no column) Nullable(String) trim(arrayElement(other_unfixed_para,65_4)) [trimBoth]
17 : COLUMN () Const(String) String Nullable(I_5
18 : FUNCTION (16, 17) (no column) Nullable(Int64) CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5) [CAST]
19 : COLUMN () Const(Int64) Int64 4294967295_6
20 : FUNCTION (18, 19) (no column) Nullable(UInt8) equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6) [equals]
21 : COLUMN () Const(String) String 65_7
22 : FUNCTION (1, 21) (no column) Nullable(String) arrayElement(other_unfixed_para,65_7) [arrayElement]
23 : COLUMN () Const(String) String Nullable(F_8
24 : FUNCTION (22, 23) (no column) Nullable(Float64) CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8) [CAST]
25 : COLUMN () Const(Float64) Float64 60_9
26 : FUNCTION (24, 25) (no column) Nullable(Float64) modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9) [modulo]
27 : COLUMN () Const(Float64) Float64 0_10
28 : FUNCTION (26, 27) (no column) Nullable(UInt8) equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10) [equals]
29 : FUNCTION (20, 28) (no column) Nullable(UInt8) or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)) [or]
30 : COLUMN () Const(String) String 65_11
31 : FUNCTION (1, 30) (no column) Nullable(String) arrayElement(other_unfixed_para,65_11) [arrayElement]
32 : COLUMN () Const(String) String Nullable(F_12
33 : FUNCTION (31, 32) (no column) Nullable(Float64) CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12) [CAST]
34 : COLUMN () Const(Float64) Float64 60_13
35 : FUNCTION (33, 34) (no column) Nullable(Float64) modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13) [modulo]
36 : COLUMN () Const(Float64) Float64 30_14
37 : FUNCTION (35, 36) (no column) Nullable(UInt8) greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14) [greaterOrEquals]
38 : COLUMN () Const(String) String 65_15
39 : FUNCTION (1, 38) (no column) Nullable(String) arrayElement(other_unfixed_para,65_15) [arrayElement]
40 : FUNCTION (39) (no column) UInt8 isNull(arrayElement(other_unfixed_para,65_15)) [isNull]
41 : FUNCTION (37, 40) (no column) Nullable(UInt8) or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15))) [or]
42 : FUNCTION (29, 41) (no column) Nullable(UInt8) or(or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)),or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15)))) [or]
43 : FUNCTION (13, 42) (no column) Nullable(UInt8) and(and(isNotNull(other_unfixed_para),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3)),or(or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)),or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15))))) [and]
44 : COLUMN () Const(String) String 1511_16
45 : FUNCTION (1, 44) (no column) Nullable(String) arrayElement(other_unfixed_para,1511_16) [arrayElement]
46 : FUNCTION (45) (no column) UInt8 isNull(arrayElement(other_unfixed_para,1511_16)) [isNull]
47 : FUNCTION (43, 46) (no column) Nullable(UInt8) and(and(and(isNotNull(other_unfixed_para),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3)),or(or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)),or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15))))),isNull(arrayElement(other_unfixed_para,1511_16))) [and]
48 : COLUMN () Const(String) String 1036_17
49 : FUNCTION (1, 48) (no column) Nullable(String) arrayElement(other_unfixed_para,1036_17) [arrayElement]
50 : FUNCTION (49) (no column) Nullable(String) trim(arrayElement(other_unfixed_para,1036_17)) [trimBoth]
51 : COLUMN () Const(String) String Nullable(I_18
52 : FUNCTION (50, 51) (no column) Nullable(Int64) CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18) [CAST]
53 : COLUMN () Const(Int64) Int64 262144_19
54 : FUNCTION (52, 53) (no column) Nullable(Int64) bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18),262144_19) [bitAnd]
55 : COLUMN () Const(Int64) Int64 0_20
56 : FUNCTION (54, 55) (no column) Nullable(UInt8) equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18),262144_19),0_20) [equals]
57 : FUNCTION (47, 56) (no column) Nullable(UInt8) and(and(and(and(isNotNull(other_unfixed_para),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3)),or(or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)),or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15))))),isNull(arrayElement(other_unfixed_para,1511_16))),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18),262144_19),0_20)) [and]
58 : FUNCTION (0) (no column) UInt8 isNotNull(uid) [isNotNull]
59 : FUNCTION (57, 58) (no column) Nullable(UInt8) and(and(and(and(and(isNotNull(other_unfixed_para),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3)),or(or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)),or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15))))),isNull(arrayElement(other_unfixed_para,1511_16))),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18),262144_19),0_20)),isNotNull(uid)) [and]
Output nodes: 0 1 2 59
 with 0 lazy_executed_nodes

Function before compilation

and(
    and(
        and(
            and(
                and(
                    isNotNull(other_unfixed_para),
                    equals(
                        bitAnd(
                            CAST(trim(arrayElement(other_unfixed_para, 1036_0)), Nullable(I_1)),
                            2_2
                        ),
                        2_3
                    )
                ),
                or(
                    or(
                        equals(
                            CAST(trim(arrayElement(other_unfixed_para, 65_4)), Nullable(I_5)),
                            4294967295_6
                        ),
                        equals(
                            modulo(
                                CAST(arrayElement(other_unfixed_para, 65_7), Nullable(F_8)),
                                60_9
                            ),
                            0_10
                        )
                    ),
                    or(
                        greaterOrEquals(
                            modulo(
                                CAST(arrayElement(other_unfixed_para, 65_11), Nullable(F_12)),
                                60_13
                            ),
                            30_14
                        ),
                        isNull(arrayElement(other_unfixed_para, 65_15))
                    )
                )
            ),
            isNull(arrayElement(other_unfixed_para, 1511_16))
        ),
        equals(
            bitAnd(
                CAST(trim(arrayElement(other_unfixed_para, 1036_17)), Nullable(I_18)),
                262144_19
            ),
            0_20
        )
    ),
    isNotNull(uid)
)

Expressions after compilation

2024-10-14 16:01:27.753 <Debug> ExpressionActions: Actions after compilation: 0 : INPUT () (no column) Nullable(String) uid
1 : INPUT () (no column) Nullable(Map(String, Nullable(String))) other_unfixed_para
2 : INPUT () (no column) Nullable(String) day
3 : FUNCTION (1) (no column) UInt8 isNotNull(other_unfixed_para) [isNotNull]
4 : COLUMN () Const(String) String 1036_0
5 : FUNCTION (1, 4) (no column) Nullable(String) arrayElement(other_unfixed_para,1036_0) [arrayElement]
6 : FUNCTION (5) (no column) Nullable(String) trim(arrayElement(other_unfixed_para,1036_0)) [trimBoth]
7 : COLUMN () Const(String) String Nullable(I_1
8 : FUNCTION (6, 7) (no column) Nullable(Int64) CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1) [CAST]
9 : COLUMN () Const(String) String 65_4
10 : FUNCTION (1, 9) (no column) Nullable(String) arrayElement(other_unfixed_para,65_4) [arrayElement]
11 : FUNCTION (10) (no column) Nullable(String) trim(arrayElement(other_unfixed_para,65_4)) [trimBoth]
12 : COLUMN () Const(String) String Nullable(I_5
13 : FUNCTION (11, 12) (no column) Nullable(Int64) CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5) [CAST]
14 : COLUMN () Const(String) String 65_7
15 : FUNCTION (1, 14) (no column) Nullable(String) arrayElement(other_unfixed_para,65_7) [arrayElement]
16 : COLUMN () Const(String) String Nullable(F_8
17 : FUNCTION (15, 16) (no column) Nullable(Float64) CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8) [CAST]
18 : COLUMN () Const(Float64) Float64 60_9
19 : FUNCTION (17, 18) (no column) Nullable(Float64) modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9) [modulo]
20 : COLUMN () Const(String) String 65_11
21 : FUNCTION (1, 20) (no column) Nullable(String) arrayElement(other_unfixed_para,65_11) [arrayElement]
22 : COLUMN () Const(String) String Nullable(F_12
23 : FUNCTION (21, 22) (no column) Nullable(Float64) CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12) [CAST]
24 : COLUMN () Const(Float64) Float64 60_13
25 : FUNCTION (23, 24) (no column) Nullable(Float64) modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13) [modulo]
26 : COLUMN () Const(String) String 65_15
27 : FUNCTION (1, 26) (no column) Nullable(String) arrayElement(other_unfixed_para,65_15) [arrayElement]
28 : FUNCTION (27) (no column) UInt8 isNull(arrayElement(other_unfixed_para,65_15)) [isNull]
29 : COLUMN () Const(String) String 1511_16
30 : FUNCTION (1, 29) (no column) Nullable(String) arrayElement(other_unfixed_para,1511_16) [arrayElement]
31 : FUNCTION (30) (no column) UInt8 isNull(arrayElement(other_unfixed_para,1511_16)) [isNull]
32 : COLUMN () Const(String) String 1036_17
33 : FUNCTION (1, 32) (no column) Nullable(String) arrayElement(other_unfixed_para,1036_17) [arrayElement]
34 : FUNCTION (33) (no column) Nullable(String) trim(arrayElement(other_unfixed_para,1036_17)) [trimBoth]
35 : COLUMN () Const(String) String Nullable(I_18
36 : FUNCTION (34, 35) (no column) Nullable(Int64) CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18) [CAST]
37 : FUNCTION (0) (no column) UInt8 isNotNull(uid) [isNotNull]
38 : FUNCTION (3, 8, 13, 19, 25, 28, 31, 36, 37) (no column) Nullable(UInt8) and(and(and(and(and(isNotNull(other_unfixed_para),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_0)),Nullable(I_1),2_2),2_3)),or(or(equals(CAST(trim(arrayElement(other_unfixed_para,65_4)),Nullable(I_5),4294967295_6),equals(modulo(CAST(arrayElement(other_unfixed_para,65_7),Nullable(F_8),60_9),0_10)),or(greaterOrEquals(modulo(CAST(arrayElement(other_unfixed_para,65_11),Nullable(F_12),60_13),30_14),isNull(arrayElement(other_unfixed_para,65_15))))),isNull(arrayElement(other_unfixed_para,1511_16))),equals(bitAnd(CAST(trim(arrayElement(other_unfixed_para,1036_17)),Nullable(I_18),262144_19),0_20)),isNotNull(uid)) [and(and(and(and(and(UInt8, equals(bitAnd(Nullable(Int64), 2 : Int64), 2 : Int64)), or(or(equals(Nullable(Int64), 4294967295 : Int64), equals(Nullable(Float64), 0. : Float64)), or(greaterOrEquals(Nullable(Float64), 30. : Float64), UInt8))), UInt8), equals(bitAnd(Nullable(Int64), 262144 : Int64), 0 : Int64)), UInt8)] [compiled]
Output nodes: 0 1 2 38 

Function after compilation

and(
    and(
        and(
            and(
                and(
                    UInt8,
                    equals(
                        bitAnd(Nullable(Int64), 2 : Int64),
                        2 : Int64
                    )
                ),
                or(
                    or(
                        equals(Nullable(Int64), 4294967295 : Int64),
                        equals(Nullable(Float64), 0.0 : Float64)
                    ),
                    or(
                        greaterOrEquals(Nullable(Float64), 30.0 : Float64),
                        UInt8
                    )
                )
            ),
            UInt8
        ),
        equals(
            bitAnd(Nullable(Int64), 262144 : Int64),
            0 : Int64
        )
    ),
    UInt8
)
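
The two dumps show 60 actions (0 to 59) before compilation and 39 actions (0 to 38) after it: the nested and/or/equals/bitAnd/greaterOrEquals calls collapse into one action marked [compiled] that takes nine inputs. The sketch below illustrates the general idea of fusion on one sub-expression, equals(bitAnd(x, 2), 2), using plain Python lists as nullable columns. It is a conceptual model only and does not reflect the LLVM code ClickHouse generates; all function names are hypothetical.

```python
from typing import List, Optional

Column = List[Optional[int]]  # None stands for a NULL value


def bit_and_const(col: Column, const: int) -> Column:
    """One pass over the column; materializes an intermediate column."""
    return [None if v is None else v & const for v in col]


def equals_const(col: Column, const: int) -> Column:
    """Second pass; returns 1/0, or None when the input is NULL."""
    return [None if v is None else int(v == const) for v in col]


def interpreted(col: Column) -> Column:
    """Function-at-a-time evaluation: two passes, one temporary column."""
    return equals_const(bit_and_const(col, 2), 2)


def fused(col: Column) -> Column:
    """Fused evaluation: the whole sub-expression in a single pass per row."""
    out: Column = []
    for v in col:
        out.append(None if v is None else int((v & 2) == 2))
    return out
```

Both functions return the same result for the same input; the fused form avoids the intermediate column and the extra pass.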

Task performance counters

---------------------Task Performance Counters-----------------------------
CompileFunction                                   |1                    | (Number of times a compilation of generated LLVM code (to create fused function for complex expressions) was initiated.)
CompiledFunctionExecute                           |12                   | (Number of times a compiled function was executed.)
CompileExpressionsMicroseconds                    |34682                | (Total time spent for compilation of expressions to LLVM code.)
CompileExpressionsBytes                           |8192                 | (Number of bytes used for expressions compilation.)
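
To read these counters at a glance, the following sketch parses the name and value columns and derives two simple figures: compilation time in milliseconds and executions per compilation. The counter names and values are copied from the table above; `parse_counters` and `summarize` are hypothetical helpers.

```python
COUNTERS = """\
CompileFunction                                   |1                    |
CompiledFunctionExecute                           |12                   |
CompileExpressionsMicroseconds                    |34682                |
CompileExpressionsBytes                           |8192                 |
"""


def parse_counters(text: str) -> dict:
    """Parse `name | value | ...` rows into a {name: int} mapping."""
    counters = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) >= 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def summarize(counters: dict) -> dict:
    """Derive compile time in ms and compiled executions per compilation."""
    compiles = counters["CompileFunction"]
    return {
        "compile_ms": counters["CompileExpressionsMicroseconds"] / 1000,
        "executions_per_compile": counters["CompiledFunctionExecute"] / compiles,
    }
```

For this task, the single compilation took about 34.7 ms and the compiled function was then executed 12 times.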