Closed by andygrove 2 weeks ago
microbenchmark results @ sf=1
AMD Ryzen 9 7950X3D 16-Core Processor
TPCDS Micro Benchmarks:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
add_many_decimals                                   809            872          67          3.6         281.1       1.0X
add_many_decimals                                   770            788          16          3.7         267.5       1.1X
add_many_decimals: Comet (Scan)                     930            952          38          3.1         323.0       0.9X
add_many_decimals: Comet (Scan, Exec)              2021           2030          12          1.4         701.9       0.4X
Benchmark runs at sf=100 suggest that reading decimals from Parquet could be a performance issue.
TPCDS Micro Benchmarks:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
add_many_decimals                                 20502          20648         208         14.0          71.2       1.0X
add_many_decimals                                 20498          20544          65         14.1          71.2       1.0X
add_many_decimals: Comet (Scan)                   28143          28161          26         10.2          97.7       0.7X
add_many_decimals: Comet (Scan, Exec)             19323          19497         246         14.9          67.1       1.1X
TPCDS Micro Benchmarks:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------
agg_sum_decimals_no_grouping                              10552          10583          44         27.3          36.6       1.0X
agg_sum_decimals_no_grouping                              10406          10450          61         27.7          36.1       1.0X
agg_sum_decimals_no_grouping: Comet (Scan)                46013          46278         375          6.3         159.8       0.2X
agg_sum_decimals_no_grouping: Comet (Scan, Exec)          13840          13956         164         20.8          48.1       0.8X
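A note on reading these tables: the Relative column appears to be the first row's best time divided by each row's best time, printed to one decimal place. That is an assumption inferred from the numbers, not something stated in this thread. The sketch below recomputes it for the sf=100 rows; the `relative` and `report` helpers and the "baseline" label for the unlabelled first row are hypothetical names used only for illustration.

```python
# Best Time (ms) values copied from the sf=100 tables above.
# Assumption: Relative = best time of the first row / best time of this row.
BEST_MS = {
    "add_many_decimals": {
        "baseline": 20502,
        "Comet (Scan)": 28143,
        "Comet (Scan, Exec)": 19323,
    },
    "agg_sum_decimals_no_grouping": {
        "baseline": 10552,
        "Comet (Scan)": 46013,
        "Comet (Scan, Exec)": 13840,
    },
}


def relative(baseline_ms: float, case_ms: float) -> str:
    """Format the ratio to the baseline the way the table prints it."""
    return f"{baseline_ms / case_ms:.1f}X"


def report() -> dict:
    """Recompute the Relative column for every benchmark in BEST_MS."""
    out = {}
    for bench, cases in BEST_MS.items():
        base = cases["baseline"]
        out[bench] = {name: relative(base, ms) for name, ms in cases.items()}
    return out


if __name__ == "__main__":
    for bench, rows in report().items():
        for name, rel in rows.items():
            print(f"{bench:30s} {name:20s} {rel}")
```

Read this way, the scan-only configuration takes about 4.4 times as long as the baseline for the decimal sum (46013 ms vs 10552 ms), while the same configuration is only about 1.4 times slower for add_many_decimals.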
It looks like once Comet scan is enabled, GangWorker uses more time.
33 iterations at sf=1.
Pure scan, without sum(decimals): [screenshot]
Scan with Spark sum(decimals): [screenshot]
So somehow, once Sum is applied, Comet scan slows down...
I think these two screen shots are identical?
Thanks @andygrove, updated to the correct one.
After #741 there are still some issues with if and case.
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
add_many_decimals 11255 11385 184 25.6 39.1 1.0X
add_many_decimals: Comet (Scan) 14372 14545 245 20.0 49.9 0.8X
add_many_decimals: Comet (Scan, Exec) 9846 9933 123 29.3 34.2 1.1X
Running benchmark: TPCDS Micro Benchmarks
Running case: add_many_integers
Stopped after 2 iterations, 7870 ms
Running case: add_many_integers: Comet (Scan)
Stopped after 2 iterations, 6910 ms
Running case: add_many_integers: Comet (Scan, Exec)
Stopped after 2 iterations, 7138 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
add_many_integers 3866 3935 98 74.5 13.4 1.0X
add_many_integers: Comet (Scan) 3450 3455 7 83.5 12.0 1.1X
add_many_integers: Comet (Scan, Exec) 3548 3569 30 81.2 12.3 1.1X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_high_cardinality
Stopped after 2 iterations, 3620 ms
Running case: agg_high_cardinality: Comet (Scan)
Stopped after 2 iterations, 5411 ms
Running case: agg_high_cardinality: Comet (Scan, Exec)
Stopped after 2 iterations, 2126 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg_high_cardinality 1769 1810 59 40.7 24.6 1.0X
agg_high_cardinality: Comet (Scan) 2642 2706 90 27.3 36.7 0.7X
agg_high_cardinality: Comet (Scan, Exec) 1060 1063 4 67.9 14.7 1.7X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_low_cardinality
Stopped after 5 iterations, 2032 ms
Running case: agg_low_cardinality: Comet (Scan)
Stopped after 3 iterations, 2179 ms
Running case: agg_low_cardinality: Comet (Scan, Exec)
Stopped after 7 iterations, 2089 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg_low_cardinality 371 406 34 194.2 5.1 1.0X
agg_low_cardinality: Comet (Scan) 717 727 13 100.4 10.0 0.5X
agg_low_cardinality: Comet (Scan, Exec) 278 298 19 258.7 3.9 1.3X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_sum_decimals_no_grouping
Stopped after 2 iterations, 14633 ms
Running case: agg_sum_decimals_no_grouping: Comet (Scan)
Stopped after 2 iterations, 78280 ms
Running case: agg_sum_decimals_no_grouping: Comet (Scan, Exec)
Stopped after 2 iterations, 16892 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------------
agg_sum_decimals_no_grouping 7251 7317 93 39.7 25.2 1.0X
agg_sum_decimals_no_grouping: Comet (Scan) 38946 39140 274 7.4 135.2 0.2X
agg_sum_decimals_no_grouping: Comet (Scan, Exec) 8407 8446 55 34.3 29.2 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_sum_integers_no_grouping
Stopped after 2 iterations, 8131 ms
Running case: agg_sum_integers_no_grouping: Comet (Scan)
Stopped after 2 iterations, 8375 ms
Running case: agg_sum_integers_no_grouping: Comet (Scan, Exec)
Stopped after 2 iterations, 9179 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------------
agg_sum_integers_no_grouping 3914 4066 215 73.6 13.6 1.0X
agg_sum_integers_no_grouping: Comet (Scan) 4079 4188 154 70.6 14.2 1.0X
agg_sum_integers_no_grouping: Comet (Scan, Exec) 4557 4590 47 63.2 15.8 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: case_when_column_or_null
Stopped after 2 iterations, 3114 ms
Running case: case_when_column_or_null: Comet (Scan)
Stopped after 2 iterations, 5693 ms
Running case: case_when_column_or_null: Comet (Scan, Exec)
Stopped after 2 iterations, 3415 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
----------------------------------------------------------------------------------------------------------------------------
case_when_column_or_null 1440 1557 166 200.1 5.0 1.0X
case_when_column_or_null: Comet (Scan) 2832 2847 20 101.7 9.8 0.5X
case_when_column_or_null: Comet (Scan, Exec) 1691 1708 23 170.3 5.9 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: case_when_scalar
Stopped after 9 iterations, 2090 ms
Running case: case_when_scalar: Comet (Scan)
Stopped after 2 iterations, 2312 ms
Running case: case_when_scalar: Comet (Scan, Exec)
Stopped after 5 iterations, 2048 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
case_when_scalar 196 232 51 367.0 2.7 1.0X
case_when_scalar: Comet (Scan) 1133 1156 33 63.5 15.7 0.2X
case_when_scalar: Comet (Scan, Exec) 376 410 38 191.5 5.2 0.5X
Running benchmark: TPCDS Micro Benchmarks
Running case: filter_highly_selective
Stopped after 11 iterations, 2132 ms
Running case: filter_highly_selective: Comet (Scan)
Stopped after 3 iterations, 2233 ms
Running case: filter_highly_selective: Comet (Scan, Exec)
Stopped after 10 iterations, 2054 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
filter_highly_selective 154 194 85 466.1 2.1 1.0X
filter_highly_selective: Comet (Scan) 734 745 9 98.0 10.2 0.2X
filter_highly_selective: Comet (Scan, Exec) 156 205 108 462.3 2.2 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: filter_less_selective
Stopped after 7 iterations, 2091 ms
Running case: filter_less_selective: Comet (Scan)
Stopped after 3 iterations, 2093 ms
Running case: filter_less_selective: Comet (Scan, Exec)
Stopped after 10 iterations, 2272 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------
filter_less_selective 165 299 165 435.4 2.3 1.0X
filter_less_selective: Comet (Scan) 691 698 11 104.3 9.6 0.2X
filter_less_selective: Comet (Scan, Exec) 193 227 39 373.7 2.7 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: if_column_or_null
Stopped after 2 iterations, 2630 ms
Running case: if_column_or_null: Comet (Scan)
Stopped after 2 iterations, 3102 ms
Running case: if_column_or_null: Comet (Scan, Exec)
Stopped after 2 iterations, 5368 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
if_column_or_null 1315 1315 0 219.1 4.6 1.0X
if_column_or_null: Comet (Scan) 1518 1551 46 189.8 5.3 0.9X
if_column_or_null: Comet (Scan, Exec) 2606 2684 111 110.5 9.0 0.5X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_anti
Stopped after 2 iterations, 12404 ms
Running case: join_anti: Comet (Scan)
Stopped after 2 iterations, 12023 ms
Running case: join_anti: Comet (Scan, Exec)
[528.105s][warning][gc,alloc] Executor task launch worker for task 2.0 in stage 802.0 (TID 14773): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 11999 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_anti 6128 6202 104 11.7 85.1 1.0X
join_anti: Comet (Scan) 5774 6012 335 12.5 80.2 1.1X
join_anti: Comet (Scan, Exec) 5971 6000 41 12.1 82.9 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_condition
Stopped after 2 iterations, 4255 ms
Running case: join_condition: Comet (Scan)
Stopped after 2 iterations, 3245 ms
Running case: join_condition: Comet (Scan, Exec)
Stopped after 2 iterations, 3721 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_condition 2055 2128 103 264.4 3.8 1.0X
join_condition: Comet (Scan) 1534 1623 125 354.2 2.8 1.3X
join_condition: Comet (Scan, Exec) 1808 1861 75 300.6 3.3 1.1X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_exploding_output
Stopped after 2 iterations, 3425 ms
Running case: join_exploding_output: Comet (Scan)
Stopped after 2 iterations, 3004 ms
Running case: join_exploding_output: Comet (Scan, Exec)
Stopped after 2 iterations, 3495 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------
join_exploding_output 1603 1713 155 339.0 2.9 1.0X
join_exploding_output: Comet (Scan) 1413 1502 126 384.5 2.6 1.1X
join_exploding_output: Comet (Scan, Exec) 1722 1748 37 315.6 3.2 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_inner
Stopped after 4 iterations, 2194 ms
Running case: join_inner: Comet (Scan)
Stopped after 5 iterations, 2324 ms
Running case: join_inner: Comet (Scan, Exec)
Stopped after 3 iterations, 2110 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_inner 512 549 36 562.2 1.8 1.0X
join_inner: Comet (Scan) 462 465 3 623.9 1.6 1.1X
join_inner: Comet (Scan, Exec) 694 703 10 415.1 2.4 0.7X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_left_outer
Stopped after 2 iterations, 192095 ms
Running case: join_left_outer: Comet (Scan)
Stopped after 2 iterations, 192225 ms
Running case: join_left_outer: Comet (Scan, Exec)
Stopped after 2 iterations, 191166 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_left_outer 95996 96048 73 3.3 303.0 1.0X
join_left_outer: Comet (Scan) 95378 96113 1038 3.3 301.1 1.0X
join_left_outer: Comet (Scan, Exec) 95377 95583 292 3.3 301.1 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_semi
[1448.983s][warning][gc,alloc] Executor task launch worker for task 0.0 in stage 1153.0 (TID 20715): Retried waiting for GCLocker too often allocating 134217730 words
[1463.584s][warning][gc,alloc] Executor task launch worker for task 2.0 in stage 1159.0 (TID 20772): Retried waiting for GCLocker too often allocating 134217730 words
[1476.740s][warning][gc,alloc] Executor task launch worker for task 0.0 in stage 1165.0 (TID 20825): Retried waiting for GCLocker too often allocating 134217730 words
[1476.754s][warning][gc,alloc] Executor task launch worker for task 1.0 in stage 1165.0 (TID 20826): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 26371 ms
Running case: join_semi: Comet (Scan)
[1490.216s][warning][gc,alloc] Executor task launch worker for task 2.0 in stage 1171.0 (TID 20882): Retried waiting for GCLocker too often allocating 134217730 words
[1515.531s][warning][gc,alloc] Executor task launch worker for task 1.0 in stage 1183.0 (TID 20991): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 25521 ms
Running case: join_semi: Comet (Scan, Exec)
[1541.053s][warning][gc,alloc] Executor task launch worker for task 2.0 in stage 1195.0 (TID 21102): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 25372 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_semi 13164 13186 30 5.5 182.8 1.0X
join_semi: Comet (Scan) 12522 12761 338 5.8 173.9 1.1X
join_semi: Comet (Scan, Exec) 12276 12686 580 5.9 170.5 1.1X
After merging with latest main
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
add_many_decimals 11828 12330 709 24.3 41.1 1.0X
add_many_decimals: Comet (Scan) 15061 15181 171 19.1 52.3 0.8X
add_many_decimals: Comet (Scan, Exec) 12400 12885 686 23.2 43.1 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: add_many_integers
Stopped after 2 iterations, 7415 ms
Running case: add_many_integers: Comet (Scan)
Stopped after 2 iterations, 6969 ms
Running case: add_many_integers: Comet (Scan, Exec)
Stopped after 2 iterations, 7371 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
add_many_integers 3590 3708 166 80.2 12.5 1.0X
add_many_integers: Comet (Scan) 3467 3485 24 83.1 12.0 1.0X
add_many_integers: Comet (Scan, Exec) 3659 3686 38 78.7 12.7 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_high_cardinality
Stopped after 2 iterations, 3604 ms
Running case: agg_high_cardinality: Comet (Scan)
Stopped after 2 iterations, 5400 ms
Running case: agg_high_cardinality: Comet (Scan, Exec)
Stopped after 2 iterations, 2131 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg_high_cardinality 1793 1802 13 40.2 24.9 1.0X
agg_high_cardinality: Comet (Scan) 2683 2700 24 26.8 37.3 0.7X
agg_high_cardinality: Comet (Scan, Exec) 1056 1066 14 68.2 14.7 1.7X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_low_cardinality
Stopped after 6 iterations, 2078 ms
Running case: agg_low_cardinality: Comet (Scan)
Stopped after 3 iterations, 2226 ms
Running case: agg_low_cardinality: Comet (Scan, Exec)
Stopped after 8 iterations, 2066 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg_low_cardinality 333 346 10 216.5 4.6 1.0X
agg_low_cardinality: Comet (Scan) 730 742 14 98.7 10.1 0.5X
agg_low_cardinality: Comet (Scan, Exec) 252 258 7 286.1 3.5 1.3X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_sum_decimals_no_grouping
Stopped after 2 iterations, 15998 ms
Running case: agg_sum_decimals_no_grouping: Comet (Scan)
Stopped after 2 iterations, 87493 ms
Running case: agg_sum_decimals_no_grouping: Comet (Scan, Exec)
Stopped after 2 iterations, 16900 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------------
agg_sum_decimals_no_grouping 7828 7999 243 36.8 27.2 1.0X
agg_sum_decimals_no_grouping: Comet (Scan) 42539 43747 1708 6.8 147.7 0.2X
agg_sum_decimals_no_grouping: Comet (Scan, Exec) 8441 8450 12 34.1 29.3 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: agg_sum_integers_no_grouping
Stopped after 2 iterations, 7564 ms
Running case: agg_sum_integers_no_grouping: Comet (Scan)
Stopped after 2 iterations, 8300 ms
Running case: agg_sum_integers_no_grouping: Comet (Scan, Exec)
Stopped after 2 iterations, 9181 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------------
agg_sum_integers_no_grouping 3761 3782 30 76.6 13.1 1.0X
agg_sum_integers_no_grouping: Comet (Scan) 4125 4150 36 69.8 14.3 0.9X
agg_sum_integers_no_grouping: Comet (Scan, Exec) 4523 4591 96 63.7 15.7 0.8X
Running benchmark: TPCDS Micro Benchmarks
Running case: case_when_column_or_null
Stopped after 2 iterations, 2203 ms
Running case: case_when_column_or_null: Comet (Scan)
Stopped after 2 iterations, 5703 ms
Running case: case_when_column_or_null: Comet (Scan, Exec)
Stopped after 2 iterations, 3428 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
----------------------------------------------------------------------------------------------------------------------------
case_when_column_or_null 1099 1102 4 262.1 3.8 1.0X
case_when_column_or_null: Comet (Scan) 2827 2852 35 101.9 9.8 0.4X
case_when_column_or_null: Comet (Scan, Exec) 1681 1714 47 171.3 5.8 0.7X
Running benchmark: TPCDS Micro Benchmarks
Running case: case_when_scalar
Stopped after 10 iterations, 2146 ms
Running case: case_when_scalar: Comet (Scan)
Stopped after 2 iterations, 2433 ms
Running case: case_when_scalar: Comet (Scan, Exec)
Stopped after 5 iterations, 2019 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
case_when_scalar 193 215 32 373.6 2.7 1.0X
case_when_scalar: Comet (Scan) 1210 1217 9 59.5 16.8 0.2X
case_when_scalar: Comet (Scan, Exec) 378 404 44 190.4 5.3 0.5X
Running benchmark: TPCDS Micro Benchmarks
Running case: filter_highly_selective
Stopped after 13 iterations, 2221 ms
Running case: filter_highly_selective: Comet (Scan)
Stopped after 3 iterations, 2297 ms
Running case: filter_highly_selective: Comet (Scan, Exec)
Stopped after 11 iterations, 2084 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------
filter_highly_selective 145 171 32 496.7 2.0 1.0X
filter_highly_selective: Comet (Scan) 762 766 4 94.5 10.6 0.2X
filter_highly_selective: Comet (Scan, Exec) 171 190 18 420.4 2.4 0.8X
Running benchmark: TPCDS Micro Benchmarks
Running case: filter_less_selective
Stopped after 10 iterations, 2011 ms
Running case: filter_less_selective: Comet (Scan)
Stopped after 3 iterations, 2121 ms
Running case: filter_less_selective: Comet (Scan, Exec)
Stopped after 11 iterations, 2159 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------
filter_less_selective 170 201 42 422.4 2.4 1.0X
filter_less_selective: Comet (Scan) 694 707 12 103.8 9.6 0.2X
filter_less_selective: Comet (Scan, Exec) 176 196 20 409.9 2.4 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: if_column_or_null
Stopped after 2 iterations, 2282 ms
Running case: if_column_or_null: Comet (Scan)
Stopped after 2 iterations, 3123 ms
Running case: if_column_or_null: Comet (Scan, Exec)
Stopped after 2 iterations, 3524 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
if_column_or_null 1136 1141 7 253.6 3.9 1.0X
if_column_or_null: Comet (Scan) 1473 1562 126 195.6 5.1 0.8X
if_column_or_null: Comet (Scan, Exec) 1751 1762 17 164.6 6.1 0.6X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_anti
Stopped after 2 iterations, 12717 ms
Running case: join_anti: Comet (Scan)
Stopped after 2 iterations, 11808 ms
Running case: join_anti: Comet (Scan, Exec)
Stopped after 2 iterations, 11755 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_anti 6230 6359 183 11.6 86.5 1.0X
join_anti: Comet (Scan) 5789 5904 163 12.4 80.4 1.1X
join_anti: Comet (Scan, Exec) 5720 5878 222 12.6 79.4 1.1X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_condition
Stopped after 2 iterations, 3323 ms
Running case: join_condition: Comet (Scan)
Stopped after 2 iterations, 3201 ms
Running case: join_condition: Comet (Scan, Exec)
Stopped after 2 iterations, 3376 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_condition 1568 1662 133 346.6 2.9 1.0X
join_condition: Comet (Scan) 1479 1601 172 367.4 2.7 1.1X
join_condition: Comet (Scan, Exec) 1688 1688 1 321.9 3.1 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_exploding_output
Stopped after 2 iterations, 2796 ms
Running case: join_exploding_output: Comet (Scan)
Stopped after 2 iterations, 2732 ms
Running case: join_exploding_output: Comet (Scan, Exec)
Stopped after 2 iterations, 3220 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------
join_exploding_output 1395 1398 5 389.5 2.6 1.0X
join_exploding_output: Comet (Scan) 1352 1366 20 401.8 2.5 1.0X
join_exploding_output: Comet (Scan, Exec) 1608 1610 4 338.0 3.0 0.9X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_inner
Stopped after 4 iterations, 2045 ms
Running case: join_inner: Comet (Scan)
Stopped after 5 iterations, 2359 ms
Running case: join_inner: Comet (Scan, Exec)
Stopped after 3 iterations, 2083 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_inner 497 511 12 579.1 1.7 1.0X
join_inner: Comet (Scan) 460 472 8 626.9 1.6 1.1X
join_inner: Comet (Scan, Exec) 687 694 8 419.2 2.4 0.7X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_left_outer
Stopped after 2 iterations, 193508 ms
Running case: join_left_outer: Comet (Scan)
[1025.721s][warning][gc,alloc] Executor task launch worker for task 1.0 in stage 1178.0 (TID 21368): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 192356 ms
Running case: join_left_outer: Comet (Scan, Exec)
Stopped after 2 iterations, 191410 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_left_outer 96613 96754 200 3.3 305.0 1.0X
join_left_outer: Comet (Scan) 95176 96178 1418 3.3 300.4 1.0X
join_left_outer: Comet (Scan, Exec) 94835 95705 1231 3.3 299.3 1.0X
Running benchmark: TPCDS Micro Benchmarks
Running case: join_semi
[1485.285s][warning][gc,alloc] Executor task launch worker for task 3.0 in stage 1221.0 (TID 21917): Retried waiting for GCLocker too often allocating 134217730 words
[1510.652s][warning][gc,alloc] Executor task launch worker for task 3.0 in stage 1233.0 (TID 22027): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 25421 ms
Running case: join_semi: Comet (Scan)
[1520.632s][warning][gc,alloc] Executor task launch worker for task 3.0 in stage 1239.0 (TID 22082): Retried waiting for GCLocker too often allocating 134217730 words
[1520.632s][warning][gc,alloc] Executor task launch worker for task 1.0 in stage 1239.0 (TID 22080): Retried waiting for GCLocker too often allocating 134217730 words
[1547.750s][warning][gc,alloc] Executor task launch worker for task 3.0 in stage 1251.0 (TID 22192): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 26686 ms
Running case: join_semi: Comet (Scan, Exec)
[1573.897s][warning][gc,alloc] Executor task launch worker for task 2.0 in stage 1263.0 (TID 22301): Retried waiting for GCLocker too often allocating 134217730 words
Stopped after 2 iterations, 25646 ms
OpenJDK 64-Bit Server VM 17.0.11+9-LTS on Mac OS X 14.5
Apple M1 Max
TPCDS Micro Benchmarks: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
join_semi 11986 12711 1025 6.0 166.5 1.0X
join_semi: Comet (Scan) 12957 13343 546 5.6 180.0 0.9X
join_semi: Comet (Scan, Exec) 12694 12823 183 5.7 176.3 0.9X
We have made many fixes, and I think decimals are no longer an issue. Closing for now.
What is the problem the feature request solves?
SQL
Query time in seconds with Comet disabled:
With Comet enabled:
I do not see a slowdown if I add all the integer columns in the table, so this seems specific to decimal.
Part of the issue here may be the creation of all of the intermediate result vectors.
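To illustrate the point about intermediate result vectors, here is a minimal sketch in plain Python. It is not Comet or DataFusion code. It assumes a columnar evaluator that handles a + b + c + ... as a chain of binary additions, each materializing a full-length result column; the function names are hypothetical. A fused single pass produces the same values while allocating only the output column.

```python
from decimal import Decimal
from typing import List, Tuple

Column = List[Decimal]


def add_pairwise(columns: List[Column]) -> Tuple[Column, int]:
    """Evaluate c0 + c1 + ... as chained binary adds.

    Each binary add materializes a new full-length column, so n input
    columns produce n - 1 result vectors (all but the last are temporary).
    """
    allocated = 0
    acc = columns[0]
    for col in columns[1:]:
        acc = [x + y for x, y in zip(acc, col)]  # new vector per operator
        allocated += 1
    return acc, allocated


def add_fused(columns: List[Column]) -> Tuple[Column, int]:
    """Evaluate the same sum row by row, allocating only the output column."""
    result = [sum(row, Decimal(0)) for row in zip(*columns)]
    return result, 1


if __name__ == "__main__":
    cols = [
        [Decimal("1.10"), Decimal("2.20")],
        [Decimal("0.05"), Decimal("0.15")],
        [Decimal("3.00"), Decimal("4.00")],
    ]
    print(add_pairwise(cols))
    print(add_fused(cols))
```

With many decimal columns the pairwise form creates one temporary vector per addition. Whether that accounts for the slowdown reported here is not established in this thread.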