Open taiyang-li opened 2 days ago
Run Gluten Clickhouse CI on x86
Performance comparison:
CREATE TEMPORARY VIEW test_table
USING org.apache.spark.sql.parquet
OPTIONS (
  path "/data1/liyang/cppproject/spark/spark-3.3.2-bin-hadoop3/bigo_live_user_event"
);

select
  case when event.log_extra['tab_type'] in (5) then '1' else '0' end as entrance
from test_table
lateral view explode(events) as event
where event.log_extra['action'] in (13);
set spark.gluten.sql.extendedGeneratorNestedColumnAliasing = true;
No rows selected (0.546 seconds)
set spark.gluten.sql.extendedGeneratorNestedColumnAliasing = false;
No rows selected (9.326 seconds)
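For a quick read of the two timings above, the ratio can be worked out as below. This is only arithmetic on the two reported wall-clock numbers from a single run each, so treat it as a rough indication rather than a benchmark result; the variable names are illustrative.

```python
# Wall-clock times reported by the SQL client above, in seconds.
seconds_enabled = 0.546   # extendedGeneratorNestedColumnAliasing = true
seconds_disabled = 9.326  # extendedGeneratorNestedColumnAliasing = false

# Ratio of the slower run to the faster run.
speedup = seconds_disabled / seconds_enabled
print(f"speedup: {speedup:.1f}x")
```

That is roughly a 17x difference on this test table.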
Another performance comparison, this time on a production query. The improvement is less visible here because the pruned columns are small. Note the output bytes of the scan operator (8.6 TB vs 8.7 TB).
Query: d_12768_1.sql
Run query with set spark.gluten.sql.extendedGeneratorNestedColumnAliasing = true;
SubstraitFileSourceStep (read local files)
Header: uid Nullable(Int64)
        country Nullable(String)
        events Nullable(Array(Nullable(Tuple(event_id Nullable(String), log_extra Nullable(Map(String, Nullable(String))), event_info Nullable(Map(String, Nullable(String)))))))
        day Nullable(String)
Run query with set spark.gluten.sql.extendedGeneratorNestedColumnAliasing = false;
SubstraitFileSourceStep (read local files)
Header: uid Nullable(Int64)
        country Nullable(String)
        events Nullable(Array(Nullable(Tuple(time Nullable(Int64), lng Nullable(Int64), lat Nullable(Int64), net Nullable(String), event_id Nullable(String), log_extra Nullable(Map(String, Nullable(String))), event_info Nullable(Map(String, Nullable(String)))))))
        day Nullable(String)
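The difference between the two headers is which fields of the `events` element struct reach the scan. The sketch below reproduces that difference in plain Python. It is not Gluten's implementation: `prune_struct` is a hypothetical helper, and the set of accessed fields is an assumption inferred from the header printed with the flag enabled, since the text of d_12768_1.sql is not shown here.

```python
# Field names of the `events` element struct, copied from the scan header
# printed with the flag disabled.
FULL_FIELDS = ["time", "lng", "lat", "net", "event_id", "log_extra", "event_info"]


def prune_struct(fields, accessed):
    """Keep only the struct fields the query references, preserving order.

    Hypothetical helper for illustration only.
    """
    return [name for name in fields if name in accessed]


# Assumption: the query touches only these three fields (inferred from the
# header printed with the flag enabled).
accessed = {"event_id", "log_extra", "event_info"}

kept = prune_struct(FULL_FIELDS, accessed)
dropped = [name for name in FULL_FIELDS if name not in accessed]

print("kept:   ", kept)
print("dropped:", dropped)
```

Under that assumption the dropped fields are three `Int64` columns and one `String` column, which would be consistent with the small change in scan output bytes noted above.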
What changes were proposed in this pull request?
(Fixes: #3839)
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)