apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0
1.19k stars 433 forks source link

[VL] Setting higher value of Broadcast Hash Join threshold is degrading the performance of Spark with Gluten #7548

Open adesh-rao opened 1 week ago

adesh-rao commented 1 week ago

Backend

VL (Velox)

Bug description

Setup: Running TPCDS benchmark on spark with gluten.

With increasing broadcast hash join threshold, the performance of TPCDS benchmark degrades (config: spark.sql.autoBroadcastJoinThreshold).

Config-Value TPCDS Total Time Taken
10MB 795sec
50MB 870sec
100MB 890sec

We have jobs with BHJ threshold set to 200MB in vanilla spark. Such jobs are not showing any gains with Spark+Gluten.

Spark version

Spark-3.5.x

Spark configurations

spark.sql.autoBroadcastJoinThreshold

System information

No response

Relevant logs

No response

FelixYBW commented 1 week ago

It's a known issue. The issue is Gluten/Velox broadcast raw table instead of hash table. So each task needs to build its own hash table from raw data. The solution is to build hashtable from driver or one task then broadcast. But the problem now is Velox doesn't support it yet.

we plan to fix it in next year.

weiting-chen commented 1 week ago

Add the impact TPC-DS queries: q24a, q24b, q11, q65, q69, q85 as a record.