Running q14a.sql and q14b.sql of TPC-DS SF10TB fails due to a lack of off-heap memory.

oap-project / gazelle_plugin

Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.

Apache License 2.0


Running q14a.sql and q14b.sql of TPC-DS SF10TB fails due to a lack of off-heap memory. #750

Open haojinIntel opened 2 years ago

haojinIntel commented 2 years ago

We use a cluster with 1 master and 3 workers. Each worker has 128 vcores and 512GB of DRAM. Vanilla Spark runs TPC-DS SF10TB successfully, while Gazelle fails on q14a and q14b. If we increase spark.sql.shuffle.partitions to avoid sort spill, the executors are killed for lack of off-heap memory. If we decrease spark.sql.execution.sort.spillThreshold to trigger sort spill, the queries hang.
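For anyone trying to reproduce this, the two experiments described above could be written down as explicit setting groups. The following is only a sketch: the helper `build_conf_flags` is hypothetical, and every numeric value is an illustrative assumption, not the configuration used in this report. The key `spark.sql.execution.sort.spillThreshold` is taken from the description above; `spark.memory.offHeap.enabled` and `spark.memory.offHeap.size` are the standard Spark off-heap settings.

```python
# Sketch only: all values below are assumed for illustration and are
# NOT the settings used in the original report.

def build_conf_flags(conf):
    """Render a dict of Spark settings as spark-submit --conf arguments."""
    flags = []
    for key, value in sorted(conf.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags

# Experiment A: raise the shuffle partition count so each sort task
# handles less data and is less likely to need a spill.
more_partitions = {
    "spark.sql.shuffle.partitions": 3072,        # assumed value
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "300g",         # assumed value
}

# Experiment B: lower the sort spill threshold so that sort spill
# is triggered earlier.
lower_threshold = {
    "spark.sql.shuffle.partitions": 768,                    # assumed value
    "spark.sql.execution.sort.spillThreshold": 268435456,   # assumed value
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "300g",                    # assumed value
}

if __name__ == "__main__":
    print(" ".join(build_conf_flags(more_partitions)))
    print(" ".join(build_conf_flags(lower_threshold)))
```

Attaching the exact flags used for each run, in this form, would make it easier to tell whether the executor kills and the hang come from the same memory pressure.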

haojinIntel commented 2 years ago

@zhouyuan Please track the issue. Thanks.