We use the cluster with 1 master and 3 workers. Each worker contains 128 vcores and 512GB DRAM. Vanilla spark can successfully run TPC-DS F10TB while gazelle will fail during running q14a,b. If we increase spark.sql.shuffle.partitions to avoid sort spill, the executors will be killed thanks to lacking offheap memory. If we decrease spark.sql.execution.sort.spillThreshold to activate sort spill, we will meet hanging issue.
We use the cluster with 1 master and 3 workers. Each worker contains 128 vcores and 512GB DRAM. Vanilla spark can successfully run TPC-DS F10TB while gazelle will fail during running q14a,b. If we increase spark.sql.shuffle.partitions to avoid sort spill, the executors will be killed thanks to lacking offheap memory. If we decrease spark.sql.execution.sort.spillThreshold to activate sort spill, we will meet hanging issue.