I am testing this PR out now with benchmarks.
I am testing with TPC-H sf=100. I usually test with one executor and 8 cores, but with this PR I can only run with a single core. I tried 2 cores with this config:
--conf spark.executor.instances=1 \
--conf spark.executor.memory=16G \
--conf spark.executor.cores=2 \
--conf spark.cores.max=2 \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=20g \
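For reference, here is a minimal sketch of the same setup expressed programmatically, in case anyone wants to reproduce it from a SparkSession-based harness (the app name and object are placeholders; the config keys are the standard Spark ones from the flags above, and a local Comet build is assumed to be on the classpath):

```scala
import org.apache.spark.sql.SparkSession

object CometTpchRepro {
  def main(args: Array[String]): Unit = {
    // Mirror the spark-submit flags used in the failing run
    val spark = SparkSession.builder()
      .appName("comet-tpch-repro")
      .config("spark.executor.instances", "1")
      .config("spark.executor.memory", "16G")
      .config("spark.executor.cores", "2")
      .config("spark.cores.max", "2")
      .config("spark.memory.offHeap.enabled", "true")
      .config("spark.memory.offHeap.size", "20g")
      .getOrCreate()

    // ... run TPC-H queries here ...
    spark.stop()
  }
}
```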
The job fails with:
org.apache.spark.SparkException:
Job aborted due to stage failure: Task 0 in stage 251.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 251.0 (TID 2171) (10.0.0.118 executor 0):
org.apache.comet.CometNativeException:
External error:
Internal error: Partition is still not able to allocate enough memory for the array builders after spilling..
I will try it with sf=100.
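The error above is consistent with a reserve/spill/retry pattern: each partition's array builders try to reserve memory from a shared pool, spill when the first reservation fails, and give up if the retry still cannot reserve. If the off-heap pool is divided evenly across concurrent tasks, 20g over 2 cores leaves roughly 10g per task, so more cores means a smaller per-task share and a retry that is more likely to fail. Here is an illustrative sketch of that shape; the `MemoryPool` trait and all names are hypothetical stand-ins, not Comet's actual internals:

```scala
// Illustrative only: the general reserve/spill/retry logic implied by the
// error message. `MemoryPool` is a hypothetical stand-in for the shared
// off-heap pool that all partitions draw from.
trait MemoryPool {
  def tryReserve(bytes: Long): Boolean
}

def reserveForBuilders(pool: MemoryPool, bytes: Long)(spill: () => Unit): Unit = {
  if (!pool.tryReserve(bytes)) {
    spill() // flush in-flight builders to disk to release their reservations
    if (!pool.tryReserve(bytes)) {
      // The retry still failed: the per-task share of the pool is smaller
      // than one batch of builders, so spilling cannot help.
      throw new IllegalStateException(
        "Partition is still not able to allocate enough memory " +
          "for the array builders after spilling.")
    }
  }
}
```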
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 34.43%. Comparing base (591f45a) to head (fd78a74). Report is 3 commits behind head on main.
Thanks @andygrove
Which issue does this PR close?
Closes #1019.
Rationale for this change
This restores the patch merged in #988, which caused issue #1019. This PR includes a fix for that issue.
What changes are included in this PR?
How are these changes tested?
Manually ran the TPC-H benchmark locally.