Here is a teaser for the performance improvement. This is for TPC-H q11 (SF=100) with broadcast joins disabled (I am looking into a regression with those). I ran the query five times each with the rule enabled and disabled; times are in seconds.
With the rule disabled:
79.87537693977356,
77.76734256744385,
75.35734295845032,
75.44863200187683,
72.88174152374268

With the rule enabled:
39.33945274353027,
36.159271240234375,
35.83299708366394,
35.638232707977295,
35.67777371406555
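That averages out to roughly a 2.1x speedup. A quick arithmetic check (plain Scala, values rounded to three decimals, and assuming the first group of timings is the rule-disabled runs):

```scala
// Sanity-check the speedup from the raw timings above (seconds).
val disabled = Seq(79.875, 77.767, 75.357, 75.449, 72.882)
val enabled  = Seq(39.339, 36.159, 35.833, 35.638, 35.678)
def mean(xs: Seq[Double]): Double = xs.sum / xs.size
println(f"speedup: ${mean(disabled) / mean(enabled)}%.2fx") // ~2.09x
```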
There is a small danger in enabling this without a good estimate of the size of the build side. ShuffledHashJoin has limits on how much data it can process efficiently: if the build-side hash table cannot spill, a large enough build side will cause OOMs, and if it can spill, SMJ can frequently lead to better performance. We might even see this when we scale the benchmark from SF=1 to, say, SF=10. Is there a way for us to get cardinality and row size for the build side somehow? Still worth adding this option, though.
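One possible way to get those estimates: Catalyst's logical plan statistics already expose `sizeInBytes` (always populated, conservatively estimated when no table statistics exist) and `rowCount` (populated only when CBO statistics have been collected). A minimal sketch, assuming we can reach the build side's logical plan; `maxBuildSideBytes` is a made-up threshold, not an existing config:

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Hypothetical helper, not code from this PR: treat the build side as
// hash-joinable only when Catalyst's size estimate fits under a threshold.
def buildSideFitsInMemory(buildSide: LogicalPlan, maxBuildSideBytes: Long): Boolean =
  buildSide.stats.sizeInBytes <= BigInt(maxBuildSideBytes)

// Average row size, available only when cardinality (rowCount) is known,
// i.e. when ANALYZE TABLE / CBO statistics have been collected.
def estimatedAvgRowSize(buildSide: LogicalPlan): Option[BigInt] =
  buildSide.stats.rowCount.filter(_ > 0).map(buildSide.stats.sizeInBytes / _)
```

The caveat is that without collected statistics, `sizeInBytes` falls back to very pessimistic estimates, so this check could rule out build sides that would actually fit.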
> if there is spilling, then SMJ can frequently lead to better performance

I have seen this happen with Spark with some TPC-DS queries at SF=10.
Current benchmarks (charts omitted): speedup of using HashJoin instead of SortMergeJoin.
Attention: Patch coverage is 19.44444% with 29 lines in your changes missing coverage. Please review. Project coverage is 34.27%. Comparing base (e3ac6cf) to head (1073517). Report is 2 commits behind head on main.
I will add documentation to this PR today, explaining pros/cons of this feature in our tuning guide.
@viirya @parthchandra This is now ready for review. The new option is disabled by default and I added a section to the tuning guide explaining why users may want to enable this new option.
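For anyone who wants to try it, enabling the option from a Spark session looks something like the snippet below; the config key is my shorthand from this discussion rather than something confirmed in this thread, so check the tuning guide for the authoritative name:

```scala
// In spark-shell, where `spark` is the active SparkSession. The config key
// below is assumed from this discussion, not confirmed here; the option is
// disabled by default and currently experimental.
spark.conf.set("spark.comet.exec.replaceSortMergeJoin", "true")
```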
I have run into a deadlock when running TPC-DS benchmarks with this feature, so I am moving to draft while I investigate. It is possibly related to the memory pool issues that we are also working on in other PRs.
After upmerging, I no longer see the deadlock, but instead get an error if I have insufficient memory allocated, which is an improvement.
```
org.apache.comet.CometNativeException (External error: Internal error:
Partition is still not able to allocate enough memory for the array builders after spilling..
```
However, when I increase memory, I see queries fail due to https://github.com/apache/datafusion-comet/issues/1019.
I have now marked the feature as experimental and explained in the tuning guide that there is no spill to disk, so this could result in OOM.
Fresh benchmarks after upmerging (charts omitted): TPC-DS excluding q97, which hits an OOM with ShuffledHashJoin.
Which issue does this PR close?
Closes https://github.com/apache/datafusion-comet/issues/1006
Rationale for this change
Improved performance
What changes are included in this PR?
A new experimental configuration option, disabled by default, that allows Comet to replace SortMergeJoin with ShuffledHashJoin, plus a new section in the tuning guide covering the tradeoffs.
How are these changes tested?
I manually ran TPC-H and saw improved performance. I will post benchmarks once I have run more tests.