apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0
649 stars 119 forks source link

Infinite recursion when calling `canonicalized` on query plan for TPC-H q8 #576

Closed andygrove closed 2 weeks ago

andygrove commented 2 weeks ago

Describe the bug

Note: This was originally reported as an issue with xxhash64 in https://github.com/apache/datafusion-comet/issues/517 but xxhash64 is not really involved.

When running TPC-H q8 with xxhash64 enabled, the executor becomes unresponsive, with most cores at 100%. Running a kill -QUIT command against the process dumps the current JVM stacks and shows a 500+ line stack performing plan transformation that appears to be the issue.

See attached log.

executor-thread-dump.txt.gz

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

andygrove commented 2 weeks ago

The issue seems to be that doCanonicalized runs forever for this query.

andygrove commented 2 weeks ago

The issue goes away if I revert https://github.com/apache/datafusion-comet/pull/447 so perhaps this is related to some combination of bloom filters and reused exchanges?

viirya commented 2 weeks ago

Hmm, interesting. I will take a look.