apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0
2.11k stars 915 forks

Support save WithCTE for insertRepartitionBeforeWrite #6783

Open ic4y opened 3 weeks ago

ic4y commented 3 weeks ago

:mag: Description

Issue References 🔗

First, I'd like to thank @wForget for the help with this issue.

When using the "save to HDFS" feature, queries ending with an ORDER BY sometimes lose their sort order in the results. Upon investigating the code, I discovered that when using WITH statements and saving SQL results with toDF.write.save, a WithCTE node is generated after the Sort node. This causes the canInsertRepartitionByExpression check to fail, leading to an incorrect Repartition node insertion after the Sort node, which ultimately disrupts the sort order.
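The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not the Kyuubi implementation: the `Node` class, the `PASS_THROUGH` and `BLOCKING` sets, and the `can_insert_repartition` function are all hypothetical stand-ins, and it assumes the real check looks through a fixed set of wrapper nodes and treats any node it does not recognize as safe for repartitioning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy logical-plan node: a kind name plus at most one child (hypothetical)."""
    name: str
    child: Optional["Node"] = None


# Assumption: wrapper nodes the check looks through to inspect their child.
PASS_THROUGH = frozenset({"Project", "SubqueryAlias"})
# Assumption: nodes whose presence means a repartition must not be added on top.
BLOCKING = frozenset({"Sort", "Limit", "Repartition"})


def can_insert_repartition(plan: Node, pass_through=PASS_THROUGH) -> bool:
    """Return True when the toy check would allow adding a repartition above `plan`."""
    if plan.name in BLOCKING:
        return False
    if plan.name in pass_through and plan.child is not None:
        return can_insert_repartition(plan.child, pass_through)
    # Any unrecognized node is treated as safe; this is where WithCTE slips through.
    return True


# A query ending in ORDER BY, and the same query wrapped in a WithCTE node.
sorted_query = Node("Sort", Node("Project", Node("Scan")))
wrapped_query = Node("WithCTE", sorted_query)

plain_result = can_insert_repartition(sorted_query)
wrapped_result = can_insert_repartition(wrapped_query)
fixed_result = can_insert_repartition(wrapped_query, PASS_THROUGH | {"WithCTE"})

print(plain_result, wrapped_result, fixed_result)
```

In this model the bare sorted query is rejected, the `WithCTE`-wrapped query is wrongly accepted because the wrapper hides the `Sort`, and adding `WithCTE` to the look-through set restores the rejection.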

However, this issue does not occur when using INSERT INTO TABLE with WithCTE nodes.

The provided unit test can reproduce this issue, but after using toDF.write.save, I am unable to access the complete execution plan to assert whether a Repartition node is present. Therefore, the current test is ineffective.

I hope someone can help figure out how to write this unit test.

Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Types of changes :bookmark:

Test Plan 🧪

Behavior Without This Pull Request :coffin:

Behavior With This Pull Request :tada:

Related Unit Tests


Checklist 📝

Be nice. Be informative.

codecov-commenter commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 0.00%. Comparing base (d3520dd) to head (00fc1fa). Report is 2 commits behind head on master.

Additional details and impacted files

```diff
@@          Coverage Diff           @@
##           master   #6783   +/-   ##
======================================
  Coverage    0.00%   0.00%
======================================
  Files         687     687
  Lines       42439   42441    +2
  Branches     5793    5793
======================================
- Misses      42439   42441    +2
```


wForget commented 2 weeks ago

There is an issue with the write-check logic in the base unit test. Please wait for #6793.