Is your feature request related to a problem? Please describe.
If a plan contains nested unions/concats, we can flatten them into a single operation.
Example:
df.concat(df.concat(df.concat(df))).explain(True)
The resulting plan ends up looking like this:
flowchart TD
A[concat] --> B[3]
A --> C[concat]
C --> D[2]
C --> E[1]
But a more efficient representation would be:
flowchart TD
A[concat] --> B[3]
A --> D[2]
A --> E[1]
Describe the solution you'd like
Inefficient queries such as the one above are automatically optimized by flattening nested concats into a single concat, as described above.
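To make the requested rewrite concrete, here is a minimal sketch of the flattening rule in Python. `Source`, `Concat`, and `flatten_concats` are hypothetical names invented for this illustration; they are not the actual plan classes or optimizer API, and a real rule would likely also need to check that schemas match.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Source:
    """Hypothetical leaf node standing in for a scan or in-memory table."""
    name: str


@dataclass
class Concat:
    """Hypothetical n-ary concat node (an assumption, not a real plan class)."""
    inputs: List["Plan"]


Plan = Union[Source, Concat]


def flatten_concats(plan: Plan) -> Plan:
    """Collapse nested Concat nodes into one n-ary Concat, keeping input order."""
    if isinstance(plan, Source):
        return plan
    flat: List[Plan] = []
    for child in plan.inputs:
        child = flatten_concats(child)  # flatten bottom-up
        if isinstance(child, Concat):
            flat.extend(child.inputs)   # hoist the grandchildren into this node
        else:
            flat.append(child)
    return Concat(flat)


# Mirrors the diagrams above: concat(3, concat(2, 1)) becomes concat(3, 2, 1).
nested = Concat([Source("3"), Concat([Source("2"), Source("1")])])
flattened = flatten_concats(nested)
```

Inputs are hoisted in left-to-right order, so the row order of the concatenated result is unchanged.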
Describe alternatives you've considered
None
Additional context
polars - https://github.com/pola-rs/polars/issues/7855
datafusion - https://github.com/apache/datafusion/issues/7481