Remove partition columns from Z Order optimization in `io.py`

R7L208 commented 2 years ago

In `io.py` we Z ORDER on `partitionCols + optimizationCols` when `useDeltaOpt` is True. Since partition pruning already works without Z-ordering on the partition columns, I believe it makes sense to remove them from the ZORDER BY clause and Z-order only on `optimizationCols`, if they are provided.

Is there another advantage to including partition columns within Z ORDER for time series other than data skipping?
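To make the proposal concrete, here is a minimal sketch of how the OPTIMIZE statement could be built so that partition columns never reach the ZORDER BY clause. The helper name `build_optimize_sql` and its parameters are hypothetical and are not taken from `io.py`; the sketch assumes the statement is assembled as a string and then passed to something like `spark.sql(...)`.

```python
from typing import List, Optional


def build_optimize_sql(
    table_name: str,
    partition_cols: List[str],
    optimization_cols: Optional[List[str]] = None,
) -> str:
    """Hypothetical helper (not the actual io.py code).

    Builds a Delta OPTIMIZE statement that Z-orders only on the
    optimization columns, skipping any that are partition columns.
    """
    partition_set = set(partition_cols)

    # Keep the caller's column order, drop partition columns and duplicates.
    zorder_cols = []
    for col in optimization_cols or []:
        if col not in partition_set and col not in zorder_cols:
            zorder_cols.append(col)

    if not zorder_cols:
        # Nothing left to Z-order on: fall back to plain file compaction.
        return f"OPTIMIZE {table_name}"

    return f"OPTIMIZE {table_name} ZORDER BY ({', '.join(zorder_cols)})"


if __name__ == "__main__":
    # Table and column names below are illustrative only.
    print(build_optimize_sql("db.trades", ["symbol"], ["symbol", "event_ts"]))
    print(build_optimize_sql("db.trades", ["symbol"]))
```

With this shape, a call that passes no optimization columns still compacts files but issues no ZORDER BY at all, which matches the "only if they are provided" behaviour suggested above.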

rportilla-databricks commented 2 years ago

We should be able to remove partition columns.

rportilla-databricks commented 1 year ago

This will be resolved when streaming AS OF joins are merged.

databrickslabs / tempo

Remove partition columns from Z Order optimization in `io.py` #244