delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Feature Request][Spark] Add test coverage for clustering on generated columns #3248

Open zedtang opened 5 months ago

zedtang commented 5 months ago

Feature request

Which Delta project/connector is this regarding?

Spark

Overview

Clustered tables support clustering on any column that has stats collected on it, so clustering on generated columns should work. We should add test coverage for this scenario.

Further details

We can add tests for generated columns in ClusteredTableDDLSuite.scala.
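For illustration, the scenario described above could be exercised roughly as follows. This is only a sketch, not the test that belongs in the suite: the object name, table name, and column names are hypothetical, and it assumes a Spark and Delta Lake build that accepts both the `GENERATED ALWAYS AS` and `CLUSTER BY` SQL syntax (some releases may require clustering to be enabled through a preview configuration first).

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: assumes a Delta Lake release that supports both CLUSTER BY
// and the GENERATED ALWAYS AS SQL syntax. All names here are hypothetical.
object ClusterByGeneratedColumnSketch {
  // Hypothetical table name used only for this illustration.
  val table = "clustered_gen_col_sketch"

  // DDL for a table whose only clustering column is a generated column.
  def createTableDdl(name: String): String =
    s"""CREATE TABLE $name (
       |  id BIGINT,
       |  event_time TIMESTAMP,
       |  event_date DATE GENERATED ALWAYS AS (CAST(event_time AS DATE))
       |) USING delta
       |CLUSTER BY (event_date)""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("cluster-by-generated-column")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()
    try {
      spark.sql(s"DROP TABLE IF EXISTS $table")
      spark.sql(createTableDdl(table))
      // Insert only the base columns; event_date is expected to be computed.
      spark.sql(
        s"INSERT INTO $table (id, event_time) " +
          "VALUES (1, TIMESTAMP'2024-01-01 10:00:00')")
      val dates = spark
        .sql(s"SELECT CAST(event_date AS STRING) FROM $table")
        .collect()
        .map(_.getString(0))
      assert(dates.sameElements(Array("2024-01-01")),
        s"unexpected generated values: ${dates.mkString(",")}")
    } finally {
      spark.sql(s"DROP TABLE IF EXISTS $table")
      spark.stop()
    }
  }
}
```

A test in the suite itself would presumably reuse the suite's existing helpers and also verify the table's clustering metadata, rather than checking only the generated values as this sketch does.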

Willingness to contribute

The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

wudanzy commented 4 months ago

Hi @zedtang, could you please assign this task to me? I am new to Delta and would like to take this on to get warmed up.

zedtang commented 4 months ago

Yeah, sure @wudanzy, let me know if you have any questions!

jonathanc-n commented 1 month ago

@wudanzy Are you still working on this? If not, may I be assigned this instead?

wudanzy commented 1 month ago

Yes, it is still ongoing, sorry for the delay.