delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[PROTOCOL RFC] Support for collated strings in the schema and statistics #2894

Open olaky opened 4 months ago

olaky commented 4 months ago

Protocol Change Request

Description of the protocol change

Spark is introducing support for collated strings (see SPARK-46830), and we should support collated columns and fields in Delta tables as well. This will require changes to two parts of the Delta protocol: the schema and the statistics.

More details about the idea can be found in the Design Doc.
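
To illustrate why statistics are affected, here is a minimal sketch, not the proposed protocol format and not Delta's implementation. It assumes a simple case-insensitive ordering, with Python's `str.casefold` standing in for a real collation, and the helper names (`binary_stats`, `collated_stats`, `may_contain`) are hypothetical. It shows that min/max values computed under binary ordering can cause a file to be skipped even though it contains a value that matches under the collation.

```python
# Hypothetical illustration only: str.casefold stands in for a real collation,
# and none of these helpers correspond to Delta or Spark APIs.

def binary_stats(values):
    """Min/max under plain code-point (binary) ordering."""
    return min(values), max(values)

def collated_stats(values, key=str.casefold):
    """Min/max under the ordering defined by the collation key."""
    return min(values, key=key), max(values, key=key)

def may_contain(stats, probe, key=lambda s: s):
    """Data-skipping check: can a file with these min/max stats contain `probe`?"""
    lo, hi = stats
    return key(lo) <= key(probe) <= key(hi)

# Values stored in one file of a column that uses a case-insensitive collation.
values = ["Banana", "Cherry", "apple"]

bin_stats = binary_stats(values)   # uppercase letters sort before lowercase
ci_stats = collated_stats(values)  # ordering ignores case

# Query: WHERE col = 'APPLE' under the case-insensitive collation.
# The file holds 'apple', which is equal to 'APPLE' under that collation.
skipped_with_binary_stats = not may_contain(bin_stats, "APPLE")
kept_with_collated_stats = may_contain(ci_stats, "APPLE", key=str.casefold)
```

In this sketch the binary min/max pair is `("Banana", "apple")`, so a binary range check rejects `"APPLE"` and the file would be skipped incorrectly, while the collation-aware pair `("apple", "Cherry")` keeps it. This is why statistics for a collated column need to record which collation they were computed with.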

Willingness to contribute

The Delta Lake Community encourages protocol innovations. Would you or another member of your organization be willing to contribute this feature to the Delta Lake code base?

olaky commented 3 months ago

Protocol RFC PR is open: https://github.com/delta-io/delta/pull/3068