delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Spark] Protocol version downgrade in the presence of table features #2841

Closed andreaschat-db closed 1 month ago

andreaschat-db commented 2 months ago

Which Delta project/connector is this regarding?

Description

This PR adds support for protocol version downgrade when table features exist in the protocol. The downgraded protocol versions should be the minimum required to support all remaining table features. For example, Protocol(3, 7, DeletionVectors, RowTracking) can be downgraded to Protocol(1, 7, RowTracking) after removing the DV feature.
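The rule above can be illustrated with a minimal Python sketch. This is a simplified model and not the Delta implementation. The `Protocol` class, the `drop_feature` helper, and the `READER_WRITER_FEATURES` set are hypothetical names introduced here. The sketch assumes that DeletionVectors is a reader-writer feature requiring reader version 3 and that RowTracking is a writer-only feature, which matches the example above. It leaves the writer version unchanged and does not model what happens when no table features remain.

```python
from dataclasses import dataclass

# Assumption for illustration: features in this set must be understood by
# readers, so they require reader version 3. Any other feature is treated
# as writer-only and places no requirement on the reader version.
READER_WRITER_FEATURES = frozenset({"DeletionVectors"})

READER_VERSION_WITH_FEATURES = 3     # reader version needed by reader-writer features
READER_VERSION_WITHOUT_FEATURES = 1  # lowest reader version


@dataclass(frozen=True)
class Protocol:
    """Hypothetical stand-in for a table protocol: versions plus feature names."""
    min_reader_version: int
    min_writer_version: int
    features: frozenset


def drop_feature(protocol: Protocol, feature: str) -> Protocol:
    """Remove one feature and lower the reader version if nothing needs it."""
    remaining = protocol.features - {feature}
    needs_reader_support = any(f in READER_WRITER_FEATURES for f in remaining)
    reader_version = (
        READER_VERSION_WITH_FEATURES
        if needs_reader_support
        else READER_VERSION_WITHOUT_FEATURES
    )
    # The writer version is kept as-is in this sketch.
    return Protocol(reader_version, protocol.min_writer_version, remaining)


before = Protocol(3, 7, frozenset({"DeletionVectors", "RowTracking"}))
after = drop_feature(before, "DeletionVectors")
print(after.min_reader_version, after.min_writer_version, sorted(after.features))
```

In this model, dropping RowTracking instead would leave the reader version at 3, because DeletionVectors would still be present.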

How was this patch tested?

Added new UTs in DeltaProtocolVersionSuite. Furthermore, existing UTs cover a significant part of the functionality. These are the following:

Does this PR introduce any user-facing changes?

Yes. Dropping a table feature from a table with multiple features may now result in a protocol version downgrade. For example, Protocol(3, 7, DeletionVectors, RowTracking) can now be downgraded to Protocol(1, 7, RowTracking).

felipepessoto commented 2 months ago

Is the description correct? Reader v1 doesn’t support DV

larsk-db commented 2 months ago

> Is the description correct? Reader v1 doesn’t support DV

I think the intent of "can be downgraded" was an implicit "after removing the DV feature".