An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Which Delta project/connector is this regarding?
Description
During commit, we validate that `AddFile` actions do not contain Deletion Vectors (DVs) when DVs are not enabled for the table (via a table property). This restriction is incorrect for operations that only update the metadata of existing files, e.g. `ComputeStatistics` or `RowTrackingBackfill`. The current code skips the check for the `ComputeStatistics` operation but not for other operations that perform in-place metadata updates. This PR adds a new `isInPlaceFileMetadataUpdate` method to Delta operations so that such operations can be easily distinguished.
The `getAssertDeletionVectorWellFormedFunc` function is also slightly refactored to be more readable.
How was this patch tested?
Existing tests provide coverage.
Does this PR introduce any user-facing changes?
No
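For reviewers, the commit-time rule from the Description can be sketched roughly as follows. This is a minimal illustration, not the code in this patch. The types `AddFile`, `DeletionVectorDescriptor`, and `Operation` are simplified stand-ins for Delta's classes. The `Write` operation, the `DeletionVectorCheck` object, the `assertDeletionVectorsAllowed` helper, and the plain `Boolean` return type of `isInPlaceFileMetadataUpdate` are assumptions made for the example.

```scala
// Simplified, hypothetical stand-ins for Delta's action types.
final case class DeletionVectorDescriptor(cardinality: Long)
final case class AddFile(
    path: String,
    deletionVector: Option[DeletionVectorDescriptor] = None)

sealed trait Operation {
  // True when the operation only rewrites metadata of files that are
  // already part of the table. The Boolean type is an assumption.
  def isInPlaceFileMetadataUpdate: Boolean = false
}
case object Write extends Operation // hypothetical "regular" operation
case object ComputeStatistics extends Operation {
  override def isInPlaceFileMetadataUpdate: Boolean = true
}
case object RowTrackingBackfill extends Operation {
  override def isInPlaceFileMetadataUpdate: Boolean = true
}

object DeletionVectorCheck {
  // Hypothetical helper: rejects AddFile actions that carry a DV when DVs
  // are disabled, unless the operation is an in-place metadata update.
  def assertDeletionVectorsAllowed(
      dvsEnabled: Boolean,
      op: Operation,
      actions: Seq[AddFile]): Unit = {
    val mustNotCarryDVs = !dvsEnabled && !op.isInPlaceFileMetadataUpdate
    if (mustNotCarryDVs) {
      actions.find(_.deletionVector.isDefined).foreach { action =>
        throw new IllegalStateException(
          s"AddFile ${action.path} has a Deletion Vector, " +
            "but DVs are not enabled for this table")
      }
    }
  }
}
```

With this shape, an operation opts out of the restriction by overriding a single flag, so the check no longer needs to special-case `ComputeStatistics` by name.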