Which Delta project/connector is this regarding?
Kernel
Description
This PR builds on the base changes, which are not yet merged. For changes specific to this PR, please refer to the last commit only.
This PR implements the first part of row tracking support in Delta Kernel, based on the Delta Protocol. Specifically, it includes the following changes:
Add a new baseRowId field to the AddFile action.
Implement functionality to assign baseRowId to AddFile actions prior to committing them.
Maintain the rowIdHighWaterMark of the delta.rowTracking metadata domain during base row ID assignment. The rowIdHighWaterMark is the highest fresh row ID assigned for the table.
How was this patch tested?
Added tests in RowTrackingSuite.scala.
Does this PR introduce any user-facing changes?
No.
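For reviewers who want a concrete picture of the base row ID assignment mentioned in the Description, here is a minimal sketch. It is not the code in this PR. The class RowIdAssignmentSketch, its nested AddFile and Result types, and the method assignBaseRowIds are hypothetical names. The sketch also assumes that a table with no assigned row IDs starts from a high watermark of -1, that each file receives the next ID after the current high watermark, and that files which already carry a baseRowId are left untouched.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these types are hypothetical, not Delta Kernel classes. */
final class RowIdAssignmentSketch {

    /** Assumed sentinel for a table that has no row IDs assigned yet. */
    static final long MISSING_HIGH_WATER_MARK = -1L;

    /** Minimal stand-in for an AddFile action. */
    static final class AddFile {
        final String path;
        final long numRecords;
        final Long baseRowId; // null means "not yet assigned"

        AddFile(String path, long numRecords, Long baseRowId) {
            this.path = path;
            this.numRecords = numRecords;
            this.baseRowId = baseRowId;
        }
    }

    /** Files with base row IDs filled in, plus the updated high watermark. */
    static final class Result {
        final List<AddFile> files;
        final long rowIdHighWaterMark;

        Result(List<AddFile> files, long rowIdHighWaterMark) {
            this.files = files;
            this.rowIdHighWaterMark = rowIdHighWaterMark;
        }
    }

    /** Assigns a fresh baseRowId to every file that lacks one, in list order. */
    static Result assignBaseRowIds(List<AddFile> files, long currentHighWaterMark) {
        long highWaterMark = currentHighWaterMark;
        List<AddFile> assigned = new ArrayList<>();
        for (AddFile file : files) {
            if (file.baseRowId != null) {
                assigned.add(file); // assumption: keep an existing baseRowId as-is
                continue;
            }
            // The file's rows take IDs baseRowId .. baseRowId + numRecords - 1.
            long baseRowId = highWaterMark + 1;
            assigned.add(new AddFile(file.path, file.numRecords, baseRowId));
            highWaterMark += file.numRecords;
        }
        return new Result(assigned, highWaterMark);
    }

    public static void main(String[] args) {
        List<AddFile> files = new ArrayList<>();
        files.add(new AddFile("part-0.parquet", 10, null));
        files.add(new AddFile("part-1.parquet", 5, null));
        Result result = assignBaseRowIds(files, MISSING_HIGH_WATER_MARK);
        for (AddFile f : result.files) {
            System.out.println(f.path + " -> baseRowId " + f.baseRowId);
        }
        System.out.println("rowIdHighWaterMark = " + result.rowIdHighWaterMark);
    }
}
```

In this sketch, two new files with 10 and 5 records on a table without prior row IDs get base row IDs 0 and 10, and the high watermark ends at 14. Returning the updated watermark together with the files keeps the assignment and the delta.rowTracking update in one step, so a caller cannot commit one without the other.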