redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

iceberg: table update applier #23488

Closed andrwng closed 1 month ago

andrwng commented 1 month ago

Adds a method that applies a given update to a table_metadata. It performs some basic, mechanical validations (e.g. to avoid adding a duplicate schema).

These updates will be generated as a result of various actions (e.g. appending to the table) and sent to the catalog. Applying them to an in-memory table_metadata lets us validate and build up multi-action transactions whose updates can be grouped and sent to the catalog together.
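As a rough illustration of the idea (not the code in this PR, and not pyiceberg's API), the sketch below applies updates to an immutable in-memory metadata object, rejects a duplicate schema, and stages validated updates in a transaction. All names here (`TableMetadata`, `AddSchema`, `SetCurrentSchema`, `apply_update`, `Transaction`, `UpdateError`) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, replace
from functools import singledispatch
from typing import List, Optional, Tuple


# Hypothetical, heavily simplified stand-ins for the real metadata types.
@dataclass(frozen=True)
class Schema:
    schema_id: int
    fields: Tuple[str, ...]


@dataclass(frozen=True)
class TableMetadata:
    schemas: Tuple[Schema, ...] = ()
    current_schema_id: Optional[int] = None


# Two hypothetical update kinds.
@dataclass(frozen=True)
class AddSchema:
    schema: Schema


@dataclass(frozen=True)
class SetCurrentSchema:
    schema_id: int


class UpdateError(ValueError):
    """Raised when an update fails a mechanical validation."""


@singledispatch
def apply_update(update, metadata: TableMetadata) -> TableMetadata:
    # Fallback for update types with no registered handler.
    raise UpdateError(f"unsupported update: {update!r}")


@apply_update.register
def _(update: AddSchema, metadata: TableMetadata) -> TableMetadata:
    # Mechanical validation: refuse to add a schema id that already exists.
    if any(s.schema_id == update.schema.schema_id for s in metadata.schemas):
        raise UpdateError(f"duplicate schema id {update.schema.schema_id}")
    return replace(metadata, schemas=metadata.schemas + (update.schema,))


@apply_update.register
def _(update: SetCurrentSchema, metadata: TableMetadata) -> TableMetadata:
    # Mechanical validation: the referenced schema must exist.
    if all(s.schema_id != update.schema_id for s in metadata.schemas):
        raise UpdateError(f"unknown schema id {update.schema_id}")
    return replace(metadata, current_schema_id=update.schema_id)


class Transaction:
    """Stages updates against an in-memory copy before anything is sent."""

    def __init__(self, base: TableMetadata) -> None:
        self.staged = base
        self.updates: List[object] = []

    def apply(self, update) -> "Transaction":
        # apply_update raises before the update is recorded, so a failed
        # update leaves both the staged metadata and the list untouched.
        self.staged = apply_update(update, self.staged)
        self.updates.append(update)
        return self


txn = Transaction(TableMetadata())
txn.apply(AddSchema(Schema(0, ("id", "ts")))).apply(SetCurrentSchema(0))
```

In this sketch, `txn.updates` holds the grouped updates that a caller would send to the catalog in one request, and `txn.staged` is the metadata those updates are expected to produce.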

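Note for reviewers: similar apply operations can be found in the Python library at https://github.com/apache/iceberg-python/blob/e5a58b34dd830c6ffea11649613b693f70f7cbb4/pyiceberg/table/update/__init__.py#L87; see all implementations of _apply_table_update.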


andrwng commented 1 month ago

Similar apply operations can be found here: https://github.com/apache/iceberg-python/blob/e5a58b34dd830c6ffea11649613b693f70f7cbb4/pyiceberg/table/update/__init__.py#L87 (see all implementations of _apply_table_update).