Open haizhou-zhao opened 2 weeks ago
Down to code level detail (root cause of why REST differs from Hadoop/Hive), these two methods, seemingly doing similar things when resetting main branch, but one would clear snapshotLog history, one would not:
TableMetadata$Builder::removeRef
will clear snapshot logs on main branch reset: https://github.com/apache/iceberg/blob/f71c7df/core/src/main/java/org/apache/iceberg/TableMetadata.java#L1256
TableMetadata$Builder::resetMainBranch
will not clear snapshot logs on main branch reset:
https://github.com/apache/iceberg/blob/f71c7df/core/src/main/java/org/apache/iceberg/TableMetadata.java#L1270
Hadoop/Hive catalog only executes TableMetadata$Builder::resetMainBranch
on replace table calls, while REST catalog (if server depends on TableMetadata class provided by Open Source release, like reference implementation of RESTAdapter over JdbcCatalog) will run TableMetadata$Builder::removeRef
in addition on replace table calls.
Query engine
Spark
Question
Background
This is another Hive/Hadoop and REST Catalog behavior discrepancies discovered while enabling integration test on REST catalog. The assumption here is all the existing Spark integration tests should pass as-is on REST Catalog just like how they would pass using Hive & Hadoop Catalog, because conceptually Spark expects the same behavior out of Iceberg, no matter what catalog types are used. Reference issue: https://github.com/apache/iceberg/issues/11079
What
https://github.com/apache/iceberg/blob/main/spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java#L813
This integration test passes on Hive & Hadoop Catalog, but does not pass on REST Catalog reference implementation (RESTCatalogAdapter over JdbcCatalog).
Why
Root Cause is REST Catalog reference implementation, comparing to Hive & Hadoop Catalog, will run extra logic on server side when responding to a
CREATE OR REPLACE ${table}
spark command. ACREATE OR REPLACE ${table}
command will trigger aRemoveSnapshotRef
(implements MetadataUpdate
) change to be sent (within UpdateTableRequest) to REST Catalog server side. When the reference implementation server process this change, it will runremoveRef
method when building the replacement metadata, within which, snapshot log is cleared: https://github.com/apache/iceberg/blob/113c6e7/core/src/main/java/org/apache/iceberg/TableMetadata.java#L1250. However, Hive and Hadoop Catalog does not run that method when building replacement metadata. At the end of the day, runningCREATE OR REPLACE ${table}
on top of REST Catalog will result in failure of validating snapshot log is kept intact afterCREATE OR REPLACE
command.Questions
snapshot-log
should get cleared, then doessnapshots
list itself need to be cleared as well?snapshot-log
orsnapshots
on table replacement; and that since REST catalog implementation is not controlled by this repo, it can choose its implementation details freely (whether clear or not clearsnapshot-log
&snapshots
on replacement). In this case, should we change the REST Catalog reference implementation (RESTCatalogAdapter on JdbcCatalog) so that its behavior is the same as Hive & Hadoop Catalog (where snapshot history is not cleared on replacement)? Or should we change the integration test so that we don't check snapshot history being intact after table replacement?