apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Incorrect Deletion of Snapshot Metadata Due to OutOfMemoryError #11575

Closed ZhendongBai closed 11 hours ago

ZhendongBai commented 3 days ago

Apache Iceberg version

1.1.0

Query engine

Flink

Please describe the bug 🐞

While the checkCommitStatus method is executing, an unexpected error such as an OutOfMemoryError may occur. The code is shown below:

commitStatus =
   checkCommitStatus(
       viewName,
       newMetadataLocation,
       metadata.properties(),
       () -> checkCurrentMetadataLocation(newMetadataLocation));

The refresh performed by org.apache.iceberg.hive.HiveViewOperations.checkCurrentMetadataLocation downloads and loads the current table metadata. This consumes memory and can lead to an OutOfMemoryError. Note that Tasks may not handle Error throwables and instead rethrows them directly, so the commit status is never updated. As a result, the cleanupMetadataAndUnlock function called in the finally block may delete the table snapshot metadata file, even though it has just been committed.
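The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Iceberg code: CommitCleanupSketch, MetadataFile, commit, and the markUnknownFirst flag are hypothetical names, and it assumes the commit status starts out as FAILURE and that cleanup deletes the metadata file only for FAILURE. The markUnknownFirst branch shows one possible mitigation (marking the status UNKNOWN before running the check); it is not necessarily what the actual fix does.

```java
import java.util.function.BooleanSupplier;

public class CommitCleanupSketch {
    enum CommitStatus { SUCCESS, FAILURE, UNKNOWN }

    /** Hypothetical stand-in for the newly written metadata file. */
    static final class MetadataFile {
        boolean deleted;
    }

    /**
     * Simulates a commit whose metastore call fails ambiguously, followed by a
     * status check. Assumption: status starts as FAILURE and the finally block
     * deletes the file only when the status is FAILURE.
     */
    static void commit(MetadataFile file, BooleanSupplier metadataIsCurrent, boolean markUnknownFirst) {
        CommitStatus status = CommitStatus.FAILURE;
        try {
            throw new RuntimeException("simulated ambiguous metastore failure");
        } catch (RuntimeException e) {
            if (markUnknownFirst) {
                // Possible mitigation: if the check below throws, cleanup sees UNKNOWN.
                status = CommitStatus.UNKNOWN;
            }
            // If this call throws an Error, the assignment never happens.
            status = metadataIsCurrent.getAsBoolean() ? CommitStatus.SUCCESS : CommitStatus.FAILURE;
        } finally {
            // Stand-in for the cleanup step: delete the file only on a definite failure.
            if (status == CommitStatus.FAILURE) {
                file.deleted = true;
            }
        }
    }

    /** Runs commit with a status check that throws a simulated OutOfMemoryError. */
    static boolean deletedAfterSimulatedOom(boolean markUnknownFirst) {
        MetadataFile file = new MetadataFile();
        try {
            commit(file, () -> { throw new OutOfMemoryError("simulated"); }, markUnknownFirst);
        } catch (OutOfMemoryError expected) {
            // The Error still propagates to the caller in both variants.
        }
        return file.deleted;
    }

    public static void main(String[] args) {
        System.out.println("status left at FAILURE, file deleted: " + deletedAfterSimulatedOom(false));
        System.out.println("status set to UNKNOWN, file deleted: " + deletedAfterSimulatedOom(true));
    }
}
```

In the first variant the Error skips the status assignment, so the finally block still sees FAILURE and deletes a file that may already be referenced by the metastore. In the second, the Error still reaches the caller, but the file is left in place.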

pvary commented 3 days ago

@ZhendongBai: Could you please check if this issue still exists in the current connector version?

ZhendongBai commented 3 days ago

@pvary Yes. Despite multiple iterations, the core logic of checkCommitStatus, which calls refresh to get the latest table metadata, has remained unchanged. The code in the problem description is from the main branch. The 1.1.0 version of the code is at https://github.com/apache/iceberg/blob/ede085d0f7529f24acd0c81dd0a43f7bb969b763/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L305, and the refresh method is called at https://github.com/apache/iceberg/blob/ede085d0f7529f24acd0c81dd0a43f7bb969b763/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L336

RussellSpitzer commented 11 hours ago

Fixed in #11576