apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0
Bugfix for incorrect Deletion of Snapshot Metadata Due to OutOfMemoryError #11576

Closed: ZhendongBai closed this 1 day ago

ZhendongBai commented 3 days ago

Bugfix for #11575. This fixes the mistaken deletion of a table's snapshot metadata file when an unexpected error, such as OutOfMemoryError, is thrown while the checkCommitStatus method is executing. The org.apache.iceberg.hive.HiveViewOperations.checkCurrentMetadataLocation method performs a refresh that downloads and updates the current table metadata, and this consumes memory, so the check itself can fail. To address this, we now set the commit status to UNKNOWN before checking the commit status, so that a failure during the check no longer leads to inadvertent deletion of snapshot metadata files.
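The ordering issue described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual Iceberg code: the class `CommitStatusSketch`, the `markUnknownFirst` flag, the in-memory `Set` standing in for file storage, and the file name `v2.metadata.json` are all hypothetical, and the status check is hard-wired to throw a simulated `OutOfMemoryError`.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of the commit flow; all names here are illustrative only.
final class CommitStatusSketch {
  enum CommitStatus { FAILURE, SUCCESS, UNKNOWN }

  // Hypothetical stand-in for the status check: it always runs out of memory,
  // mimicking a metadata refresh that exhausts the heap.
  static CommitStatus checkCommitStatus() {
    throw new OutOfMemoryError("simulated: metadata refresh exhausted the heap");
  }

  // Simulates a commit whose persist step fails ambiguously, so the status
  // has to be verified afterwards. Cleanup runs in the finally block.
  static void commit(Set<String> storage, String newMetadataFile, boolean markUnknownFirst) {
    storage.add(newMetadataFile); // the new metadata file has been written
    CommitStatus status = CommitStatus.FAILURE;
    try {
      throw new RuntimeException("simulated: metastore call timed out");
    } catch (RuntimeException e) {
      if (markUnknownFirst) {
        // Patched ordering: record UNKNOWN before the check can blow up.
        status = CommitStatus.UNKNOWN;
      }
      status = checkCommitStatus(); // throws, so this assignment never happens
    } finally {
      // Only a definite FAILURE justifies deleting the new metadata file.
      if (status == CommitStatus.FAILURE) {
        storage.remove(newMetadataFile);
      }
    }
  }

  // Returns true if the metadata file is still present after the failed check.
  static boolean fileSurvives(boolean markUnknownFirst) {
    Set<String> storage = new HashSet<>();
    try {
      commit(storage, "v2.metadata.json", markUnknownFirst);
    } catch (OutOfMemoryError e) {
      // Expected in this simulation; the error propagates after cleanup.
    }
    return storage.contains("v2.metadata.json");
  }

  public static void main(String[] args) {
    System.out.println("unpatched ordering, file kept: " + fileSurvives(false));
    System.out.println("patched ordering, file kept: " + fileSurvives(true));
  }
}
```

In this model the unpatched ordering leaves the status at its initial FAILURE value when the check throws, so the cleanup step deletes a file that may in fact have been committed. With the status set to UNKNOWN first, the same error leaves the file in place.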

pvary commented 3 days ago

@ZhendongBai: Could you please add some unit tests which fail before the patch, and succeed after the fix?

pvary commented 3 days ago

@ZhendongBai: TestHiveCommits contains plenty of examples where the different error scenarios are tested. I think it should be possible to create one which tests the scenario mentioned in the description.

ZhendongBai commented 3 days ago

> @ZhendongBai: TestHiveCommits contains plenty of examples where the different error scenarios are tested. I think it should be possible to create one which tests the scenario mentioned in the description.

@pvary Thanks, I have added some test cases. Please review again.

pvary commented 17 hours ago

@ZhendongBai: Thanks for the fix, and sorry for missing your reply. Also thanks @RussellSpitzer for merging! 😄