Closed: lintingbin closed this 3 weeks ago
@lintingbin Thanks for reporting this bug and trying to fix it.
I suggest you create an issue and link this PR to it. It will help a lot when others try to search for the same issues.
@zhoujinsong The issue has been added and linked to this PR.
Thanks for that.
We can add a prefix [AMORO-${issue number}] to the PR's title (I have added it for you this time).
Why are the changes needed?
When an Iceberg table is created on OSS and clean-orphan-file is executed, normal data files that have just been written but not yet committed to Iceberg are cleaned up. The clean-orphan-file.min-existing-time-minutes parameter does not take effect.
Debugging showed that OSS does not record the file access time, so getAccessTime returns 0. A timestamp of 0 makes every file look older than the configured threshold, which renders the clean-orphan-file.min-existing-time-minutes parameter ineffective.
Iceberg's listPrefix function also uses getModificationTime, so using getModificationTime should be the better choice.
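To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the Amoro code: `OrphanFileAgeCheck`, `FileTimes`, and both method names are hypothetical stand-ins, with `FileTimes` mimicking the two timestamps returned by `getAccessTime` and `getModificationTime`. The 60-minute threshold is an arbitrary example value.

```java
import java.util.concurrent.TimeUnit;

public class OrphanFileAgeCheck {

    /** Hypothetical stand-in for the access and modification timestamps of a file. */
    public static final class FileTimes {
        final long accessTimeMs;        // 0 when the store does not record access time
        final long modificationTimeMs;

        public FileTimes(long accessTimeMs, long modificationTimeMs) {
            this.accessTimeMs = accessTimeMs;
            this.modificationTimeMs = modificationTimeMs;
        }
    }

    /** Age check based on access time: an access time of 0 always looks "old enough". */
    public static boolean oldEnoughByAccessTime(FileTimes f, long nowMs, long minExistingMinutes) {
        long cutoffMs = nowMs - TimeUnit.MINUTES.toMillis(minExistingMinutes);
        return f.accessTimeMs < cutoffMs;
    }

    /** Age check based on modification time: a freshly written file stays protected. */
    public static boolean oldEnoughByModificationTime(FileTimes f, long nowMs, long minExistingMinutes) {
        long cutoffMs = nowMs - TimeUnit.MINUTES.toMillis(minExistingMinutes);
        return f.modificationTimeMs < cutoffMs;
    }

    public static void main(String[] args) {
        long nowMs = 1_700_000_000_000L;   // fixed "now" so the example is deterministic
        long minExistingMinutes = 60;      // arbitrary example threshold

        // File written one minute ago on a store that reports access time as 0.
        FileTimes freshFile = new FileTimes(0L, nowMs - TimeUnit.MINUTES.toMillis(1));

        // true: the file would be treated as an orphan candidate and deleted
        System.out.println(oldEnoughByAccessTime(freshFile, nowMs, minExistingMinutes));
        // false: the file is still inside the protection window
        System.out.println(oldEnoughByModificationTime(freshFile, nowMs, minExistingMinutes));
    }
}
```

With an access time of 0 the first check passes for any positive threshold, which matches the reported symptom that the parameter has no effect on OSS.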
Brief change log
How was this patch tested?
[ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
[x] Add screenshots for manual tests if appropriate
[x] Run test locally before making a pull request
Documentation