apache / polaris

Apache Polaris, the interoperable, open source catalog for Apache Iceberg
https://polaris.apache.org/
Apache License 2.0

[BUG] DROP WITH PURGE tasks are not guaranteed to complete #269

Closed eric-maynard closed 1 month ago

eric-maynard commented 2 months ago

Is this a possible security vulnerability?

Describe the bug

When dropTable is called with purgeRequested, the Polaris server creates Tasks to delete the data files associated with the table being dropped. However, if the service dies before those Tasks complete, they may not be resumed on startup, so some of the table's data files may never be cleaned up.

To Reproduce

Call dropTable with purgeRequested, and kill the Polaris service before the relevant Tasks complete.

Actual Behavior

Tasks that were still incomplete when the service died are not resumed on startup.

Expected Behavior

Tasks associated with a purge should reliably complete, or the user should be made aware that they cannot be completed.
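
To make the expected behavior concrete, here is a minimal sketch of a resume-on-startup pattern. It is not Polaris code: TaskStore, PurgeTask, Service, and on_startup are hypothetical names, the "durable store" is just an in-memory object shared across simulated restarts, and "storage" is a plain set of file paths. The idea it illustrates is to persist one task per file before deleting anything, keep deletes idempotent, and re-run every task not marked DONE when the service starts.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"


@dataclass
class PurgeTask:
    # Hypothetical task record: one file to delete.
    task_id: int
    file_path: str
    state: TaskState = TaskState.PENDING


@dataclass
class TaskStore:
    """Stand-in for a durable store that survives service restarts."""
    tasks: dict = field(default_factory=dict)

    def add(self, file_path):
        task = PurgeTask(len(self.tasks) + 1, file_path)
        self.tasks[task.task_id] = task
        return task

    def unfinished(self):
        # PENDING and RUNNING tasks both need (re)processing.
        return [t for t in self.tasks.values() if t.state is not TaskState.DONE]


class Service:
    """Hypothetical service instance; a new one is created per 'restart'."""

    def __init__(self, store, storage):
        self.store = store
        self.storage = storage  # set of file paths that still exist

    def drop_table_with_purge(self, files):
        # Persist one task per file before deleting anything.
        for path in files:
            self.store.add(path)

    def run_tasks(self, crash_after=None):
        """Process unfinished tasks; stop early to simulate a crash."""
        for done, task in enumerate(self.store.unfinished()):
            if crash_after is not None and done >= crash_after:
                return False  # simulated crash: remaining tasks stay unfinished
            task.state = TaskState.RUNNING
            self.storage.discard(task.file_path)  # idempotent delete
            task.state = TaskState.DONE
        return True

    def on_startup(self):
        # Resume whatever the previous instance left behind.
        return self.run_tasks()


def demo():
    storage = {"a.parquet", "b.parquet", "c.parquet"}
    store = TaskStore()
    Service(store, storage).drop_table_with_purge(sorted(storage))
    Service(store, storage).run_tasks(crash_after=1)  # dies after one file
    remaining_after_crash = len(storage)
    Service(store, storage).on_startup()              # restart resumes the rest
    return remaining_after_crash, len(storage), len(store.unfinished())
```

Calling demo() simulates a crash after the first file is deleted and then a restart. Because a task left in RUNNING is treated the same as PENDING, a delete interrupted mid-flight is simply retried, which is only safe if the delete is idempotent.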

Additional context

I've met with @collado-mike and others to discuss a long-term solution here, but it may take some time. In the interim, we discussed some workarounds and stopgaps.

System information

n/a