WIPACrepo / lta

Long Term Archive
MIT License

Bundler removes broken files of previous attempts #261

Closed: blinkdog closed this issue 1 year ago

blinkdog commented 1 year ago

In practice, one of the trickiest parts of automatically recovering quarantined bundles is that the broken files (the metadata file and the zip bundle) still exist on disk.

When the next attempt comes along, it fails again, because the Bundler refuses to overwrite existing files.

Cleaning these up when an exception occurs would be nice, but we can't guarantee the process won't be killed (e.g. Kubernetes destroys the pod), so we have to handle the 'broken files left in the work directory' case no matter what.

An external cleaner runs into two issues:

So the straightforward solution is: Delete any existing metadata and zip bundle files by the names that we intend to create.

Since the file names contain a UUID, it is very unlikely that a file would exist by that name unless we were the ones who created it. As part of the Bundler process, we already have the proper permissions to delete the files if they exist. If the files do exist, we are removing broken files that we weren't going to use anyway, and we are then ready to create the new ones without conflicts. If the files don't exist, no harm, no foul.
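A minimal sketch of that pre-creation cleanup, in Python, might look like the following. The function name `remove_stale_outputs`, the `work_dir` and `bundle_uuid` parameters, and the `.metadata.ndjson` and `.zip` suffixes are assumptions made for illustration; they are not taken from the actual Bundler code.

```python
from pathlib import Path
from typing import List

# Hypothetical suffixes for the two files the Bundler intends to create.
STALE_SUFFIXES = (".metadata.ndjson", ".zip")


def remove_stale_outputs(work_dir: str, bundle_uuid: str) -> List[str]:
    """Delete leftover files named after this bundle's UUID.

    Returns the paths that were actually removed. Files that do not
    exist are skipped silently (no harm, no foul).
    """
    removed = []
    for suffix in STALE_SUFFIXES:
        path = Path(work_dir) / f"{bundle_uuid}{suffix}"
        try:
            path.unlink()
            removed.append(str(path))
        except FileNotFoundError:
            # Nothing left over from a previous attempt.
            pass
    return removed
```

Calling this right before the files are created means a killed or crashed earlier attempt cannot block the retry, and catching `FileNotFoundError` rather than checking for existence first avoids a race between the check and the delete.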

This should make it easier to restart LTA bundles that failed in the bundler step.