radiocosmology / alpenhorn

Alpenhorn is a service for managing an archive of scientific data.
MIT License

feat(io/default): Try to clean up during idle updates #182

Closed ketiltrout closed 2 months ago

ketiltrout commented 2 months ago

This adds an attempt to clean up a DefaultIONode during an idle update by:

Because this routine runs only when the node is idle (i.e. when there's no other I/O occurring), no placeholders should be on the node. Any which are found must be spurious leftovers from prior crashes.

I've also implemented a check to reduce how often it runs. It will always run at start-up (when I suspect most uncleanliness would be found), and then once every 100 times the node transitions from not-idle to idle. Not really sure how often is appropriate. It might even be sufficient to only run it on start-up.
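As a rough illustration of the throttling described above, here is a minimal sketch. It is not alpenhorn's actual code: the class name `IdleCleanupThrottle`, the `should_run` method and the `period` parameter are all hypothetical, and the sketch assumes the first transition to idle after start-up is what counts as "start-up".

```python
class IdleCleanupThrottle:
    """Hypothetical helper: decide whether an idle update should run cleanup."""

    def __init__(self, period: int = 100) -> None:
        self.period = period
        self._was_idle = False
        # Number of not-idle -> idle transitions seen so far
        self._transitions = 0

    def should_run(self, idle: bool) -> bool:
        """Call once per update; returns True when cleanup should run."""
        if not idle:
            self._was_idle = False
            return False
        if self._was_idle:
            # Still idle since the last call: not a new transition
            return False
        self._was_idle = True
        # Transition 0 is start-up, so cleanup always runs then;
        # afterwards it runs on every `period`-th transition.
        run = self._transitions % self.period == 0
        self._transitions += 1
        return run
```

With `period=1` this runs on every transition to idle. Running only at start-up would instead need a one-shot flag rather than a counter.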

While implementing this, I discovered that the code that was deleting acq dirs wasn't stopping at the StorageNode.root, meaning there was a potential to delete the node directory itself (plus anything above that)!

In practice, on DefaultIO nodes, this couldn't happen because all such nodes have an ALPENHORN_NODE file at the top level, but that's not necessarily true for other I/O classes which still use DefaultIO's delete function (for example, the LustreHSM I/O class).

I've fixed this bug while moving the directory deletion code out of delete_async and into its own function in ioutil, because the cleanup task now uses it too.
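To make the nature of the fix concrete, here is a minimal sketch of a bounded directory removal. This is an assumption-laden illustration, not the function added to ioutil: the name `remove_empty_dirs` and its signature are hypothetical, and it simply walks upward from a directory, removing empty ones and stopping strictly below the given root.

```python
from pathlib import Path


def remove_empty_dirs(root: Path, start: Path) -> list[Path]:
    """Remove `start` and its empty ancestors, never touching `root` itself.

    Hypothetical helper; returns the directories that were removed.
    """
    root = root.resolve()
    current = start.resolve()
    removed = []

    # Only act on paths strictly inside root. Once `current` reaches
    # root (or was never under it), the loop stops.
    while root in current.parents:
        try:
            # rmdir() raises OSError if the directory is not empty
            current.rmdir()
        except OSError:
            break
        removed.append(current)
        current = current.parent

    return removed
```

The `root in current.parents` check is what guards against the bug described above: without it, the loop would only stop at the first non-empty directory, which relies on something like ALPENHORN_NODE being present in the node root.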

Also, removed an unnecessary job submission which was deleting zero file copies.