Open bentsherman opened 2 weeks ago
The challenge I see is that I don't think the output files are collected when a task fails in a retryable way. We could just delete the task directory, or try to infer the output files from the declared outputs.
Thanks for looking at this! I would be happy with deleting the whole work directory, or just deleting the large(?) non-symlink files within the directory.
In the meantime I have written a rather hacky bash script that uses find to check .errorcode, uses find again to check and write out the candidate files, and then, after a grace period, removes files with certain endings (.paf, .sam, etc.). I would be loath to share it, though, since automated scripts that delete files are rather risky (even with six checks added to make sure it really is a Nextflow work directory where I am doing the deleting).
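For anyone wanting something concrete to start from, here is a minimal dry-run sketch in Python of the kind of cleanup described above. It is not the bash script mentioned in this comment, just an illustration under stated assumptions: that each task directory contains `.command.sh` and `.command.run`, that the exit status is stored in a file named `.exitcode`, and that the suffix list and all function names (`find_candidates`, `clean`) are hypothetical. It also cannot tell a retried failure from a final one, so the listed files should be reviewed before anything is deleted.

```python
import time
from pathlib import Path

# Assumptions (not confirmed in this thread): files expected in a task directory.
MARKERS = (".command.sh", ".command.run")
EXIT_FILE = ".exitcode"
# Hypothetical list of large intermediate file endings to remove.
SUFFIXES = (".paf", ".sam")


def is_task_dir(path: Path) -> bool:
    """Sanity check: only touch directories that look like task directories."""
    return all((path / marker).is_file() for marker in MARKERS)


def has_failed(path: Path) -> bool:
    """True if the exit status file exists and holds a non-zero integer."""
    exit_file = path / EXIT_FILE
    if not exit_file.is_file():
        return False
    text = exit_file.read_text().strip()
    return text.isdigit() and int(text) != 0


def find_candidates(work_dir, suffixes=SUFFIXES, min_age_s=0.0):
    """List regular, non-symlink files with the given suffixes in failed task dirs."""
    now = time.time()
    hits = []
    for exit_file in sorted(Path(work_dir).rglob(EXIT_FILE)):
        task_dir = exit_file.parent
        if not (is_task_dir(task_dir) and has_failed(task_dir)):
            continue
        for f in sorted(task_dir.iterdir()):
            # Skip symlinks (staged inputs) and anything that is not a plain file.
            if f.is_symlink() or not f.is_file():
                continue
            old_enough = now - f.stat().st_mtime >= min_age_s  # grace period
            if f.suffix in suffixes and old_enough:
                hits.append(f)
    return hits


def clean(work_dir, dry_run=True, **kwargs):
    """Print what would be deleted; delete only when dry_run is False."""
    hits = find_candidates(work_dir, **kwargs)
    for f in hits:
        if dry_run:
            print(f"would delete {f}")
        else:
            f.unlink()
    return hits
```

Running `clean("work")` only prints the candidates; deletion requires an explicit `clean("work", dry_run=False)`, and `min_age_s=3600` would leave anything modified within the last hour alone.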
Tasks that fail and are retried should have their intermediate files cleaned up for the failed attempt. Maybe as an opt-in behavior, in case the user wants to debug.