Open robsyme opened 1 month ago
If it's not supported, there's likely a reason...
It should be possible since the task inputs are resolved before the hash is computed: https://github.com/nextflow-io/nextflow/blob/e2e608140cdde1da39df4c911f56286015538228/modules/nextflow/src/main/groovy/nextflow/processor/TaskProcessor.groovy#L2240-L2241
Unless anyone has picked this up, I will give it a shot!
Feel free. I think you just need to get the hash mode from the TaskConfig instead of the ProcessConfig.
What's the use case for this?
To force a specific task to be recomputed.
We had a case where, even though the task exited with exit status 0, the output files were incomplete/corrupted. The user didn't have easy access to aws s3 rm s3://bucket/path/to/longtaskhashgoeshere/.exitcode, so it would have been convenient to set cache = false for a specific task based on the meta.id.
Too smart! But then, it would be nicer to have a run option for it, e.g. -invalidate-tasks <names>
I always forget that we can just add new features :D
How would you address the task to be retried? By task hash?
I was thinking just process name(s)
Process-level cache invalidation is already possible:
process {
    withName: Example {
        cache = false
    }
}
The problem we're trying to solve here is task-level cache invalidation, so you'd need a way to address a specific task. My feeling is that you'd need to use either task-level variables (meta.id, for example) or the task hash.
New feature
It would be helpful to be able to set the cache directive via a closure.
Usage scenario
It would sometimes be helpful to force a re-run of a specific task (in cases where the outputs are corrupted, for example). For users that don't have access to the run workdir, it would be helpful to set the following configuration:
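A minimal sketch of what such a configuration might look like, assuming a process named Example (as in the process-level snippet from the discussion) that takes a meta map as input. The sample ID 'sampleA' is hypothetical, and the closure form is the behaviour being requested here, not something the cache directive is confirmed to support:

```groovy
// Hypothetical config: the closure form of `cache` is the requested feature.
process {
    withName: Example {
        // Assumes the process has a `meta` input with an `id` field.
        // Disable caching only for the task whose outputs were corrupted;
        // every other task keeps the default caching behaviour.
        cache = { meta.id == 'sampleA' ? false : true }
    }
}
```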
At the moment, this closure is not evaluated; it is simply compared directly to the available options, and we get the warning:
Suggest implementation
Something in ProcessConfig, I suppose :D