nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Dynamic label directive in a process definition #5038

Open jgolob opened 1 month ago

jgolob commented 1 month ago

New feature

I propose allowing the label directive in a process to be defined dynamically, so that the following would be valid in a process definition:

process SomeName {    
    label { task.attempt > 1 ? 'mem_super' : 'mem_high' }
    errorStrategy { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
    maxRetries 1
.....
}

Usage scenario

This would support the very common case where a task needs more memory than one queue can provide, so that a retry switches not just the requested memory but also the queue (and potentially even the engine). Specific examples would be:

Suggest implementation

I am no wizard at Groovy or Java and have no specific suggestion. I apologize.

bentsherman commented 1 month ago

Note that since a label is essentially shorthand for a group of process directives, you can accomplish the same thing by making those process directives dynamic. For example, since you mentioned memory and queue:

process {
  memory = { task.attempt > 1 ? 16.GB : 8.GB }
  queue = { task.attempt > 1 ? 'high-memory' : 'regular' }
}

jgolob commented 1 month ago

@bentsherman Absolutely.

But with dynamic labels my hope is to keep the details of an HPC / computational environment abstracted away in the config file, and not hard-wired into each process definition.

For example, queue names will not follow a regular pattern (particularly on academic or private clouds).

bentsherman commented 1 month ago

You can already do that. The example I gave is what you would put in the config file.
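As a rough sketch of how this could look in practice, the process keeps a single static label (here `label 'mem_high'`) and the config file resolves it to site-specific settings through a `withLabel` selector with dynamic directives. The queue names `standard` and `bigmem` and the memory sizes are hypothetical placeholders; the exit-status range and retry count are taken from the example at the top of this issue.

```groovy
// nextflow.config (sketch; queue names and memory sizes are assumptions)
process {
    // Applies to every process that declares: label 'mem_high'
    withLabel: 'mem_high' {
        // Closures are re-evaluated per task attempt, so a retry
        // can request more memory and move to a different queue.
        memory        = { task.attempt > 1 ? 64.GB : 32.GB }
        queue         = { task.attempt > 1 ? 'bigmem' : 'standard' }

        // Retry only on exit statuses 137..140, otherwise stop the run.
        errorStrategy = { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
        maxRetries    = 1
    }
}
```

With this arrangement the process definition never mentions a queue name, so moving the pipeline to another cluster should only require editing the config file (or a profile within it).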