nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0
2.77k stars 630 forks source link

resourceLabels in config are overwritten by process #5151

Open tvanbaak opened 4 months ago

tvanbaak commented 4 months ago

Bug report

Expected behavior and actual behavior

resourceLabels can be specified in a pipeline process multiple times and the results are all merged together into the final set of name-value pairs. resourceLabels can also be specified in the config. There are some unintuitive behaviors related to this:

  1. resourceLabels can't be specified multiple times in a config. The later config value replaces the earlier one rather than merging the values together the way it works in a process block in the pipeline.

  2. resourceLabels in the pipeline replaces, rather than merges with, the value of resourceLabels in the config.

If this is as intended, the documentation of resourceLabels should mention this.

It might be desirable to change the behavior to be consistent, but it would probably result in a backwards-incompatible change no matter how you do it.

Steps to reproduce the problem

nextflow.enable.dsl=2

process SAY_HELLO {
    resourceLabels four: 'four'
    resourceLabels five: 'five'
    script:
    """
    echo "Process: ${task.process}"
    echo "Tag: ${task.tag}"
    echo "Resource labels: ${task.resourceLabels}"
    """
}

workflow {
  SAY_HELLO()
}
process {
    debug = true
    tag = "one"
    resourceLabels = [ "two": "two" ]
    resourceLabels = [ "three": "three" ]
}

Program output

Running with the above:

$ nextflow run hello.nf
N E X T F L O W  ~  version 22.10.6
Launching `hello.nf` [exotic_elion] DSL2 - revision: cb40012529
executor >  local (1)
[f9/6b5a77] process > SAY_HELLO (one) [100%] 1 of 1 ✔
Process: SAY_HELLO
Tag: one
Resource labels: [four:four, five:five]

Note the absence of two and three.

If you remove the four and five from hello.nf, you get the following instead:

$ nextflow run hello.nf
N E X T F L O W  ~  version 22.10.6
Launching `hello.nf` [tiny_leavitt] DSL2 - revision: 255098e2b2
executor >  local (1)
[76/dad434] process > SAY_HELLO (one) [100%] 1 of 1 ✔
Process: SAY_HELLO
Tag: one
Resource labels: [three:three]

Note the absence of two.

Environment

Additional context

(Add any other context about the problem here)

bentsherman commented 4 months ago

My guess is that resourceLabels needs to be added to the list of repeatable directives: https://github.com/nextflow-io/nextflow/blob/12b027ee7e70d65bdee912856478894af4602170/modules/nextflow/src/main/groovy/nextflow/script/ProcessConfig.groovy#L119

bentsherman commented 4 months ago

Also, even if this issue is fixed, you can't repeat settings in the config. The second one will overwrite the first.

pditommaso commented 4 months ago

It's not repeatable on purpose. The docs should how to provide multiple labels in the same directive

process my_task {
    resourceLabels region: 'some-region', user: 'some-username'

    '''
    <task script>
    '''
}
tvanbaak commented 4 months ago

It's not repeatable on purpose.

I repeated it in hello.nf above and it worked. That's what made me think it should work to set it in both config and pipeline. Is that behavior a bug, then?

bentsherman commented 4 months ago

I think it worked in the process definition because of how the directive is defined: https://github.com/nextflow-io/nextflow/blob/12b027ee7e70d65bdee912856478894af4602170/modules/nextflow/src/main/groovy/nextflow/script/ProcessConfig.groovy#L755-L768

But I wonder if it was written this way only to support merging of process definition and process config

risoms commented 1 week ago

With the use of withName, it is possible to share resource labels across modules. Example:

process {
    resourceLabels = [
        'Project': 'pipeline'
    ]
    withName: debug {
        resourceLabels += ['pipeline:module': 'debug']
        cpus = 2
    }
}

However this doesn't seem to be possible from within a process.

bentsherman commented 6 days ago

Also, the += for modifying config options is not supported by the strict config syntax in the language server (and eventually Nextflow itself)