nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0
2.79k stars 634 forks source link

params resolution for dicts #3533

Open HalfPhoton opened 1 year ago

HalfPhoton commented 1 year ago

Bug report

Expected behavior and actual behavior

The parameter resolution / prioritisation main.nf < nextflow.config < cli, doesn't work as expected for dictionaries which are defined in the "assignment style" as shown in the example:

I'd expect both to assignment style and dict style definitions to resolve in the same way.

In this example we see that where resolves to nextflow.config as expected. For dicts however, we see that the style used to define the parameters in main.nf has an effect on the resolution. This behaviour is the same for cli arguments too.

// nextflow main.nf --cli_bad.where "cli" --cli_ok.where "cli"

/* nextflow.config:
params.where = "nextflow.config"
params {
    dict_ok {
        where = "nextflow.config"
    }
    dict_bad {
        where = "nextflow.config"
    }
}
*/

// main.nf parameters overwritten by nextflow.config or the cli arguments 

/* nextflow.config:
params.where = "nextflow.config"
params {
    dict_ok {
        where = "nextflow.config"
    }
    dict_bad {
        where = "nextflow.config"
    }
}
*/

// nextflow run 

// main.nf parameters overwritten by nextflow.config
params.where = "main.nf"             // simple assignment
params.dict_ok = [where: "main.nf"]  // dict assignment style
params.dict_bad.where = "main.nf"    // attribute assignment style

params.cli_bad = [:]                 // attribute assignment style
params.cli_bad.where = "main.nf"    

params.cli_ok = [where: "main.nf"]   // dict assignment style

process SIMPLE {
    debug true
    script:
    """
        echo "params.where            OK   nextflow_config = ${params.where}"
        echo "params.dict_ok.where    OK   nextflow.config = ${params.dict_ok.where} "
        echo "params.dict_bad.where   --   nextflow.config = ${params.dict_bad.where}"
        echo "params.cli_ok.where     OK   cli = ${params.cli_ok.where}"
        echo "params.cli_bad.where    --   cli = ${params.cli_bad.where}"
    """
}

workflow {
    SIMPLE()
}

/* Result:
params.where            OK   nextflow_config = nextflow.config
params.dict_ok.where    OK   nextflow.config = nextflow.config 
params.dict_bad.where   --   nextflow.config = main.nf
params.cli_ok.where     OK   cli = cli
params.cli_bad.where    --   cli = main.nf
*/

Steps to reproduce the problem

Example code above with: nextflow main.nf --cli_bad.where "cli" --cli_ok.where "cli"

Program output

Shown in the example above

Environment

Additional context

This might not be a bug but a misunderstanding but it would be nice if this was mentioned in the docs.

I hope this helps anyone else out as well.

fpichon commented 1 year ago

Hi all! I have similar issue with config and dict: I set my config file like this:

params {
    jellyfish {
        mer_len = 30
        size = '100M'
        keep_jf = 'false'
    }
}

When I run the command, willing to change one of the value, I proceed like this:

nextflow run --jellyfish.mer_len 50 main.nf

The value is correctly set, but the two other (size and keep_jf) disappeared and are unset. It seems that the variables set in jellyfish{} are reinitialized instead of being updated. Is it a wanted behavior ? Should not jellyfish be updated instead ? Thanks!

fpichon commented 2 months ago

Any advance on this issue ?

bentsherman commented 1 month ago

I think nested params were never fully supported. Some things work like defining nested params in config, assigning them in the script or on the command line, but there are many edge cases like the ones you have identified

Overall, I would discourage the use of nested params in favor of a flat set of params. You can still have map/dict params with nested values, and still access them as if they were nested params e.g. params.foo.bar.baz, but they should always be defined as a group rather than assuming that they will be deeply merged.

In future I think we should try to clarify this behavior and phase out nested params more explicitly

fpichon commented 1 month ago

OK, thanks a lot for your answer.