nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Variables within closures throw error that they are already defined #804

Open lukasjelonek opened 6 years ago

lukasjelonek commented 6 years ago

Bug report

Expected behavior and actual behavior

Running Groovy code before the script string, or in an exec block, triggers a compilation error when an input variable is referenced inside a closure after it has already been used elsewhere in the process body. The example below includes a workaround: introduce a new variable inside the closure and assign it the value that shall be used.

Program output

N E X T F L O W  ~  version 0.30.2
Launching `./main.nf` [silly_austin] - revision: 34c2c42f4d
ERROR ~ Variable `z` already defined in the process scope @ line 30, column 20.
           int a =  y[z]
                      ^

1 error

Steps to reproduce the problem


Channel.from(1,2,3,4).into{ch_1_1; ch_1_2}
Channel.from(1,1,1,1,1,1).into{ch_2_1; ch_2_2}

process works {

    input:
    val x from ch_1_1.collect()
    val y from ch_2_1.collect()

    script:
    x.each{
        int z = it
        println (z)
        int a =  y[z]
        println a
    }
    """echo done"""
}

process works_not {

    input:
    val x from ch_1_2.collect()
    val y from ch_2_2.collect()

    script:
    x.each{ z ->
        println (z)
        int a =  y[z]
        println a
    }
    """echo done"""

}

Environment

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

robsyme commented 4 years ago

I'm seeing the same problem

process VariableTesting {
    input:
    val foo

    exec:
    if(foo[0] == "a") log.info "Found an A"
    def bar = foo.find {it == "d"}
}

workflow {
    inputs = Channel.from(["a", "b", "c"], ["d", "e", "f"])

    inputs | VariableTesting
}
N E X T F L O W  ~  version 20.10.0
Launching `./main.nf` [infallible_solvay] - revision: 365bf7c54f
Script compilation error
- file : /path/to/main.nf
- cause: Variable `foo` already defined in the process scope @ line 28, column 15.
       def bar = foo.find {
                 ^

1 error

It works if I comment out one of the two exec lines, but if foo is used in both the if statement and in the find, then the error occurs.
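One workaround that may avoid the error (an untested sketch against this snippet, not a confirmed fix) is to copy the input into a fresh local variable once, and only reference the copy afterwards, so the process-scope name appears in at most one statement:

process VariableTesting {
    input:
    val foo

    exec:
    // Alias the input once; all later statements use the alias,
    // so `foo` is only referenced in a single place.
    def my_foo = foo
    if (my_foo[0] == "a") log.info "Found an A"
    def bar = my_foo.find { it == "d" }
}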

krokicki commented 3 years ago

I have the same problem. Here's my process which works as written:

process prepare_spark_work_dir {
    container = "${params.spark_container_repo}/${params.spark_container_name}:${params.spark_container_version}"
    label 'small'

    input:
    val(spark_work_dir)
    val(terminate_name)

    output:
    val(spark_work_dir)

    script:
    def cluster_id = UUID.randomUUID()
    def cluster_work_dir = "${spark_work_dir}/${cluster_id}"
    def terminate_file_name = get_terminate_file_name(cluster_work_dir, terminate_name)
    def write_session_id = create_write_session_id_script(cluster_work_dir)
    log.debug "Cluster work directory: ${cluster_work_dir}"
    """
    if [[ ! -d "${cluster_work_dir}" ]] ; then
        mkdir -p "${cluster_work_dir}"
    else
        rm -f ${cluster_work_dir}/* || true
    fi
    ${write_session_id}
    """
}

When I add a debug statement at the top of the script, it breaks:

process prepare_spark_work_dir {
    container = "${params.spark_container_repo}/${params.spark_container_name}:${params.spark_container_version}"
    label 'small'

    input:
    val(spark_work_dir)
    val(terminate_name)

    output:
    val(spark_work_dir)

    script:
    log.debug "Cluster work directory: ${spark_work_dir}"
    def cluster_id = UUID.randomUUID()
    def cluster_work_dir = "${spark_work_dir}/${cluster_id}"
    def terminate_file_name = get_terminate_file_name(cluster_work_dir, terminate_name)
    def write_session_id = create_write_session_id_script(cluster_work_dir)
    log.debug "Cluster work directory: ${cluster_work_dir}"
    """
    if [[ ! -d "${cluster_work_dir}" ]] ; then
        mkdir -p "${cluster_work_dir}"
    else
        rm -f ${cluster_work_dir}/* || true
    fi
    ${write_session_id}
    """
}
N E X T F L O W  ~  version 21.04.1
Launching `./pipelines/n5_converter.nf` [soggy_lavoisier] - revision: 2e1927bac5
Module compilation error
- file : /groups/scicompsoft/home/rokickik/dev/expansion-microscopy-pipeline/pipelines/../workflows/../external-modules/spark/lib/./processes.nf
- cause: Variable `spark_work_dir` already defined in the process scope @ line 15, column 31.
       def cluster_work_dir = "${spark_work_dir}/${cluster_id}"
                                 ^
1 error
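One way to sidestep this (an untested sketch; work_dir is a hypothetical name, not part of the original pipeline) is to alias the input into a local variable before any other statement touches it, so spark_work_dir itself is referenced only once:

    script:
    // Alias the input first, then use only the alias below.
    def work_dir = spark_work_dir
    log.debug "Cluster work directory: ${work_dir}"
    def cluster_id = UUID.randomUUID()
    def cluster_work_dir = "${work_dir}/${cluster_id}"
    def terminate_file_name = get_terminate_file_name(cluster_work_dir, terminate_name)
    def write_session_id = create_write_session_id_script(cluster_work_dir)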
stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

mjhipp commented 2 years ago

Issue still present on v21.12.1-edge and v22.03.0-edge

taylorreiter commented 2 years ago

I just encountered this with 22.10.0. Changing the order of my if and def statements under script:, so that the if statements occurred after the def statements, fixed it for me.
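Concretely, that reordering looks something like this (a minimal illustration with hypothetical variable and file names, not the actual pipeline):

process example {
    input:
    val sample

    script:
    // Fails with "Variable `sample` already defined in the process scope":
    //   if (sample == "control") log.info "control sample"
    //   def outname = "${sample}.txt"

    // Works: def statements first, the if afterwards.
    def outname = "${sample}.txt"
    if (sample == "control") log.info "control sample"
    """
    touch ${outname}
    """
}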

blaurenczy commented 1 year ago

I have this same issue when using the params object: print(params) produces the following error:

  - file : /mnt/ssd_disk/git/wes/pipeline_test.nf
  - cause: Variable `params` already defined in the process scope @ line 21, column 44.