nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io

Pipeline uses cached process despite edited eval script #5470

Open CormacKinsella opened 1 week ago

CormacKinsella commented 1 week ago

Bug report

Expected behavior and actual behavior

Expected: editing the eval script of a process output invalidates the task cache, so a resumed run re-executes the task and the output reflects the new command. Actual: the cached task is reused and the output is unchanged.

Steps to reproduce the problem

  1. Create nextflow.config and main.nf from the example below
  2. Run the pipeline
  3. Edit the eval statement to 'multiqc --version | sed "s/version//"' (the edited line is shown after the example)
  4. Rerun the pipeline -> the output does not change
nextflow.config:

resume = true
apptainer.enabled = true
apptainer.autoMounts = true

main.nf:

nextflow.enable.dsl=2
nextflow.preview.topic = true

process FOO {
    tag "${id}"
    container "quay.io/biocontainers/multiqc:1.14--pyhdfd78af_0"

    input:
    val(id)

    output:
    tuple val(task.process), eval('multiqc --version'), topic: versions

    script:
    """
    """
}

workflow {
    input_ch = Channel.of("Sample1")

    FOO(input_ch)

    channel.topic('versions').view()
}
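
For reference, after step 3 the output block in FOO would read as follows (the rest of the process is unchanged):

    output:
    tuple val(task.process), eval('multiqc --version | sed "s/version//"'), topic: versions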

Program output

Before edit: [FOO, multiqc, version 1.14]

After edit: [FOO, multiqc, version 1.14]

Expected after edit: [FOO, multiqc, 1.14] (note: the expected output is obtained if resume = true is moved from nextflow.config into main.nf)

Environment

Additional context

bentsherman commented 1 week ago

Good point, the eval script was never added to the task hash, so editing it does not invalidate the cached result.

bentsherman commented 10 hours ago

@jorgee this one should be pretty easy if you'd like to try. Basically we need to add the eval outputs to the task hash here:

https://github.com/nextflow-io/nextflow/blob/8041a5799bd2187e4758d453a924ffc75be4e656/modules/nextflow/src/main/groovy/nextflow/processor/TaskProcessor.groovy#L2200-L2205

Something like this:

        // add inputs ...
        // ...

        // add eval outputs
        for( Map.Entry<OutParam,Object> it : task.outputs ) {
            if( it.key instanceof CmdEvalParam )
                keys.add( it.key.getTarget(task.context) )
        }
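
With the eval targets included in the hash keys, editing the eval command changes the task hash, so a resumed run re-executes the task instead of reusing the cached result.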