nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

The number of completed tasks from the final report does not match the number of actually started tasks in case of an error #5086

Open cgoina opened 1 week ago

cgoina commented 1 week ago

Bug report

The number of tasks started by a workflow does not match the number shown in the final report if the workflow fails.

Expected behavior and actual behavior

When the workflow starts, Nextflow correctly reports the number of tasks for P1 (0 of 3 in the example below):

[32/7dd20f] P1 (1) [  0%] 0 of 3
[-        ] ERROR -

But if, say, 2 of these tasks complete successfully and 1 does not, the final report states:

[d6/a6925d] P1 (3) [100%] 2 of 2
[af/6915b5] ERROR (1) [100%] 2 of 2, failed: 2
ERROR ~ Error executing process > 'ERROR (2)'

In the final report I would expect P1 to show 2 of 3 completed, not 2 of 2:

[d6/a6925d] P1 (3) [ 66%] 2 of 3
[af/6915b5] ERROR (1) [100%] 2 of 2, failed: 2
ERROR ~ Error executing process > 'ERROR (2)'

It is fine for ERROR to report 2 of 2, because only 2 ERROR tasks were started.
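To make the expectation concrete, here is a small Python sketch of the arithmetic behind the two progress lines for P1. It is only an illustration: `ProcessCounts` and `progress_line` are hypothetical names, not Nextflow internals, and rounding the percentage down is an assumption chosen to match the 66% above.

```python
from dataclasses import dataclass


@dataclass
class ProcessCounts:
    """Hypothetical per-process tally; not a Nextflow class."""
    started: int      # tasks that were submitted at some point
    succeeded: int    # tasks that finished with exit status 0
    failed: int = 0   # tasks that finished with a non-zero exit status

    @property
    def finished(self) -> int:
        return self.succeeded + self.failed


def progress_line(name: str, done: int, total: int) -> str:
    """Format '<name> [NNN%] <done> of <total>', rounding the percentage down."""
    pct = done * 100 // total if total else 0
    return f"{name} [{pct:3d}%] {done} of {total}"


# P1 in the example: 3 tasks started, 2 succeeded, 1 still running at abort.
p1 = ProcessCounts(started=3, succeeded=2)

# Consistent with the observed report: only finished tasks are counted.
reported = progress_line("P1", p1.finished, p1.finished)
# Expected by this report: every started task is counted in the total.
expected = progress_line("P1", p1.finished, p1.started)

print(reported)  # P1 [100%] 2 of 2
print(expected)  # P1 [ 66%] 2 of 3
```

The only difference between the two lines is the denominator: the task that was still running when the workflow aborted is missing from the reported total.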

Steps to reproduce the problem

process P1 {
    input:
    tuple val(id), val(secs)

    output:
    tuple val(id), val(secs)

    script:
    """
    echo "$id -> sleep $secs"
    sleep $secs
    """
}

process ERROR {
    input:
    tuple val(id), val(secs)

    script:
    """
    exit 1
    """
}

workflow {
   ch = Channel.of(
      [1, 10],
      [2, 10],
      [3, 1000],
   )

   ch | P1 | ERROR
}

Program output

nextflow.log

You can see that 3 P1 tasks were started, yet the final report shows only 2 tasks, both successfully completed, at 100% instead of 2 of 3 completed.

The problem is even more evident if the input ch is:

   ch = Channel.of(
      [1, 10],
      [2, 10],
      [3, 10],
      [4, 10],
      [5, 10],
      [6, 1000],
      [7, 1000],
      [8, 1000],
   )

Environment

Additional context

This reporting is extremely confusing when viewed in Tower, which shows an error alongside 100% of the tasks completed.