nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

task.workDir is null during resume, which breaks exec processes that write files #3962

Open stevekm opened 1 year ago

stevekm commented 1 year ago

Bug report

Expected behavior and actual behavior

When a Nextflow process uses the exec scope, we are required to go through the task.workDir object in order to write files to the task's work directory. (source)

However, when -resume is used, task.workDir becomes null and the pipeline breaks.

Steps to reproduce the problem

I have a demo workflow that uses exec to write a file here:

https://github.com/stevekm/nextflow-demos/blob/master/metaMap-write-JSON/main.nf

The offending process looks like this:

process WRITE_JSON {
    publishDir "${params.outputDir}", mode: 'copy'

    input:
    val(meta)

    output:
    path(outputfile), emit: metaJson

    exec:
    println ">>> WRITE_JSON meta: ${meta}"
    json_str = JsonOutput.toJson(meta)
    println ">>> WRITE_JSON json_str: ${json_str}"
    json_indented = JsonOutput.prettyPrint(json_str)
    println ">>> WRITE_JSON json_indented: ${json_indented}"
    outputfile = new File("${task.workDir}/meta.json")
    outputfile.write(json_indented)
}
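To separate the JSON steps from the file-writing step, the serialization part of the process can be run as plain Groovy outside of Nextflow. This is only a sketch: the helper name toPrettyJson is hypothetical and not part of the demo workflow, and the meta map is copied from the program output in this report.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper (not in the demo workflow): the two JSON steps
// that WRITE_JSON performs before it touches task.workDir.
String toPrettyJson(Map meta) {
    String jsonStr = JsonOutput.toJson(meta)   // compact JSON string
    return JsonOutput.prettyPrint(jsonStr)     // indented JSON string
}

// Same meta map as in the program output of this report
def meta = [sampleID: 'Sample1', sampleType: 'fooType']
println JsonOutput.toJson(meta)
println toPrettyJson(meta)
```

If this runs cleanly on its own, the failure under -resume is most likely confined to the path built from task.workDir, not the JSON handling.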

Program output

It works fine without -resume:

$ nextflow run main.nf
N E X T F L O W  ~  version 22.10.6
Launching `main.nf` [astonishing_lumiere] DSL2 - revision: 26a2d00622
executor >  local (2)
[96/60d8d7] process > WRITE_JSON (1) [100%] 1 of 1 ✔
[b6/d49f95] process > READ_JSON (1)  [100%] 1 of 1 ✔
>>> WRITE_JSON meta: [sampleID:Sample1, sampleType:fooType]
>>> WRITE_JSON json_str: {"sampleID":"Sample1","sampleType":"fooType"}
>>> WRITE_JSON json_indented: {
    "sampleID": "Sample1",
    "sampleType": "fooType"
}
>>> READ_JSON inputJsonPath: /.../nextflow-demos/metaMap-write-JSON/work/96/60d8d77d53904200f7ced710bb790e/meta.json
>>> READ_JSON contents: {
    "sampleID": "Sample1",
    "sampleType": "fooType"
}
>>> READ_JSON metaMap: [sampleID:Sample1, sampleType:fooType]
output channel meta: [sampleID:Sample1, sampleType:fooType]

However, if -resume is used, it breaks:

$ nextflow run main.nf -resume
N E X T F L O W  ~  version 22.10.6
Launching `main.nf` [pedantic_borg] DSL2 - revision: 26a2d00622
executor >  local (1)
[8f/334d37] process > WRITE_JSON (1) [100%] 1 of 1, failed: 1 ✘
[-        ] process > READ_JSON      -
>>> WRITE_JSON meta: [sampleID:Sample1, sampleType:fooType]
>>> WRITE_JSON json_str: {"sampleID":"Sample1","sampleType":"fooType"}
>>> WRITE_JSON json_indented: {
    "sampleID": "Sample1",
    "sampleType": "fooType"
}
WARN: [WRITE_JSON (1)] Unable to resume cached task -- See log file for details
Error executing process > 'WRITE_JSON (1)'

Caused by:
  null/meta.json (No such file or directory) -- Check script 'main.nf' at line: 26

Source block:
  println ">>> WRITE_JSON meta: ${meta}"
  json_str = JsonOutput.toJson(meta)
  println ">>> WRITE_JSON json_str: ${json_str}"
  json_indented = JsonOutput.prettyPrint(json_str)
  println ">>> WRITE_JSON json_indented: ${json_indented}"
  outputfile = new File("${task.workDir}/meta.json")
  outputfile.write(json_indented)

Work dir:
  /.../projects/nextflow-demos/metaMap-write-JSON/work/8f/334d3729a7a65fb9e9608fe1d8d956

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
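The "null/meta.json" in the error is consistent with how Groovy string interpolation treats a null value: it is rendered as the literal text "null", so the process ends up with a relative path instead of failing at the point where task.workDir is read. A minimal plain-Groovy sketch of that behavior follows, together with a guard that fails with a clearer message. The helper names buildPath and outputFileIn are hypothetical. The guard does not fix the resume problem; it only makes the cause visible.

```groovy
// Hypothetical helper mirroring the path construction used in WRITE_JSON
String buildPath(Object workDir) {
    // A null value interpolates as the text "null"
    return "${workDir}/meta.json".toString()
}

// Hypothetical guard: fail early instead of writing to a relative "null/" path
File outputFileIn(Object workDir) {
    if (workDir == null) {
        throw new IllegalStateException('work directory is null, refusing to build an output path')
    }
    return new File(workDir.toString(), 'meta.json')
}

println buildPath(null)   // the relative path seen in the error message

try {
    outputFileIn(null)
} catch (IllegalStateException e) {
    println "Guard triggered: ${e.message}"
}
```

Inside the exec block, the equivalent check would be made on task.workDir before constructing outputfile.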

Additional context

related: https://github.com/nextflow-io/nextflow/issues/2059

bentsherman commented 1 year ago

I think the plan is to make exec code automatically resolve against the task directory so that you don't need to reference it at all. See #2628

stevekm commented 1 year ago

Yeah, I saw that one as well, but I did not see any reference there or elsewhere to the breakage when using -resume, so it was not clear what the intended plans for this functionality were. Thanks.

It would be really great if exec some day became on par with script. Then maybe we could reduce our reliance on external custom scripts and get more done natively in Groovy while still benefiting from the Nextflow process features.

stevekm commented 1 year ago

Looks like it also breaks when -work-dir is an S3 bucket.

bentsherman commented 1 year ago

@stevekm I think that was fixed in the latest edge release.

edupri commented 1 year ago

@bentsherman based on the date of your comment, I am assuming you mean the 23.05.0-edge release. After that, version 23.04.2 was released. I am fairly new to Nextflow, so I am not sure whether the updates in the edge release made it into the official 23.04.2 release, but the issue persists in 23.04.2 (the version I am currently using). I have not tried the latest release yet, but when I use the -resume flag, the task.workDir variable is null.

bentsherman commented 1 year ago

Sorry, I was only talking about the S3 work dir. The original issue about the work dir being null hasn't been fixed yet.