common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0

cwltool:ProcessGenerator output #1819

Open jfennick opened 1 year ago

jfennick commented 1 year ago

According to the ProcessGenerator documentation, https://github.com/common-workflow-language/cwltool/blob/main/docs/processgen.rst

The output of the generated script is used as the output for ProcessGenerator as a whole.

However, it appears that the final output is the generated process itself (i.e. the output of the embedded script), rather than the output of the generated process.

cwltool --enable-dev --enable-ext pytoolgen.txt zing.txt 
INFO /Users/jakefennick/miniconda3/envs/base/bin/cwltool 3.1.20221201130942
INFO Resolved 'pytoolgen.txt' to 'file:///Users/jakefennick/pytoolgen.txt'
INFO [job cc371ad5-a371-4cf8-a514-28e56d3e15c0] /private/tmp/docker_tmpt87k7ul2$ python \
    inp.py > /private/tmp/docker_tmpt87k7ul2/main.cwl
INFO [job cc371ad5-a371-4cf8-a514-28e56d3e15c0] completed success
INFO [job main.cwl] /private/tmp/docker_tmp0h47zegt$ echo \
    blurf
blurf
INFO [job main.cwl] completed success
{
    "runProcess": {
        "location": "file:///Users/jakefennick/main.cwl",
        "basename": "main.cwl",
        "class": "File",
        "checksum": "sha1$0cbf24b48a78fb6559954b959da950120aeb514f",
        "size": 97,
        "path": "/Users/jakefennick/main.cwl"
    }
}
INFO Final process status is success
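
To make the symptom easy to check in a script, here is a minimal Python sketch that tests whether a final output object is just the generated process file. The helper name is_generated_process_only is hypothetical and not part of cwltool; the JSON is copied from the run above.

```python
import json

# Final output object copied verbatim from the run above.
OBSERVED = json.loads("""
{
    "runProcess": {
        "location": "file:///Users/jakefennick/main.cwl",
        "basename": "main.cwl",
        "class": "File",
        "checksum": "sha1$0cbf24b48a78fb6559954b959da950120aeb514f",
        "size": 97,
        "path": "/Users/jakefennick/main.cwl"
    }
}
""")


def is_generated_process_only(output: dict) -> bool:
    """Hypothetical helper: True if the only output is a File named runProcess."""
    if set(output) != {"runProcess"}:
        return False
    value = output["runProcess"]
    return isinstance(value, dict) and value.get("class") == "File"


print(is_generated_process_only(OBSERVED))  # True: matches the symptom reported here
print(is_generated_process_only({}))        # False: an empty output object does not match
```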

Note that the attached pytoolgen.txt explicitly declares outputs: {}, which is apparently ignored.

Moreover, if any other top-level outputs are given (even constant literals) they are again ignored, along with the output of the generated process.

outputs:
  myoutput:
    type: string
    outputBinding:
      outputEval: $("myoutput")

cwltool --enable-dev --enable-ext pytoolgen_myoutput.txt zing.txt
INFO /Users/jakefennick/miniconda3/envs/base/bin/cwltool 3.1.20221201130942
INFO Resolved 'pytoolgen_myoutput.txt' to 'file:///Users/jakefennick/pytoolgen_myoutput.txt'
URI prefix 'cwltool' of 'cwltool:loop' not recognized, are you missing a $namespaces section?
INFO [job _:8190dc70-d058-42c3-a9cc-4044df633669] /private/tmp/docker_tmpx2398i1e$ python \
    inp.py > /private/tmp/docker_tmpx2398i1e/main.cwl
INFO [job _:8190dc70-d058-42c3-a9cc-4044df633669] completed success
INFO [job main.cwl] /private/tmp/docker_tmpvxeig3jm$ echo \
    blurf
blurf
INFO [job main.cwl] completed success
{
    "runProcess": {
        "location": "file:///Users/jakefennick/main.cwl",
        "basename": "main.cwl",
        "class": "File",
        "checksum": "sha1$0cbf24b48a78fb6559954b959da950120aeb514f",
        "size": 97,
        "path": "/Users/jakefennick/main.cwl"
    }
}
INFO Final process status is success

The plot thickens if we try to wrap a ProcessGenerator in a single step workflow:

cwltool --enable-dev --enable-ext pytoolgen_wrap.txt zing.txt 
INFO /Users/jakefennick/miniconda3/envs/base/bin/cwltool 3.1.20221201130942
INFO Resolved 'pytoolgen_wrap.txt' to 'file:///Users/jakefennick/pytoolgen_wrap.txt'
URI prefix 'cwltool' of 'cwltool:loop' not recognized, are you missing a $namespaces section?
ERROR Tool definition failed validation:
pytoolgen_wrap.txt:8:9: Workflow step output 'runProcess' does not correspond to
pytoolgen.txt:8:1:             tool output (expected '')
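
The "(expected '')" in this message is consistent with the step's out list being checked against the outputs: {} declared in pytoolgen.txt. The following Python sketch illustrates that kind of check. It is not cwltool's actual implementation; the function names are hypothetical, and it assumes the step in pytoolgen_wrap.txt lists runProcess under out.

```python
def undeclared_step_outputs(step_out: list, tool_outputs: dict) -> list:
    """Return the step outputs that the wrapped tool does not declare."""
    declared = set(tool_outputs)
    return [name for name in step_out if name not in declared]


def format_error(name: str, tool_outputs: dict) -> str:
    """Build a message shaped like the validation error in the log above."""
    expected = ", ".join(sorted(tool_outputs))
    return (
        f"Workflow step output '{name}' does not correspond to "
        f"tool output (expected '{expected}')"
    )


# Assumed inputs: the step asks for runProcess, the tool declares outputs: {}.
step_out = ["runProcess"]
tool_outputs = {}

for name in undeclared_step_outputs(step_out, tool_outputs):
    print(format_error(name, tool_outputs))
```

With an empty outputs mapping the expected list is the empty string, which is why any step output name fails validation.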

So it appears that the outputs are NOT being ignored w.r.t. a parent workflow. Moreover, if we explicitly shadow the (correct?) output, we get a different error:

cwltool --enable-dev --enable-ext pytoolgen_wrap_runProcess.txt zing.txt 
INFO /Users/jakefennick/miniconda3/envs/base/bin/cwltool 3.1.20221201130942
INFO Resolved 'pytoolgen_wrap_runProcess.txt' to 'file:///Users/jakefennick/pytoolgen_wrap_runProcess.txt'
URI prefix 'cwltool' of 'cwltool:loop' not recognized, are you missing a $namespaces section?
INFO [workflow ] start
INFO [workflow ] starting step step1
INFO [step step1] start
INFO [job step1] /private/tmp/docker_tmpk0zpx3sq$ python \
    inp.py > /private/tmp/docker_tmpk0zpx3sq/main.cwl
INFO [job step1] completed success
INFO [job step1_2] /private/tmp/docker_tmprbbqaggh$ echo \
    blurf
blurf
INFO [job step1_2] completed success
ERROR [step step1] Output is missing expected field file:///Users/jakefennick/pytoolgen_wrap_runProcess.txt#step1/runProcess
WARNING [step step1] completed permanentFail
INFO [workflow ] completed permanentFail
{
    "runProcess": null
}
WARNING Final process status is permanentFail

Expected Behavior

pytoolgen.txt should return the output from the generated process, not the generated process itself. pytoolgen_myoutput.txt should also return "myoutput".

pytoolgen_wrap.txt and pytoolgen_wrap_runProcess.txt should return the same output as the underlying ProcessGenerator (whatever that might be).

Actual Behavior

For pytoolgen.txt and pytoolgen_myoutput.txt, the generated process itself is returned, not its output.

pytoolgen_wrap.txt and pytoolgen_wrap_runProcess.txt return the above errors.

Workflow Code

pytoolgen_runProcess.txt pytoolgen_myoutput.txt pytoolgen_wrap_runProcess.txt pytoolgen.txt pytoolgen_wrap.txt zing.txt

Full Traceback

See above

Your Environment

tetron commented 1 year ago

This was fixed already in https://github.com/common-workflow-language/cwltool/pull/1813

jfennick commented 1 year ago

Looks like I was one day too late! Thanks, I can confirm that with #1813 the first two examples are now fixed. However, I'm still getting the same errors for the last two examples:

cwltool --enable-dev --enable-ext pytoolgen_wrap.txt zing.txt

cwltool --enable-dev --enable-ext pytoolgen_wrap_runProcess.txt zing.txt