yuch7 / cwlexec

A new open source tool to run CWL workflows on LSF

Fails to resolve dependencies when one subworkflow relies on output of another #16

Closed biokcb closed 6 years ago

biokcb commented 6 years ago

Hi,

We have a workflow with two steps. Each step calls a subworkflow that calls a command line tool. When the steps/subworkflows are independent of each other, the run succeeds (two_input_workflow.cwl), but when one step relies on the output of another step (one_input_workflow.cwl), cwlexec fails to resolve the step dependencies and gives the following error:

09:18:12.103 default [pool-2-thread-1] ERROR c.i.s.c.e.e.CWLInstanceSchedulerTask - Fail to run one_input_workflow (Failed to resolve the step (flow2) dependencies.)
09:18:12.105 default [pool-2-thread-1] ERROR c.i.s.c.e.e.CWLInstanceSchedulerTask - The exception stacks:
com.ibm.spectrumcomputing.cwl.model.exception.CWLException: Failed to resolve the step (flow2) dependencies.
    at com.ibm.spectrumcomputing.cwl.exec.util.CWLStepBindingResolver.resolveStepInput(CWLStepBindingResolver.java:178)
    at com.ibm.spectrumcomputing.cwl.exec.util.CWLStepBindingResolver.resolveStepInput(CWLStepBindingResolver.java:142)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowStepRunner.prepareStepCommand(LSFWorkflowStepRunner.java:158)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowStepRunner.resovleExpectDependencies(LSFWorkflowStepRunner.java:111)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowStepRunner.<init>(LSFWorkflowStepRunner.java:65)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowRunner.addSteps(LSFWorkflowRunner.java:272)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowRunner.<init>(LSFWorkflowRunner.java:99)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFWorkflowRunner.runner(LSFWorkflowRunner.java:92)
    at com.ibm.spectrumcomputing.cwl.exec.executor.CWLInstanceSchedulerTask.schedule(CWLInstanceSchedulerTask.java:76)
    at com.ibm.spectrumcomputing.cwl.exec.executor.CWLInstanceSchedulerTask.run(CWLInstanceSchedulerTask.java:62)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

The full output (1inp.out) is attached, along with both the failing and the succeeding example. As far as we can tell, this happens only with subworkflows.

SubworkflowDependenciesError.tar.gz

skeeey commented 6 years ago

This is a bug that occurs when a subworkflow name is used as a reference; see e0072bf

skeeey commented 6 years ago

The -w and -o options should be given absolute paths, e.g.

cwlexec -X -pe PATH -w workdir -o outdir two_input_workflow.cwl inp.yml > 2inp.out

should be

cwlexec -X -pe PATH -w $(pwd)/workdir -o $(pwd)/outdir two_input_workflow.cwl inp.yml > 2inp.out
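
If the directories are sometimes passed as relative and sometimes as absolute paths, a small wrapper can normalize them before calling cwlexec. This is only a sketch: abs_path is a hypothetical helper, not part of cwlexec, and the command is printed with echo rather than run, so the snippet works without cwlexec installed.

```shell
# Hypothetical helper (not part of cwlexec): turn a possibly
# relative path into an absolute one.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;          # already absolute, keep as-is
    *)  printf '%s\n' "$(pwd)/$1" ;;   # relative, anchor at the current directory
  esac
}

workdir=$(abs_path workdir)
outdir=$(abs_path outdir)

# Print the command instead of running it; drop the leading "echo"
# (and add "> 2inp.out" if wanted) to actually invoke cwlexec.
echo cwlexec -X -pe PATH -w "$workdir" -o "$outdir" two_input_workflow.cwl inp.yml
```

For the simple case above, writing $(pwd)/workdir directly is equivalent; the helper only matters when a path may already be absolute.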