yuch7 / cwlexec

A new open source tool to run CWL workflows on LSF

CWLEXEC doesn't correctly run several subworkflows #54

Open KateSakharova opened 5 years ago

KateSakharova commented 5 years ago

Hi! I ran into an issue while testing a pipeline with several subworkflows in CWLEXEC. The pipeline works fine with the CWL reference runner (cwltool) but fails in CWLEXEC. Its structure is very simple:

step 1:
-- subworkflow 1:
------ copy the input file to another file
step 2:
-- subworkflow 2:
------ grep the output of step 1 (by condition); the output is stdout
------ copy the result to another file
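For readers without the archive, the layout described above could look roughly like the following CWL v1.0 sketch. It is not the attached pipeline: the step names `step-wf-2`, `step-subwf-1` and `step-subwf-2` and the `grep 2` command are taken from the log below, while `step-wf-1`, every input/output id, and every file name are assumptions made for illustration.

```yaml
# Illustrative sketch only; the real files are in the attached archive.
# Ids and file names (input_file, src, copied, matched, copy_1.txt, ...) are hypothetical.
cwlVersion: v1.0
class: Workflow
requirements:
  SubworkflowFeatureRequirement: {}   # needed because steps run nested workflows
inputs:
  input_file: File
outputs:
  result:
    type: File
    outputSource: step-wf-2/result
steps:
  step-wf-1:                          # step 1 -> subworkflow 1
    in: {src: input_file}
    out: [copied]
    run:
      class: Workflow
      inputs:
        src: File
      outputs:
        copied:
          type: File
          outputSource: step-subwf-1/copied
      steps:
        step-subwf-1:                 # copy the input file to another file
          in: {src: src}
          out: [copied]
          run:
            class: CommandLineTool
            baseCommand: cp
            inputs:
              src:
                type: File
                inputBinding: {position: 1}
            arguments:
              - {valueFrom: copy_1.txt, position: 2}
            outputs:
              copied:
                type: File
                outputBinding: {glob: copy_1.txt}
  step-wf-2:                          # step 2 -> subworkflow 2
    in: {src: step-wf-1/copied}
    out: [result]
    run:
      class: Workflow
      inputs:
        src: File
      outputs:
        result:
          type: File
          outputSource: step-subwf-2/copied
      steps:
        step-subwf-1:                 # grep the output of step 1, result is stdout
          in: {src: src}
          out: [matched]
          run:
            class: CommandLineTool
            baseCommand: [grep, "2"]
            inputs:
              src:
                type: File
                inputBinding: {position: 1}
            stdout: matched.txt
            outputs:
              matched:
                type: stdout
        step-subwf-2:                 # copy the grep result to another file
          in: {src: step-subwf-1/matched}
          out: [copied]
          run:
            class: CommandLineTool
            baseCommand: cp
            inputs:
              src:
                type: File
                inputBinding: {position: 1}
            arguments:
              - {valueFrom: copy_2.txt, position: 2}
            outputs:
              copied:
                type: File
                outputBinding: {glob: copy_2.txt}
```

In this sketch the second step inside `step-wf-2` depends on the stdout file of the first one, which matches the point in the log where the wait for `step-wf-2/step-subwf-2` fails.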

The error is:

------------------------------------------------------------
Successfully completed.
Resource usage summary:
    CPU time :                                   0.02 sec.
    Max Memory :                                 -
    Average Memory :                             -
    Total Requested Memory :                     -
    Delta Memory :                               -
    Max Swap :                                   -
    Max Processes :                              -
    Max Threads :                                -
    Run time :                                   7 sec.
    Turnaround time :                            1 sec.
The output (if any) is above this job summary.

[13:32:02.086] INFO  - Fill out commands in the script <path>/step-wf-2/step-subwf-1/step-wf-2_step-subwf-1:
grep 2  <command>
[13:32:02.090] INFO  - Resuming job (step-wf-2/step-subwf-1) <1896579> with
bresume \
1896579
[13:32:02.236] INFO  - Started to wait for jobs by
bwait \
-w \
done(1896579)
[13:32:04.773] INFO  - The job (step-wf-2/step-subwf-1) <1896579> is done with stdout from LSF:
------------------------------------------------------------
Job <<path>/step-wf-2/step-subwf-1/step-wf-2_step-subwf-1> was submitted from host <host> by user <user> in cluster <cluster> at Wed Sep 11 13:32:00 2019
Job was executed on host(s) <host>, in queue <queue>, as user <user> in cluster <cluster> at Wed Sep 11 13:32:03 2019
<dirr> was used as the home directory.
<path/step-wf-2/step-subwf-1> was used as the working directory.
Started at Wed Sep 11 13:32:03 2019
Terminated at Wed Sep 11 13:32:03 2019
Results reported at Wed Sep 11 13:32:03 2019
------------------------------------------------------------
# LSBATCH: User input
path/step-wf-2/step-subwf-1/step-wf-2_step-subwf-1
------------------------------------------------------------
Successfully completed.
Resource usage summary:
    CPU time :                                   0.02 sec.
    Max Memory :                                 -
    Average Memory :                             -
    Total Requested Memory :                     -
    Delta Memory :                               -
    Max Swap :                                   -
    Max Processes :                              -
    Max Threads :                                -
    Run time :                                   2 sec.
    Turnaround time :                            3 sec.
The output (if any) is above this job summary.

[13:32:04.837] ERROR - Failed to wait for job step-wf-2/step-subwf-2 <1896578>, null
[13:32:04.837] ERROR - The workflow (test-pipeline) exited with <255>.
[13:32:04.837] WARN  - killing waiting job (step-wf-2/step-subwf-2) <1896578>.

I don't see this problem when I run the steps without subworkflows. This case is very important to me, though, because I use a similar structure in more complicated workflows and tools.

All scripts are attached in the archive for_issue.zip.

Thank you! Kate