yuch7 / cwlexec

A new open source tool to run CWL workflows on LSF

Fail to write scatter values upon scattering on files #7

Closed by drjrm3 6 years ago

drjrm3 commented 6 years ago

We have a three-step pipeline (map, foo, reduce) where map creates N files, foo transforms a file into another file, and reduce cats all files into one. When we scatter on foo with CWLEXEC, it only performs foo on one file and delivers it (with success) to reduce, even though a Java error is logged:

16:28:26.595 default [pool-4-thread-2] ERROR c.i.s.c.e.u.outputs.OutputsCapturer - Fail to write scatter values

java.nio.file.FileAlreadyExistsException: /home/jmichael/CWLEXEC/FailToWriteScatterValuesError/workdir/a5efe1f2-2a7d-42d9-906b-049c154ffff2/foo/1.foo.txt
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.createFile(Files.java:632)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.writeScatterValues(OutputsCapturer.java:459)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.findScatterOutputValue(OutputsCapturer.java:444)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.findScatterOuputValue(OutputsCapturer.java:262)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.captureCommandOutputsByType(OutputsCapturer.java:181)
    at com.ibm.spectrumcomputing.cwl.exec.util.outputs.OutputsCapturer.captureCommandOutputs(OutputsCapturer.java:94)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.captureStepOutputs(LSFBwaitExecutorTask.java:373)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.makeStepSuccessful(LSFBwaitExecutorTask.java:142)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.waitSteps(LSFBwaitExecutorTask.java:132)
    at com.ibm.spectrumcomputing.cwl.exec.executor.lsf.LSFBwaitExecutorTask.run(LSFBwaitExecutorTask.java:97)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
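
For readers unfamiliar with the exception at the top of the trace: `java.nio.file.Files.createFile` fails if the target path already exists, so a second attempt to create the same path throws `FileAlreadyExistsException`. The following is a minimal sketch of that behavior only. The class and method names are hypothetical, and it is not taken from the cwlexec sources.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of cwlexec.
public class CreateFileCollision {

    // Creates dir/name twice and reports whether the second attempt
    // fails with FileAlreadyExistsException.
    static boolean secondCreateFails(Path dir, String name) throws IOException {
        Path target = dir.resolve(name);
        Files.createFile(target);          // first creation succeeds
        try {
            Files.createFile(target);      // same path again
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;                   // the error seen in the log above
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("foo");
        System.out.println(secondCreateFails(dir, "1.foo.txt"));
    }
}
```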

FailToWriteScatterValuesError.tar.gz

liuboxa commented 6 years ago

There are two reasons for this issue:

  1. We did not separate the work directory for each scatter step. Currently, we separate the work directory for each scatter step by scatter index.
  2. When we bind the inputs of a scatter step to a command, we cast the inputs from an array to a single element. We should not do this casting for a scatter step.
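
The first point can be illustrated with a small sketch: if each scatter job gets its own subdirectory keyed by scatter index, jobs that write a file with the same name no longer collide. This only illustrates the idea and is not the actual cwlexec fix. The class name, the method names, and the `scatter<index>` directory layout are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; names and directory layout are assumptions,
// not the cwlexec implementation.
public class ScatterWorkDirs {

    // Returns (and creates if needed) a per-job directory: <stepDir>/scatter<index>.
    static Path scatterDir(Path stepDir, int index) throws IOException {
        return Files.createDirectories(stepDir.resolve("scatter" + index));
    }

    // Creates the same file name once per scatter job, each in its own
    // directory, and returns how many files were created.
    static int writeAll(Path stepDir, String fileName, int jobs) throws IOException {
        int created = 0;
        for (int i = 0; i < jobs; i++) {
            // Files.createFile would throw FileAlreadyExistsException here
            // if all jobs shared one directory.
            Files.createFile(scatterDir(stepDir, i).resolve(fileName));
            created++;
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        Path stepDir = Files.createTempDirectory("foo");
        System.out.println(writeAll(stepDir, "1.foo.txt", 3));
    }
}
```

With a shared directory, the second iteration would fail in the same way as the stack trace in the report. With one directory per index, each job writes to a distinct path.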