rabix / bunny

[Legacy] Executor for CWL workflows. Executes sbg:draft-2 and CWL 1.0
http://rabix.io
Apache License 2.0

TES backend workflow bug #410

Closed: adamstruck closed this issue 6 years ago

adamstruck commented 6 years ago

I have a two step workflow:

samtools-workflow.tar.gz

The first step runs and creates the outputs I would expect. However, the TES task message created for the second step doesn't contain any inputs. The only error I saw in the logs is below. Note that bunny seems to be aware of the input file, but it assigns the wrong location to the secondary file: a file:// URI instead of an s3:// one (see the secondaryFiles entry in the DEBUG log line below).

org.rabix.backend.tes.service.TESStorageException: /funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam.bai
    at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl.stageFile(LocalTESStorageServiceImpl.java:99)
    at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl.stageFile(LocalTESStorageServiceImpl.java:103)
    at org.rabix.backend.tes.service.impl.LocalTESStorageServiceImpl.stageFile(LocalTESStorageServiceImpl.java:73)
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$TaskRunCallable.lambda$call$2(LocalTESWorkerServiceImpl.java:256)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$TaskRunCallable.call(LocalTESWorkerServiceImpl.java:254)
    at org.rabix.backend.tes.service.impl.LocalTESWorkerServiceImpl$TaskRunCallable.call(LocalTESWorkerServiceImpl.java:220)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[2018-01-16 09:49:47.399] [DEBUG] Building command line parts...
[2018-01-16 09:49:47.399] [DEBUG] Building command line part for value {metadata=null, format=null, dirname=s3://s3.us-west-2.amazonaws.com/funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/, nameroot=NA12878, path=/funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam, basename=NA12878.bam, size=15236350, nameext=.bam, contents=null, checksum=sha1$54b5851dcaf8f10ff7ff136d0bd9ad59905d552f, location=s3://s3.us-west-2.amazonaws.com/funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam, secondaryFiles=[{metadata=null, format=null, dirname=/funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index, nameroot=NA12878.bam, path=/funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam.bai, basename=NA12878.bam.bai, size=null, nameext=.bai, contents=null, checksum=null, location=file:///funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam.bai, secondaryFiles=[], class=File}], class=File} and schema File
[2018-01-16 09:49:47.399] [INFO] Command line built. CommandLine = CommandLine [parts=[samtools, idxstats, /funnel-test/bunny/b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam], standardIn=null, standardOut=/Users/strucka/scratch/cwl/samtools/inputs/b0a467b2-93c4-4768-b746-816c6d093642/root/idxstats/idxstats.txt, standardError=null]

adamstruck commented 6 years ago

This issue is specific to s3. Maybe it is related to #398?

milos-ljubinkovic commented 6 years ago

There are in fact two bugs here. First, we were ignoring secondary files when they are defined as a single string instead of an array. Second, they were being resolved using the original file's path when they should have used its location.

Added some quick changes to the bugfix/s3-files branch and it looks ok.
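
For readers following along, the two fixes described above can be sketched roughly as below. This is only an illustration, not the code on the bugfix/s3-files branch: the class SecondaryFileSketch and its methods normalizePatterns and resolve are hypothetical names, and the sketch assumes plain CWL suffix patterns (such as ".bai" or "^.bai") rather than expressions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not bunny's actual implementation.
class SecondaryFileSketch {

    // Fix 1: accept secondaryFiles given as a single string as well as a list.
    static List<String> normalizePatterns(Object secondaryFiles) {
        if (secondaryFiles == null) {
            return Collections.emptyList();
        }
        if (secondaryFiles instanceof String) {
            return Collections.singletonList((String) secondaryFiles);
        }
        List<String> patterns = new ArrayList<>();
        for (Object item : (List<?>) secondaryFiles) {
            patterns.add(String.valueOf(item));
        }
        return patterns;
    }

    // Fix 2: resolve the pattern against the primary file's location
    // (for example an s3:// URI), not against its local path.
    static String resolve(String primaryLocation, String pattern) {
        String base = primaryLocation;
        // Each leading '^' removes one file extension from the base name.
        while (pattern.startsWith("^")) {
            int slash = base.lastIndexOf('/');
            int dot = base.lastIndexOf('.');
            if (dot > slash) {
                base = base.substring(0, dot);
            }
            pattern = pattern.substring(1);
        }
        return base + pattern;
    }

    public static void main(String[] args) {
        String location = "s3://s3.us-west-2.amazonaws.com/funnel-test/bunny/"
                + "b0a467b2-93c4-4768-b746-816c6d093642/root/index/NA12878.bam";
        for (String pattern : normalizePatterns(".bai")) {
            System.out.println(resolve(location, pattern));
        }
    }
}
```

With the location from the log above and the pattern ".bai", the resolved secondary file keeps the s3:// prefix instead of falling back to a file:// URI built from the path.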

adamstruck commented 6 years ago

That fixed it, thanks!