common-workflow-library / legacy

Deprecated
https://github.com/common-workflow-library/bio-cwl-tools
Apache License 2.0

Error when docker added for Picard tools. #41

Closed FarahZKhan closed 8 years ago

FarahZKhan commented 8 years ago

Hello, I am working on adding a Docker requirement to the Picard tools, as discussed on Gitter. After adding Docker, Picard's createDictionary tool works fine, but I get the following error when I follow exactly the same steps for markDuplicates:

[job 4375347792] /Users/farahkhan/Google Drive/Galaxy_Cpipe_taverna/CWL/GATK-worflow/tools$ docker run -i '--volume=/Users/farahkhan/Google Drive/Galaxy_Cpipe_taverna/CWL/GATK-worflow/tools/outputFiles/sortedSam.bam.bai:/tmp/job344921763_outputFiles/sortedSam.bam.bai:ro' '--volume=/Users/farahkhan/Google Drive/Galaxy_Cpipe_taverna/CWL/GATK-worflow/tools/outputFiles/sortedSam.bam:/tmp/job344921763_outputFiles/sortedSam.bam:ro' '--volume=/Users/farahkhan/Google Drive/Galaxy_Cpipe_taverna/CWL/GATK-worflow/tools:/tmp/job_output:rw' '--volume=/Users/farahkhan/Google Drive/Galaxy_Cpipe_taverna/CWL/GATK-worflow/tools/tmpOutputrsy9YD:/tmp/job_tmp:rw' --workdir=/tmp/job_output --read-only=true --user=1000 --rm --env=TMPDIR=/tmp/job_tmp --env=PATH=/usr/local/bin/:/usr/bin:/bin broadinstitute/picard MarkDuplicates INPUT= /tmp/job344921763_outputFiles/sortedSam.bam OUTPUT= markDups.bam METRICS_FILE= metricsFile-markDups ASSUME_SORTED= true REMOVE_DUPLICATES= true MAX_FILE_HANDLES_FOR_READ_ENDS_MAP= 8000 SORTING_COLLECTION_SIZE_RATIO= 0.25 PROGRAM_RECORD_ID= MarkDuplicates PROGRAM_GROUP_NAME= MarkDuplicates READ_NAME_REGEX= '[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).*' OPTICAL_DUPLICATE_PIXEL_DISTANCE= 100 CREATE_INDEX= true
[Thu Dec 10 06:18:20 UTC 2015] picard.sam.markduplicates.MarkDuplicates MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 INPUT=[/tmp/job344921763_outputFiles/sortedSam.bam] OUTPUT=markDups.bam METRICS_FILE=metricsFile-markDups REMOVE_DUPLICATES=true ASSUME_SORTED=true PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 CREATE_INDEX=true    MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Thu Dec 10 06:18:20 UTC 2015] Executing as ?@f1dfb60b05e7 on Linux 4.0.9-boot2docker amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14; Picard version: 2.0.1() JdkDeflater
INFO    2015-12-10 06:18:20 MarkDuplicates  Start of doWork freeMemory: 75442592; totalMemory: 77135872; maxMemory: 1226506240
INFO    2015-12-10 06:18:20 MarkDuplicates  Reading input file and constructing read end information.
INFO    2015-12-10 06:18:20 MarkDuplicates  Will retain up to 4717331 data points before spilling to disk.
WARNING: BAM index file /tmp/job344921763_outputFiles/sortedSam.bam.bai is older than BAM /tmp/job344921763_outputFiles/sortedSam.bam
[Thu Dec 10 06:18:20 UTC 2015] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=77135872
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Exception creating temporary directory.
    at htsjdk.samtools.util.IOUtil.createTempDir(IOUtil.java:768)
    at htsjdk.samtools.CoordinateSortedPairInfoMap.<init>(CoordinateSortedPairInfoMap.java:59)
    at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.<init>(DiskBasedReadEndsForMarkDuplicatesMap.java:57)
    at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:293)
    at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:139)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:209)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
Caused by: java.io.IOException: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createTempFile(File.java:2024)
    at java.io.File.createTempFile(File.java:2070)
    at htsjdk.samtools.util.IOUtil.createTempDir(IOUtil.java:759)
    ... 7 more
Error while running job: No matches for output file with glob: 'markDups.bam'
[job 4375347792] completed permanentFail
[job 4375347792] {}
Final process status is permanentFail
[job 4375347792] Removing temporary directory /Users/farahkhan/Google Drive/Galaxy_Cpipe_taverna/CWL/GATK-worflow/tools/tmpOutputrsy9YD
Workflow error:
  Process status is ['permanentFail']
Traceback (most recent call last):
  File "build/bdist.macosx-10.6-intel/egg/cwltool/main.py", line 457, in main
    select_resources=selectResources
  File "build/bdist.macosx-10.6-intel/egg/cwltool/main.py", line 169, in single_job_executor
    raise workflow.WorkflowException("Process status is %s" % (final_status))
WorkflowException: Process status is ['permanentFail']

The gist linked below contains the three files (cwl, json and docker). Any sorted BAM file should work with them, but here is the BAM file I used: https://www.dropbox.com/sh/731qmwkk49tcpvx/AABY9sN4EzDHE3HXHygL7k-ya?dl=0

Gist with the three files: https://gist.github.com/FarahZKhan/af5a47c39ddc95231988

Here is the command I used to run the tool:

cwltool   --debug  --tmpdir-prefix=$(pwd)/tmpOutput/ --tmp-outdir-prefix=$(pwd)/tmpOutput/ ./markDups.cwl ./markDups.json

where tmpOutput is a directory in the current working dir.

Any help will be appreciated. Thank you.

FarahZKhan commented 8 years ago

I solved it by adding TMP_DIR=/path/to/tmp/dir/in/current/working/dir.
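
For anyone hitting the same trace, here is a minimal sketch of that fix. One plausible reading of the log is that the container runs with --read-only=true, so Picard can only create its temporary directory inside one of the mounted read-write volumes, and TMP_DIR is the Picard option that controls that location. The helper name picard_tmp_arg and the directory name picard_tmp are made up for illustration, and INPUT is shortened from the path in the log; the exact value used in the actual fix is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: make sure a writable directory exists and print the
# matching Picard TMP_DIR argument for it.
picard_tmp_arg() {
    dir="$1"
    mkdir -p "$dir" || return 1      # fail early if the directory cannot be created
    printf 'TMP_DIR=%s\n' "$dir"
}

# Assumption: a temp directory under the current working directory,
# mirroring the TMP_DIR=/path/to/tmp/dir/in/current/working/dir fix above.
arg="$(picard_tmp_arg "$(pwd)/picard_tmp")"

# Print the MarkDuplicates arguments with TMP_DIR appended (nothing is executed).
echo "MarkDuplicates INPUT=sortedSam.bam OUTPUT=markDups.bam METRICS_FILE=metricsFile-markDups $arg"
```

When the tool runs through cwltool with Docker, the path has to be valid inside the container, so a directory under the job's working directory or the mounted temp directory (/tmp/job_output and /tmp/job_tmp in the log above) is the kind of location to point TMP_DIR at.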