Closed ionox0 closed 6 years ago
@ionox0 If stderr output is important then I'd make that a separate output file
outputs:
log: stderr
bams:
type: File[]
outputBinding:
glob: '*_IR.bam'
Additionally you have many hard coded paths, which is fine to get started, but will need to be removed to make this CWL description portable :-)
Toil v3.15.0 was just released; I wonder if that helps?
@mr-c Yes you're right, that method does give me a new log file with the error logs from the tool, however this file also skips over the initial logging statements that I was expecting, and has the same logs as are logged in the console. I'll try upgrading to the new version of Toil, and get back with any changes, thanks again for the help!
log output file:
luna.cbio.private 2018-03-28 11:24:42,593 Thread-323 DEBUG toil.statsAndLogging: Received Toil worker log. Disable debug level logging to hide this output
2018-03-28 11:24:42,593 - toil.statsAndLogging - DEBUG - Received Toil worker log. Disable debug level logging to hide this output
luna.cbio.private 2018-03-28 11:24:42,593 Thread-323 DEBUG toil.statsAndLogging: 'Innovation-Pipeline/cwl_tools/abra/abra__debub.cwl' /opt/common/CentOS_6/java/jdk1.8.0_25/bin/java B/H/jobZGC5L8 ativeAssembler Region: 14_105240223_105240343 Length: 120 ReadCount: 0 Elapsed 5 Assembled false 43
20
...
logged to console:
luna.cbio.private 2018-03-28 11:24:42,593 Thread-323 DEBUG toil.statsAndLogging: Received Toil worker log. Disable debug level logging to hide this output
2018-03-28 11:24:42,593 - toil.statsAndLogging - DEBUG - Received Toil worker log. Disable debug level logging to hide this output
luna.cbio.private 2018-03-28 11:24:42,593 Thread-323 DEBUG toil.statsAndLogging: 'Innovation-Pipeline/cwl_tools/abra/abra__debub.cwl' /opt/common/CentOS_6/java/jdk1.8.0_25/bin/java B/H/jobZGC5L8 ativeAssembler Region: 14_105240223_105240343 Length: 120 ReadCount: 0 Elapsed 5 Assembled false 43
2018-03-28 11:24:42,593 - toil.statsAndLogging - DEBUG - 'Innovation-Pipeline/cwl_tools/abra/abra__debub.cwl' /bin/java B/H/jobZGC5L8 ativeAssembler Region: 14_105240223_105240343 Length: 120 ReadCount: 0 Elapsed 5 Assembled false 43
^ Note how it begins right in the middle of the word "ativeAssembler Region:", which is somehow getting cut off and should say "Elapsed_msecs_in_NativeAssembler Region:"
Here are the lines that I would expect to see, as seen with GATK's FindCoveredIntervals tool:
Received Toil worker log. Disable debug level logging to hide this output
Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD DEBUG:toil.worker:Next available file descriptor: 5
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO:toil:Running Toil version 3.14.0-b91dbf9bf6116879952f0a70f9a2fbbcae7e51b6.
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl'
/bin/java Z/J/job271AFD DEBUG:toil:Configuration: {'rescueJobsFrequency': 3600, 'logLevel': 'DEBUG', 'minNodes': None, 'maxMemory': 9223372036854775807, 'jobStore': 'file:/ifs/work/bergerm1/Innovation/sandbox/ian/DY_03-26_55/tmp/jobstore-ff864d84-3137-11e8-a6bf-645106ef9e4c', 'disableHotDeployment': False, 'linkImports': True, 'nodeOptions': None, 'nodeTypes': [], 'servicePollingInterval': 60, 'workDir': 'outputFolder/tmp', 'stats': True, 'disableCaching': True, 'maxPreemptableServiceJobs': 9223372036854775807, 'environment': {}, 'parasolMaxBatches': 10000, 'cleanWorkDir': 'onSuccess', 'maxCores': 9223372036854775807, 'sseKey': None, 'maxDisk': 9223372036854775807, 'scaleInterval': 30, 'deadlockWait': 60, 'defaultPreemptable': False, 'clusterStats': None, 'defaultCores': 1, 'cseKey': None, 'betaInertia': 1.2, 'metrics': False, 'maxNodes': [10], 'scale': 1, 'writeLogs': 'outputFolder/log', 'badWorker': 0.0, 'defaultDisk': 10737418240, 'mesosMasterAddress': '140.163.153.177:5050', 'restart': False, 'useAsync': True, 'preemptableCompensation': 0.0, 'parasolCommand': 'parasol', 'workflowID': 'c793845f-0454-4e8c-9478-a1e174148e3e', 'alphaPacking': 0.8, 'maxServiceJobs': 9223372036854775807, 'readGlobalFileMutableByDefault': False, 'badWorkerFailInterval': 0.01, 'statePollingWait': 1, 'debugWorker': False, 'maxLogFileSize': 64000, 'defaultMemory': 10737418240, 'workflowAttemptNumber': 0, 'maxJobDuration': 9223372036854775807, 'clean': 'never', 'provisioner': None, 'batchSystem': 'lsf', 'retryCount': 1, 'writeLogsGzip': None, 'nodeStorage': 50}
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD DEBUG:toil.worker:Parsed job wrapper
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD DEBUG:toil.worker:Got a command to run: _toil w/W/jobdOspz7/g/tmpTgo3Hs.tmp virtualenv/lib/python2.7/site-packages toil.cwl.cwltoil True
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD DEBUG:toil.job:Loading user module ModuleDescriptor(dirPath='virtualenv/lib/python2.7/site-packages', name='toil.cwl.cwltoil', fromVirtualEnv=True).
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD DEBUG:rdflib:RDFLib Version: 4.2.2
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl'
/bin/java Z/J/job271AFD [job FindCoveredIntervals.cwl] outputFolder/out_tmpdirRg6Xzd$ /bin/sh \
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD -c \
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD '/bin/java' '-Xmx20g' '-Djava.io.tmpdir=/scratch' '-jar' 'GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar' '-T' 'FindCoveredIntervals' '--input_file' 'outputFolder/DY_03-26_55/tmpTGy4T2/stg5230694e-4c6a-4fc6-8943-b48408b6bbd3/MSK-L-009-bc_IGO_05500_DY_6_S3_aln_RG_MD.bam' '--reference_sequence' '/data/pubdata/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta' '--input_file' 'outputFolder/tmpTGy4T2/stgbb0fa913-67ab-4736-83a5-a33f224912ca/MSK-L-009-cf_IGO_05500_DY_23_S22_aln_RG_MD.bam' '--coverage_threshold' '20' '--minBaseQuality' '20' '--minMappingQuality' '20' '--out' 'MSK-L-009.fci.list' --read_filter FailsVendorQualityCheck -rf BadMate -rf UnmappedRead -rf BadCigar
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' bin/java Z/J/job271AFD INFO:cwltool:[job FindCoveredIntervals.cwl] outputFolder/out_tmpdirRg6Xzd$ /bin/sh \
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD -c \
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD '/bin/java' '-Xmx20g' '-Djava.io.tmpdir=/scratch' '-jar' '/GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar' '-T' 'FindCoveredIntervals' '--input_file' 'outputFolder/tmpTGy4T2/stg5230694e-4c6a-4fc6-8943-b48408b6bbd3/MSK-L-009-bc_IGO_05500_DY_6_S3_aln_RG_MD.bam' '--reference_sequence' 'data/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta' '--input_file' 'data/tmpTGy4T2/stgbb0fa913-67ab-4736-83a5-a33f224912ca/MSK-L-009-cf_IGO_05500_DY_23_S22_aln_RG_MD.bam' '--coverage_threshold' '20' '--minBaseQuality' '20' '--minMappingQuality' '20' '--out' 'MSK-L-009.fci.list' --read_filter FailsVendorQualityCheck -rf BadMate -rf UnmappedRead -rf BadCigar
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl'
/opt/common/CentOS_6/java/jdk1.8.0_25/bin/java Z/J/job271AFD INFO 03:52:33,052 HelpFormatter - --------------------------------------------------------------------------------
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,059 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.3-0-g37228af, Compiled 2014/10/24 01:07:22
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,060 HelpFormatter - Copyright (c) 2010 The Broad Institute
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,064 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,070 HelpFormatter - Program Args: -T FindCoveredIntervals --input_file outputFolder/tmpTGy4T2/stg5230694e-4c6a-4fc6-8943-b48408b6bbd3/MSK-L-009-bc_IGO_05500_DY_6_S3_aln_RG_MD.bam --reference_sequence data/hg-fasta/VERSIONS/hg19/Homo_sapiens_assembly19.fasta --input_file outputFolder/tmpTGy4T2/stgbb0fa913-67ab-4736-83a5-a33f224912ca/MSK-L-009-cf_IGO_05500_DY_23_S22_aln_RG_MD.bam --coverage_threshold 20 --minBaseQuality 20 --minMappingQuality 20 --out MSK-L-009.fci.list --read_filter FailsVendorQualityCheck -rf BadMate -rf UnmappedRead -rf BadCigar
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,076 HelpFormatter - Executing as johnsoni@s20.cbio.private on Linux 2.6.32-696.3.2.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_25-b17.
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,076 HelpFormatter - Date/Time: 2018/03/27 03:52:33
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,077 HelpFormatter - --------------------------------------------------------------------------------
'Innovation-Pipeline/cwl_tools/gatk/FindCoveredIntervals.cwl' /bin/java Z/J/job271AFD INFO 03:52:33,078 HelpFormatter - --------------------------------------------------------------------------------
...
I have not been able to reproduce this issue, but it has not occurred since upgrading to Toil 3.15.0
It may still be caused by a threading issue, but can close this until I see it again
I realize now that this was being caused by the --maxLogFileSize
default of 64000 (logs were being truncate to last 128,000) bytes, for some reason 2X the size of this param. Thank you
Hello Toil team,
This is a follow up from https://github.com/common-workflow-language/cwltool/issues/703
I'm still seeing this issue after upgrading
cwltool
to1.0.20180326152342
The issue is that the logs from this abra tool do not always get logged into the cwltoil.log file. Specifically I'm missing the beginning of the logs.
I linked to the above^ issue to point out the fact that the 0.92 version of the abra tool which we are currently using logs to standard error not stdout, which may be relevant.
This is the section of the logs that is missing (which I generated by using
echo
as the baseCommand instead of the abra jar): https://gist.github.com/ionox0/d79acf464f7a2497ccfd270aecd8c5ebVersions:
cwltool version: 1.0.20180326152342 Toil version: 3.14.0
Script used to run
toil-cwl-runner
inputs.yaml
abra.cwl