Closed danbills closed 5 years ago
Specifically looking at the following: gvcf_joint, prealign, rnaseq, somatic, svcall from https://github.com/bcbio/test_bcbio_cwl
Note that there's a version of somatic with GS inputs available in the gcp subdir which might make testing smoother for that one. I've seen prealign work ok on PAPI2 but haven't had luck on anything else.
I'm seeing the detect_sv tool in the somatic workflow fail with this error (from stderr):
[2018-11-04T19:02:19.372170Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] Failed to complete master workflow, error code: 1
[2018-11-04T19:02:19.372320Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] errorMessage:
[2018-11-04T19:02:19.373700Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] Unhandled Exception in TaskRunner-Thread-masterWorkflow
[2018-11-04T19:02:19.373750Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] Traceback (most recent call last):
[2018-11-04T19:02:19.373786Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] File "/usr/local/share/bcbio-nextgen/anaconda/share/manta-1.4.0-1/lib/python/pyflow/pyflow.py", line 1069, in run
[2018-11-04T19:02:19.373812Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] (retval, retmsg) = self._run()
[2018-11-04T19:02:19.373833Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] File "/usr/local/share/bcbio-nextgen/anaconda/share/manta-1.4.0-1/lib/python/pyflow/pyflow.py", line 1121, in _run
[2018-11-04T19:02:19.373871Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] self.workflow.workflow()
[2018-11-04T19:02:19.373894Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] File "/usr/local/share/bcbio-nextgen/anaconda/share/manta-1.4.0-1/lib/python/mantaWorkflow.py", line 895, in workflow
[2018-11-04T19:02:19.373930Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] graphTasks = runLocusGraph(self,dependencies=graphTaskDependencies)
[2018-11-04T19:02:19.373954Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] File "/usr/local/share/bcbio-nextgen/anaconda/share/manta-1.4.0-1/lib/python/mantaWorkflow.py", line 296, in runLocusGraph
[2018-11-04T19:02:19.373978Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] mergeTask = self.addTask(preJoin(taskPrefix,"mergeLocusGraph"),mergeCmd,dependencies=tmpGraphFileListTask,memMb=self.params.mergeMemMb)
[2018-11-04T19:02:19.374002Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] File "/usr/local/share/bcbio-nextgen/anaconda/share/manta-1.4.0-1/lib/python/pyflow/pyflow.py", line 3689, in addTask
[2018-11-04T19:02:19.374023Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] raise Exception("Task memory requirement exceeds full available resources")
[2018-11-04T19:02:19.374046Z] [2a8fea138cb7] [72_1] [WorkflowRunner] [ERROR] Exception: Task memory requirement exceeds full available resources
The cwl requests 4GB of memory for this task, which I verified Cromwell did request from PAPI as well:
resources:
  projectId: broad-dsde-cromwell-perf
  regions: []
  virtualMachine:
    accelerators: []
    bootDiskSizeGb: 21
    bootImage: projects/cos-cloud/global/images/family/cos-stable
    cpuPlatform: ''
    disks:
    - name: local-disk
      sizeGb: 10
      sourceImage: ''
      type: pd-ssd
    labels:
      cromwell-sub-workflow-name: wf-svcall-cwl
      cromwell-workflow-id: cromwell-0344f62e-809d-48d4-8e9a-ede11fe5dd5c
      wdl-call-alias: detect-sv
      wdl-task-name: detect-sv-cwl
    machineType: custom-2-4096
@chapmanb I was curious if you've seen this before? I'm modifying the CWL to ask for a bit more memory, but I'm wondering if there's something else that Cromwell is not doing right.
Thanks much for testing this out. I'm happy to help with whatever I can for supporting this. I haven't seen this previously and am kind of surprised that it hits memory issues. This is a tiny test dataset so I'm not sure why it hits a 4Gb limit. It shouldn't use much memory at all. The error comes from within pyflow, which is an internal workflow system manta uses for running:
I wish it told us the memory it thought the system had and what it wants so we'd have more idea of what is happening.
I don't think Cromwell is doing anything wrong here and asking for more memory would be the first thing I'd try as well. Let me know if this doesn't fix and we can try to explore more. Thanks again.
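For reference, here is a minimal Python 3 sketch of a memory check that reports both numbers, which is the diagnostic wished for above. It is not pyflow's actual code: `parse_custom_machine_type` and `check_task_memory` are hypothetical names, and the requested and visible memory figures are made up for illustration. The only outside fact relied on is that GCE custom machine types are named custom-<vCPUs>-<memory in MB>, as in the custom-2-4096 request shown earlier.

```python
import re


def parse_custom_machine_type(machine_type):
    """Split a GCE custom machine type such as 'custom-2-4096' into (vCPUs, memory MB)."""
    match = re.fullmatch(r"custom-(\d+)-(\d+)", machine_type)
    if match is None:
        raise ValueError("not a custom machine type: %r" % machine_type)
    return int(match.group(1)), int(match.group(2))


def check_task_memory(task_name, required_mb, available_mb):
    """Hypothetical stand-in for a scheduler's memory check that reports both numbers."""
    if required_mb > available_mb:
        raise RuntimeError(
            "Task %s needs %d MB but only %d MB is available"
            % (task_name, required_mb, available_mb)
        )


cores, mem_mb = parse_custom_machine_type("custom-2-4096")
# Made-up figure for illustration: the memory a process can see is
# typically somewhat less than the VM's nominal size.
visible_mb = mem_mb - 300
try:
    # Made-up request: pretend the task asks for the full nominal amount.
    check_task_memory("mergeLocusGraph", mem_mb, visible_mb)
except RuntimeError as err:
    message = str(err)
    print(message)
```

With an error message like this, it would be immediately clear whether the request or the detected system memory is the surprising number.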
Sounds good, thanks! I'll update here once I have more info.
In similar news I was able to run gvcf_joint to completion using the same inputs as in the gcp/somatic workflow (in the gs://bcbiodata/test_bcbio_cwl bucket)
Nice one, glad you're having success with the gvcf_joint workflow. That has more parts and the svcaller one was meant to be simpler, so having that going is a good indication you've got most of the Cromwell parts in place. Really nice, I'm excited about having this going on GCP. Thanks again for all the work.
@chapmanb Somatic completed successfully by bumping the memory (I doubled it to 8GB) :)
I have another question about the rnaseq pipeline if you don't mind.
I'm hitting this error on the pipeline_summary task:
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/cyvcf2/__init__.py:1: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from .cyvcf2 import (VCF, Variant, Writer, r_ as r_unphased, par_relatedness,
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/_libs/__init__.py:4: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from .tslib import iNaT, NaT, Timestamp, Timedelta, OutOfBoundsDatetime
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/__init__.py:26: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import (hashtable as _hashtable,
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/dtypes/common.py:6: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import algos, lib
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/util/hashing.py:7: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import hashing, tslib
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/indexes/base.py:7: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import (lib, index as libindex, tslib as libts,
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/tseries/offsets.py:21: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
import pandas._libs.tslibs.offsets as liboffsets
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/ops.py:16: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import algos as libalgos, ops as libops
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/indexes/interval.py:32: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs.interval import (
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/internals.py:14: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import internals as libinternals
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/sparse/array.py:33: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
import pandas._libs.sparse as splib
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/window.py:36: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
import pandas._libs.window as _window
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/groupby/groupby.py:68: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import (lib, reduction,
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/core/reshape/reshape.py:30: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import algos as _algos, reshape as _reshape
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py:45: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
import pandas._libs.parsers as parsers
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/pandas/io/pytables.py:50: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
from pandas._libs import algos, lib, writers as libwriters
/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gffutils/interface.py:161: UserWarning: It appears that this database has not had the ANALYZE sqlite3 command run on it. Doing so can dramatically speed up queries, and is done by default for databases created with gffutils >0.8.7.1 (this database was created with version 0.8.2) Consider calling the analyze() method of this object.
"method of this object." % self.version)
Traceback (most recent call last):
File "/usr/local/bin/bcbio_nextgen.py", line 223, in <module>
runfn.process(kwargs["args"])
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/runfn.py", line 58, in process
out = fn(fnargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 52, in wrapper
return apply(f, *args, **kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 208, in pipeline_summary
return qcsummary.pipeline_summary(*args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/qcsummary.py", line 70, in pipeline_summary
data["summary"] = _run_qc_tools(work_bam, work_data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/qcsummary.py", line 162, in _run_qc_tools
out = qc_fn(bam_file, data, cur_qc_dir)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/qc/qualimap.py", line 347, in run_rnaseq
metrics = _parse_metrics(metrics)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/qc/qualimap.py", line 210, in _parse_metrics
out.update({name: float(metrics[name])})
TypeError: float() argument must be a string or a number
This is what the command Cromwell generated looks like:
'bcbio_nextgen.py' 'runfn' 'pipeline_summary' 'cwl' 'sentinel_runtime=cores,2,ram,4096' 'sentinel_parallel=multi-parallel' 'sentinel_outputs=qcout_rec:summary__qc;summary__metrics;resources;description;reference__fasta__base;config__algorithm__coverage_interval;genome_build;genome_resources__rnaseq__transcripts;config__algorithm__tools_off;config__algorithm__qc;analysis;config__algorithm__tools_on;align_bam' 'sentinel_inputs=qc_rec:record' 'run_number=0'
And the cwl.inputs.json:
{
"qc_rec": {
"genome_build": "hg19",
"config__algorithm__tools_on": [],
"align_bam": {
"nameext": ".bam",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/call-process_alignment/shard-0/align/Test1/Test1-sort.bam",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/call-process_alignment/shard-0/align/Test1/Test1-sort.bam",
"size": 4028452,
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/call-process_alignment/shard-0/align/Test1",
"secondaryFiles": [
{
"nameext": ".bai",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/call-process_alignment/shard-0/align/Test1/Test1-sort.bam.bai",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/call-process_alignment/shard-0/align/Test1/Test1-sort.bam.bai",
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/call-process_alignment/shard-0/align/Test1",
"secondaryFiles": [],
"basename": "Test1-sort.bam.bai",
"class": "File",
"nameroot": "Test1-sort.bam"
}
],
"basename": "Test1-sort.bam",
"class": "File",
"nameroot": "Test1-sort"
},
"description": "Test1",
"config__algorithm__tools_off": [],
"genome_resources__rnaseq__transcripts": {
"nameext": ".gtf",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/rnaseq/ref-transcripts.gtf",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/rnaseq/ref-transcripts.gtf",
"size": 15149,
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/rnaseq",
"secondaryFiles": [
{
"nameext": ".db",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/rnaseq/ref-transcripts.gtf.db",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/rnaseq/ref-transcripts.gtf.db",
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/rnaseq",
"secondaryFiles": [],
"basename": "ref-transcripts.gtf.db",
"class": "File",
"nameroot": "ref-transcripts.gtf"
}
],
"basename": "ref-transcripts.gtf",
"class": "File",
"nameroot": "ref-transcripts"
},
"reference__fasta__base": {
"nameext": ".fa",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa",
"size": 37196,
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq",
"secondaryFiles": [
{
"nameext": ".fai",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.fa.fai",
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq",
"secondaryFiles": [],
"basename": "hg19.fa.fai",
"class": "File",
"nameroot": "hg19.fa"
},
{
"nameext": ".dict",
"location": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict",
"path": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq/hg19.dict",
"dirname": "/cromwell_root/tj-bcbio-papi/main-rnaseq.cwl/6c75cc7c-5515-45e0-9e5b-9a1b9e6fd2e1/call-qc_to_rec/bcbiodata/test_bcbio_cwl/testdata/genomes/hg19/seq",
"secondaryFiles": [],
"basename": "hg19.dict",
"class": "File",
"nameroot": "hg19"
}
],
"basename": "hg19.fa",
"class": "File",
"nameroot": "hg19"
},
"analysis": "RNA-seq",
"resources": "{\"default\":{\"cores\":1,\"jvm_opts\":[\"-Xms1000m\",\"-Xmx2048m\"],\"memory\":\"2048M\"}}",
"config__algorithm__qc": [
"qualimap_rnaseq"
],
"config__algorithm__coverage_interval": null
}
}
The only thing maybe off that I see is the config__algorithm__coverage_interval (at the bottom of the json) being null? Is this something that you'd expect not to be null and could throw off the tool?
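A quick way to check whether any other fields in an inputs record are null is to walk the parsed JSON. The sketch below is only an illustration: `find_nulls` is a hypothetical helper, and the embedded JSON is a trimmed-down stand-in for the record above rather than the full file.

```python
import json


def find_nulls(value, path=""):
    """Return the paths of all null (None) values in parsed JSON."""
    found = []
    if value is None:
        found.append(path)
    elif isinstance(value, dict):
        for key, child in value.items():
            found.extend(find_nulls(child, path + "/" + key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            found.extend(find_nulls(child, "%s/%d" % (path, index)))
    return found


# Trimmed-down stand-in for the record above; only a few keys are kept.
inputs = json.loads("""
{
  "qc_rec": {
    "genome_build": "hg19",
    "config__algorithm__qc": ["qualimap_rnaseq"],
    "config__algorithm__coverage_interval": null
  }
}
""")
null_paths = find_nulls(inputs)
print(null_paths)
```

To run it against a real file, the `json.loads(...)` call could be swapped for `json.load` on an open handle to cwl.inputs.json.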
Sorry about this. That's a bug in the qualimap parsing in bcbio that we've fixed (https://github.com/bcbio/bcbio-nextgen/commit/e15f787f984da3e5d727733f2a1d7c58c50c6be0), but the fix hasn't yet been rolled into the Docker container. We're planning a release tomorrow, so I can push a new Docker container as well, which should fix the problem.
So I don't think this is a Cromwell issue but a bug on the bcbio side and if other workflows are good I'd skip it for now. Thanks again for all this testing.
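For illustration, the kind of failure at the bottom of the traceback (calling float() on a value that is neither a string nor a number, for example None) can be guarded against as sketched below. This shows the general pattern only and is not the change made in the linked commit, whose contents are not shown in this thread; `parse_metrics_safely` and the sample metric names are hypothetical.

```python
def parse_metrics_safely(metrics):
    """Convert metric values to float, skipping ones that are missing or non-numeric."""
    out = {}
    for name, value in metrics.items():
        try:
            out[name] = float(value)
        except (TypeError, ValueError):
            # float(None) raises TypeError; float("NA") raises ValueError.
            continue
    return out


# Hypothetical metrics: one value is None, as if a field were absent from the report.
raw = {"Mapped reads": "1234", "Exonic Rate": None, "Bias": "0.75"}
parsed = parse_metrics_safely(raw)
print(parsed)
```

Skipping unparseable values keeps the summary step running; whether a missing metric should instead be reported as an error is a design choice for the pipeline.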
No worries, thanks for the update, I'll skip this workflow for now then :)
@rebrown1395 this isn't done?
Need link to 3 specific workflows @geoffjentry . Belongs in different Q4 Milestone @ruchim