bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

raise IOError("Missing CNVkit %s file: %s" % (ftype, files[ftype])) #915

Closed pengxiao78 closed 9 years ago

pengxiao78 commented 9 years ago

When I run CNV structural variant calling in a tumor-normal paired analysis, the error in the title appears. The missing file is reported as: IOError: Missing CNVkit cnr file: /.../filename-sort.cnr Could you help me figure it out? Thank you.

chapmanb commented 9 years ago

It looks like CNVkit failed for some reason, although I'm not able to provide any more suggestions without more information. This is a post-CNVkit run sanity check that the expected files were produced.
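For readers unfamiliar with this kind of check, here is a minimal sketch of a post-run sanity check that raises the error from the title. The function name `check_cnvkit_outputs` and the default file types are hypothetical, chosen for illustration; this is not a copy of bcbio's actual code.

```python
import os


def check_cnvkit_outputs(files, ftypes=("cnr", "cns")):
    """Raise IOError if an expected CNVkit output file is missing.

    `files` maps a file type (for example "cnr") to its expected path.
    The function name and the default `ftypes` are illustrative assumptions.
    """
    for ftype in ftypes:
        if not os.path.exists(files[ftype]):
            raise IOError("Missing CNVkit %s file: %s" % (ftype, files[ftype]))
    return files
```

A check like this only reports that an output is absent; the reason CNVkit did not write it has to come from the logs or from re-running the command.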

If you could post more of log/bcbio-nextgen-debug.log or provide more details about what is in the CNVkit directory that is failing we might be able to make a better guess.

Sorry about the problems and hope this helps some.

pengxiao78 commented 9 years ago

Hi Brad,

I checked the /../work/structural/02-FFPE-tumor/cnvkit/ directory and found the following files:

GRCh37-access.bed
15_2015-06-17_name_1-10-sort-callable-callableblocks-merged-annotated.bed
15_2015-06-17_name_1-10-sort-callable-callableblocks-merged.bed.gz.tbi
15_2015-06-17_name_1-10-sort-callable-callableblocks-merged.bed.gz
15_2015-06-17_name_1-10-sort-callable-callableblocks-merged.bed
“raw” (folder)

Under the “raw” folder, there are:

02-normal_background.cnn
16_2015-06-17_name_1-10-sort.targetcoverage.cnn
16_2015-06-17_name_1-10-sort.antitargetcoverage.cnn
15_2015-06-17_name_1-10-sort-callable-callableblocks-merged-annotated.target.bed
15_2015-06-17_name_1-10-sort-callable-callableblocks-merged-annotated.antitarget.bed
15_2015-06-17_name_1-10-sort.targetcoverage.cnn
15_2015-06-17_name_1-10-sort-antitargetcoverage.cnn

The following are more detailed lines from the bcbio-nextgen.log file. For HIPAA compliance, I masked the directories and renamed the batches and files.

[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: ipython: split_variants_by_sample
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: Timing: prepped BAM merging
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: ipython: delayed_bam_merge
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: Timing: validation
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: ipython: compare_to_rm
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: Timing: ensemble calling
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: ipython: combine_calls
[2015-06-30T08:03Z] c2619.tusker.hcc.unl.edu: Ensemble consensus calls for batch01: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2613.tusker.hcc.unl.edu: Ensemble consensus calls for batch02: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2619.tusker.hcc.unl.edu: Ensemble consensus calls for batch03: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2613.tusker.hcc.unl.edu: Ensemble consensus calls for batch04: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2619.tusker.hcc.unl.edu: Ensemble consensus calls for batch05: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2613.tusker.hcc.unl.edu: Ensemble consensus calls for batch06: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2619.tusker.hcc.unl.edu: Ensemble consensus calls for batch07: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2613.tusker.hcc.unl.edu: Ensemble consensus calls for batch08: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2613.tusker.hcc.unl.edu: Ensemble consensus calls for batch09: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2619.tusker.hcc.unl.edu: Ensemble consensus calls for batch10: mutect,freebayes,vardict,varscan
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: Timing: validation summary
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: Timing: structural variation
[2015-06-30T08:03Z] c2305.tusker.hcc.unl.edu: ipython: detect_sv
[2015-06-30T11:08Z] c2613.tusker.hcc.unl.edu: Unexpected error
Traceback (most recent call last):
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/ipythontasks.py", line 39, in _setup_logging
    yield config
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/ipythontasks.py", line 249, in detect_sv
    return ipython.zip_args(apply(structural.detect_sv, *args))
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/structural/__init__.py", line 126, in detect_sv
    for svdata in _BATCH_CALLERSsvcaller:
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/structural/cnvkit.py", line 30, in run
    return _cnvkit_by_type(items, background)
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/structural/cnvkit.py", line 42, in _cnvkit_by_type
    return _run_cnvkit_cancer(items, background)
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/structural/cnvkit.py", line 90, in _run_cnvkit_cancer
    access_file, work_dir, background_name=paired.normal_name)
  File "/.../bcbio/anaconda/lib/python2.7/site-packages/bcbio/structural/cnvkit.py", line 172, in _run_cnvkit_shared
    raise IOError("Missing CNVkit %s file: %s" % (ftype, files[ftype]))
IOError: Missing CNVkit cnr file: /.../work/structural/02-FFPE-tumor/cnvkit/raw/9_2015-06-17_name_1-10-sort.cnr

Thank you!

Peng

chapmanb commented 9 years ago

Peng; Thanks for the detailed debugging information. It looks like CNVkit ran at least partially since you have some of the output files, but failed to produce the final cnr and cns files for some reason. Without an error message from CNVkit I'm not totally sure what to suggest. Your best bet would be to look in log/bcbio-nextgen-commands.log for the cnvkit run related to this sample. If you run that by hand, does it provide correct output or a useful error message to help debug? Beyond that, does the sample itself look okay if you look at alignment numbers in the BAM file?
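One way to pull the relevant command out of log/bcbio-nextgen-commands.log is sketched below. It assumes each logged command sits on a single line that contains both "cnvkit" and the sample's file name; the helper name `find_cnvkit_commands` is hypothetical, and the exact log layout may differ between bcbio versions.

```python
def find_cnvkit_commands(log_lines, sample):
    """Return log lines that mention both cnvkit and the given sample name.

    `log_lines` is any iterable of strings, such as an open file handle.
    Assumes one command per line, which may not hold for every log.
    """
    return [
        line.rstrip("\n")
        for line in log_lines
        if "cnvkit" in line.lower() and sample in line
    ]
```

To use it, open the commands log and pass the file handle together with the sample name as it appears in the file names; any line it returns can then be re-run by hand to look for an error message.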

Sorry to not have a definite answer but hope this helps.

etal commented 9 years ago

Sorry I didn't see this earlier. On some systems there seems to be a multiprocessing bug in CNVkit's "batch" command where the child processes fail to sync, and/or segmentation fails but the "batch" command continues and pretends to succeed, yet doesn't generate a .cns file, as we see here. I can't replicate this on Ubuntu platforms to diagnose and fix it myself, unfortunately. Also unhelpfully, the "batch" command in multiprocessing mode seems to eat the child process logging messages.

I've been recommending that users who see multiprocessing problems run CNVkit in serial mode (-p 1, instead of nproc). In bcbio-nextgen, for the short term it may be best to run cnvkit.py batch -p 1 ... even if multiple cores were specified in the configuration. Meanwhile I can look into having bcbio-nextgen call the CNVkit pipeline's sub-commands directly and parallelize with IPython.parallel instead of using the batch command, and/or write CommonWL tool descriptions (etal/cnvkit#39) to use similarly.
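As a rough illustration of the short-term workaround, the sketch below rewrites a `cnvkit.py batch` argument list so it always runs with `-p 1`. The helper name `force_serial` and the placeholder input file are hypothetical; this is not how bcbio-nextgen builds the command.

```python
def force_serial(cmd):
    """Return a copy of a `cnvkit.py batch` argument list that uses `-p 1`.

    Any existing `-p` flag (with or without a numeric value) is dropped,
    and `-p 1` is placed directly after the `batch` sub-command.
    """
    if "batch" not in cmd:
        raise ValueError("expected a `cnvkit.py batch` command")
    out = []
    i = 0
    while i < len(cmd):
        if cmd[i] == "-p":
            # Skip the flag, plus its value when a number follows it.
            has_value = i + 1 < len(cmd) and cmd[i + 1].isdigit()
            i += 2 if has_value else 1
            continue
        out.append(cmd[i])
        i += 1
    pos = out.index("batch") + 1
    return out[:pos] + ["-p", "1"] + out[pos:]
```

For example, `force_serial(["cnvkit.py", "batch", "-p", "16", "tumor.bam"])` gives the same list with `16` replaced by `1`, where `tumor.bam` stands in for whatever arguments the real command carries.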

chapmanb commented 9 years ago

Eric -- thanks for connecting these dots for me. Apologies, I remember reading about the issue but hadn't applied it here. Peng -- I pushed fixes that will run CNVkit in single core mode until we can resolve the issue with multicore. If you update to the latest development version (bcbio_nextgen.py upgrade -u development) and re-run, it'll hopefully finish cleanly. Thanks much.