etal / cnvkit

Copy number variant detection from targeted DNA sequencing
http://cnvkit.readthedocs.org

ValueError in coverage subcommand #620

Open tea-kostic opened 3 years ago

tea-kostic commented 3 years ago

Hi!

We are getting the following error with CNVkit coverage subcommand:

"""
Traceback (most recent call last):
   File "/usr/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
      r = call_item.fn(*call_item.args, **call_item.kwargs)
   File "/usr/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
      return [fn(*args) for args in chunk]
   File "/usr/lib/python3.8/concurrent/futures/process.py", line 198, in <listcomp>
      return [fn(*args) for args in chunk]
   File "/usr/local/lib/python3.8/dist-packages/cnvlib/coverage.py", line 188, in _bedcov
      table = bedcov(*args)
   File "/usr/local/lib/python3.8/dist-packages/cnvlib/coverage.py", line 209, in bedcov
      raise ValueError("BED file %r chromosome names don't match any in "
ValueError: BED file 'tmp.188.us5h9hfa.bed' chromosome names don't match any in BAM file 'dc1.normal.bam'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
   File "/usr/local/bin/cnvkit.py", line 9, in <module>
      args.func(args)
   File "/usr/local/lib/python3.8/dist-packages/cnvlib/commands.py", line 462, in _cmd_coverage
      pset = coverage.do_coverage(args.interval, args.bam_file, args.count,
   File "/usr/local/lib/python3.8/dist-packages/cnvlib/coverage.py", line 27, in do_coverage
      cnarr = interval_coverages(bed_fname, bam_fname, by_count, min_mapq,
   File "/usr/local/lib/python3.8/dist-packages/cnvlib/coverage.py", line 57, in interval_coverages
      table = interval_coverages_pileup(bed_fname, bam_fname, min_mapq,
   File "/usr/local/lib/python3.8/dist-packages/cnvlib/coverage.py", line 165, in interval_coverages_pileup
      for bed_chunk_fname, table in pool.map(_bedcov, args_iter):
   File "/usr/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
      for element in iterable:
   File "/usr/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
      yield fs.pop().result()
   File "/usr/lib/python3.8/concurrent/futures/_base.py", line 432, in result
      return self.__get_result()
   File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
      raise self._exception
ValueError: BED file 'tmp.188.us5h9hfa.bed' chromosome names don't match any in BAM file 'dc1.normal.bam'

We have been using CNVkit v0.9.3 up to now and wanted to upgrade to v0.9.8. We updated the Docker image as suggested here and tested it with the same files that ran perfectly in v0.9.3, but we are getting the error described above. The command line is exactly the same as in the v0.9.3 run, and chromosome naming is consistent across all input files.

What is causing this issue and how can we resolve it?

Thank you in advance! Tea

tetedange13 commented 3 years ago

Hi @tea-kostic ,

Not an author of CNVkit, but could you please share the exact parameters you used with cnvkit.py coverage for each CNVkit release? (Or the single command, if you used the same one for both releases?)

As I understand this error, it comes from the parallel processing of the input BED, which is split (at L164) into several bed_chunks that are processed in parallel => One of these bed_chunk files (called "tmp.188.us5h9hfa.bed" in your case) seems to be problematic, causing this condition to evaluate to True => That is why you get the "chromosome names don't match any in BAM file" ValueError, even though your chromosome naming is consistent => To confirm my hunch, could you try running your command with --processes 1?
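To make the suspected failure mode concrete, here is a minimal sketch. It is not CNVkit's actual code: the function names, the chunk sizes, and the toy data (including the contig hs37d5) are assumptions for illustration only. It splits BED rows into fixed-size chunks and flags any chunk in which no chromosome name is known to the BAM.

```python
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most `chunk_size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def chunks_without_bam_contigs(bed_rows, bam_contigs, chunk_size):
    """Return indices of chunks in which no chromosome name is in the BAM."""
    bad = []
    for idx, chunk in enumerate(iter_chunks(bed_rows, chunk_size)):
        # The chromosome name is the first tab-separated BED column.
        chroms = {row.split("\t")[0] for row in chunk}
        if chroms.isdisjoint(bam_contigs):
            bad.append(idx)
    return bad


# Toy data (hypothetical): a BED that ends with rows on a contig
# that the BAM does not contain.
bam_contigs = {"1", "2", "X"}
bed_rows = [
    "1\t100\t200",
    "2\t100\t200",
    "hs37d5\t100\t200",
    "hs37d5\t300\t400",
]

# Checked in chunks of 2, the second chunk has no matching contig at all.
print(chunks_without_bam_contigs(bed_rows, bam_contigs, chunk_size=2))
# Checked as a single chunk, at least one name matches, so nothing is flagged.
print(chunks_without_bam_contigs(bed_rows, bam_contigs, chunk_size=4))
```

If the real check behaves like this, a BED that passes as a whole could still fail once it is split, which would be consistent with a run succeeding only when the file is not chunked.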

Hope this helps. Have a nice day. Felix.

tea-kostic commented 3 years ago

Hi @tetedange13 ,

This is the command we used for both releases: cnvkit.py coverage dc1.normal.bam human_g1k_v37_decoy.breakpoints.target.bed --processes 4 --output dc1.normal.targetcoverage.cnn

Running the command with --processes 1 was successful. I am wondering why this error happens with the new release but not with the previous release we used.

Thanks a lot! Tea

tetedange13 commented 3 years ago

Hi @tea-kostic ,

Yes, this could be a bug. I have further questions, if you do not mind:

tea-kostic commented 3 years ago

Hi @tetedange13 ,

The BED file we are using is human_g1k_v37_decoy.breakpoints.txt. However, we've tried some other BED files as well and the same problem occurs. We'll try debugging as you suggested and let you know what happens.

Thanks Felix! Tea

tetedange13 commented 3 years ago

Thanks a bunch for sharing your BED @tea-kostic , I could reproduce the bug on my machine

Edit:

Hope this helps ! Have a nice day. Felix.

tetedange13 commented 3 years ago

Hi @tea-kostic,

Did you see my answer? (Sorry, I edited my comment instead of posting a new one, so maybe you missed it.) => Are you still having a problem? Or was it due to "unplaced contigs" absent from your BAM but present in your BED, as I suspected?

Thanks for your answer. Kind regards. Felix.

tea-kostic commented 3 years ago

Hi @tetedange13

Sorry, I haven't seen your edited comment.

Here is the header of the BAM file:

@HD VN:1.0 SO:coordinate GO:none
@SQ SN:1 LN:249250621 AS:GRCh37 M5:1b22b98cdeb4a9304cb5d48026a85128 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:2 LN:243199373 AS:GRCh37 M5:a0d9851da00400dec1098a9255ac712e UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:3 LN:198022430 AS:GRCh37 M5:fdfd811849cc2fadebc929bb925902e5 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:4 LN:191154276 AS:GRCh37 M5:23dccd106897542ad87d2765d28a19a1 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:5 LN:180915260 AS:GRCh37 M5:0740173db9ffd264d728f32784845cd7 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:6 LN:171115067 AS:GRCh37 M5:1d3a93a248d92a729ee764823acbbc6b UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:7 LN:159138663 AS:GRCh37 M5:618366e953d6aaad97dbe4777c29375e UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:8 LN:146364022 AS:GRCh37 M5:96f514a9929e410c6651697bded59aec UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:9 LN:141213431 AS:GRCh37 M5:3e273117f15e0a400f01055d9f393768 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:10 LN:135534747 AS:GRCh37 M5:988c28e000e84c26d552359af1ea2e1d UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:11 LN:135006516 AS:GRCh37 M5:98c59049a2df285c76ffb1c6db8f8b96 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:12 LN:133851895 AS:GRCh37 M5:51851ac0e1a115847ad36449b0015864 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:13 LN:115169878 AS:GRCh37 M5:283f8d7892baa81b510a015719ca7b0b UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:14 LN:107349540 AS:GRCh37 M5:98f3cae32b2a2e9524bc19813927542e UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:15 LN:102531392 AS:GRCh37 M5:e5645a794a8238215b2cd77acb95a078 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:16 LN:90354753 AS:GRCh37 M5:fc9b1a7b42b97a864f56b348b06095e6 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:17 LN:81195210 AS:GRCh37 M5:351f64d4f4f9ddd45b35336ad97aa6de UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:18 LN:78077248 AS:GRCh37 M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:19 LN:59128983 AS:GRCh37 M5:1aacd71f30db8e561810913e0b72636d UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:20 LN:63025520 AS:GRCh37 M5:0dec9660ec1efaaf33281c0d5ea2560f UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:21 LN:48129895 AS:GRCh37 M5:2979a6085bfe28e3ad6f552f361ed74d UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:22 LN:51304566 AS:GRCh37 M5:a718acaa6135fdca8357d5bfe94211dd UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:X LN:155270560 AS:GRCh37 M5:7e0e2e580297b7764e31dbc80c2540dd UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:Y LN:59373566 AS:GRCh37 M5:1fa3474750af0948bdf97d5a0ee52e51 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:MT LN:16569 AS:GRCh37 M5:c68f52674c9fb33aef52dcf399755519 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000207.1 LN:4262 AS:GRCh37 M5:f3814841f1939d3ca19072d9e89f3fd7 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000226.1 LN:15008 AS:GRCh37 M5:1c1b2cd1fccbc0a99b6a447fa24d1504 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000229.1 LN:19913 AS:GRCh37 M5:d0f40ec87de311d8e715b52e4c7062e1 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000231.1 LN:27386 AS:GRCh37 M5:ba8882ce3a1efa2080e5d29b956568a4 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000210.1 LN:27682 AS:GRCh37 M5:851106a74238044126131ce2a8e5847c UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000239.1 LN:33824 AS:GRCh37 M5:99795f15702caec4fa1c4e15f8a29c07 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000235.1 LN:34474 AS:GRCh37 M5:118a25ca210cfbcdfb6c2ebb249f9680 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000201.1 LN:36148 AS:GRCh37 M5:dfb7e7ec60ffdcb85cb359ea28454ee9 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000247.1 LN:36422 AS:GRCh37 M5:7de00226bb7df1c57276ca6baabafd15 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000245.1 LN:36651 AS:GRCh37 M5:89bc61960f37d94abf0df2d481ada0ec UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000197.1 LN:37175 AS:GRCh37 M5:6f5efdd36643a9b8c8ccad6f2f1edc7b UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000203.1 LN:37498 AS:GRCh37 M5:96358c325fe0e70bee73436e8bb14dbd UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000246.1 LN:38154 AS:GRCh37 M5:e4afcd31912af9d9c2546acf1cb23af2 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000249.1 LN:38502 AS:GRCh37 M5:1d78abec37c15fe29a275eb08d5af236 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000196.1 LN:38914 AS:GRCh37 M5:d92206d1bb4c3b4019c43c0875c06dc0 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000248.1 LN:39786 AS:GRCh37 M5:5a8e43bec9be36c7b49c84d585107776 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000244.1 LN:39929 AS:GRCh37 M5:0996b4475f353ca98bacb756ac479140 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000238.1 LN:39939 AS:GRCh37 M5:131b1efc3270cc838686b54e7c34b17b UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000202.1 LN:40103 AS:GRCh37 M5:06cbf126247d89664a4faebad130fe9c UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000234.1 LN:40531 AS:GRCh37 M5:93f998536b61a56fd0ff47322a911d4b UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000232.1 LN:40652 AS:GRCh37 M5:3e06b6741061ad93a8587531307057d8 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000206.1 LN:41001 AS:GRCh37 M5:43f69e423533e948bfae5ce1d45bd3f1 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000240.1 LN:41933 AS:GRCh37 M5:445a86173da9f237d7bcf41c6cb8cc62 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000236.1 LN:41934 AS:GRCh37 M5:fdcd739913efa1fdc64b6c0cd7016779 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000241.1 LN:42152 AS:GRCh37 M5:ef4258cdc5a45c206cea8fc3e1d858cf UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000243.1 LN:43341 AS:GRCh37 M5:cc34279a7e353136741c9fce79bc4396 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000242.1 LN:43523 AS:GRCh37 M5:2f8694fc47576bc81b5fe9e7de0ba49e UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000230.1 LN:43691 AS:GRCh37 M5:b4eb71ee878d3706246b7c1dbef69299 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000237.1 LN:45867 AS:GRCh37 M5:e0c82e7751df73f4f6d0ed30cdc853c0 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000233.1 LN:45941 AS:GRCh37 M5:7fed60298a8d62ff808b74b6ce820001 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000204.1 LN:81310 AS:GRCh37 M5:efc49c871536fa8d79cb0a06fa739722 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000198.1 LN:90085 AS:GRCh37 M5:868e7784040da90d900d2d1b667a1383 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000208.1 LN:92689 AS:GRCh37 M5:aa81be49bf3fe63a79bdc6a6f279abf6 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000191.1 LN:106433 AS:GRCh37 M5:d75b436f50a8214ee9c2a51d30b2c2cc UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000227.1 LN:128374 AS:GRCh37 M5:a4aead23f8053f2655e468bcc6ecdceb UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000228.1 LN:129120 AS:GRCh37 M5:c5a17c97e2c1a0b6a9cc5a6b064b714f UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000214.1 LN:137718 AS:GRCh37 M5:46c2032c37f2ed899eb41c0473319a69 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000221.1 LN:155397 AS:GRCh37 M5:3238fb74ea87ae857f9c7508d315babb UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000209.1 LN:159169 AS:GRCh37 M5:f40598e2a5a6b26e84a3775e0d1e2c81 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000218.1 LN:161147 AS:GRCh37 M5:1d708b54644c26c7e01c2dad5426d38c UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000220.1 LN:161802 AS:GRCh37 M5:fc35de963c57bf7648429e6454f1c9db UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000213.1 LN:164239 AS:GRCh37 M5:9d424fdcc98866650b58f004080a992a UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000211.1 LN:166566 AS:GRCh37 M5:7daaa45c66b288847b9b32b964e623d3 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000199.1 LN:169874 AS:GRCh37 M5:569af3b73522fab4b40995ae4944e78e UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000217.1 LN:172149 AS:GRCh37 M5:6d243e18dea1945fb7f2517615b8f52e UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000216.1 LN:172294 AS:GRCh37 M5:642a232d91c486ac339263820aef7fe0 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000215.1 LN:172545 AS:GRCh37 M5:5eb3b418480ae67a997957c909375a73 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000205.1 LN:174588 AS:GRCh37 M5:d22441398d99caf673e9afb9a1908ec5 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000219.1 LN:179198 AS:GRCh37 M5:f977edd13bac459cb2ed4a5457dba1b3 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000224.1 LN:179693 AS:GRCh37 M5:d5b2fc04f6b41b212a4198a07f450e20 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000223.1 LN:180455 AS:GRCh37 M5:399dfa03bf32022ab52a846f7ca35b30 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000195.1 LN:182896 AS:GRCh37 M5:5d9ec007868d517e73543b005ba48535 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000212.1 LN:186858 AS:GRCh37 M5:563531689f3dbd691331fd6c5730a88b UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000222.1 LN:186861 AS:GRCh37 M5:6fe9abac455169f50470f5a6b01d0f59 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000200.1 LN:187035 AS:GRCh37 M5:75e4c8d17cd4addf3917d1703cacaf25 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000193.1 LN:189789 AS:GRCh37 M5:dbb6e8ece0b5de29da56601613007c2a UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000194.1 LN:191469 AS:GRCh37 M5:6ac8f815bf8e845bb3031b73f812c012 UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000225.1 LN:211173 AS:GRCh37 M5:63945c3e6962f28ffd469719a747e73c UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:GL000192.1 LN:547496 AS:GRCh37 M5:325ba9e808f669dfeee210fdd7b470ac UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Homo Sapiens
@SQ SN:NC_007605 LN:171823 AS:NC_007605.1 M5:6743bd63b3ff2b5b8985d8933c53290a UR:http://www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta SP:Epstein-Barr virus
@RG ID:C09DF.1 SM:synthetic.challenge.set1.normal LB:Solexa-76163 PU:C09DFACXX111207.1.TTGAGCCT CN:BI DT:2011-12-07T00:00:00-0500 PL:illumina
@RG ID:C09DF.2 SM:synthetic.challenge.set1.normal LB:Solexa-76163 PU:C09DFACXX111207.2.TTGAGCCT CN:BI DT:2011-12-07T00:00:00-0500 PL:illumina
@RG ID:D0EN0.4 SM:synthetic.challenge.set1.normal LB:Solexa-76163 PU:D0EN0ACXX111207.4.TTGAGCCT CN:BI DT:2011-12-07T00:00:00-0500 PL:illumina
@RG ID:D0EN0.7 SM:synthetic.challenge.set1.normal LB:Solexa-76163 PU:D0EN0ACXX111207.7.TTGAGCCT CN:BI DT:2011-12-07T00:00:00-0500 PL:illumina
@RG ID:D0EN0.8 SM:synthetic.challenge.set1.normal LB:Solexa-76163 PU:D0EN0ACXX111207.8.TTGAGCCT CN:BI DT:2011-12-07T00:00:00-0500 PL:illumina
@CO aggregation_version=1

Best regards, Tea

tetedange13 commented 3 years ago

Hi @tea-kostic ,

Thanks for sharing your BAM header => Indeed, you do have some "unplaced contigs" in your BAM, so you should not get this error... => However, as I said, it truly seems related to the absence/presence of unplaced contigs, but frankly I have run out of ideas...
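One way to test the contig hypothesis directly is to list the chromosome names that appear in the BED but not among the @SQ records of the BAM header. The following is only a sketch: the function names and the toy inputs are made up for illustration, and it assumes the header is available as tab-separated text, as printed by samtools view -H.

```python
def contigs_from_sam_header(header_text):
    """Collect the SN: values of all @SQ records, in header order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def bed_chroms_missing_from_bam(bed_lines, bam_contigs):
    """Return BED chromosome names absent from the BAM, in first-seen order."""
    known = set(bam_contigs)
    missing = []
    for line in bed_lines:
        # Skip blank lines and the optional BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split("\t", 1)[0]
        if chrom not in known and chrom not in missing:
            missing.append(chrom)
    return missing


# Toy inputs (hypothetical), not the real files from this thread.
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "@SQ\tSN:GL000207.1\tLN:4262\n"
)
bed = ["1\t0\t10", "GL000207.1\t0\t10", "hs37d5\t0\t10"]

contigs = contigs_from_sam_header(header)
print(bed_chroms_missing_from_bam(bed, contigs))
```

An empty result would mean every BED chromosome name is present in the BAM header, which would point away from a simple naming mismatch.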

This BED file: tmp-2253n2254.target.bed.txt => It was obtained by running batch on the BED you shared, then keeping only the 1st problematic chunk + the one right before it => As the default chunk_size is 5000, it has 10K rows in total => If anyone else could run cnvkit.py coverage bam_without_unplaced_contigs.bam tmp-2253n2254.target.bed.txt -o sample.target.cnn, they should get:

Hope this helps. Have a nice day. Felix.

tskir commented 3 years ago

@tetedange13 Thank you so much for looking into this, it's all really helpful!

@tea-kostic I've tried running this using various synthetic files with GL/NC and other contig names, but unfortunately I am unable to reproduce this specific issue (namely, threading-dependent failure with all contig names being correctly present and indexed in the BAM file) no matter what combination of parameters I try.

Could you please share your BAM file so that I could reproduce and fix this? If the file cannot be shared publicly, you can email the link privately to ktsukanov [at] ebi.ac.uk.