Genomicus / bedtools

Automatically exported from code.google.com/p/bedtools

bamToBed cannot convert spliced alignments #4

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Try to convert a BAM file that contains, e.g., the following line:
HWI-EAS225:6:114:1253:343#0	16	10	3185842	255	56M6158N19M	*	0	0	CACGGATCGGCCACCAAGAGCTGGTGTTGGAGGCTGTGGATCTTCTCTGTGCACTGAATTATGGCCTTGAAACTG	VX[_VSS^U[[^N__YW[____a__a[_````aaa__a_a_``aaT`V]aa`_aXaaaaaaaabaaabbaa`baa	NM:i:0	XS:A:+	NS:i:0

This is standard .sam/.bam format. This alignment contains a large
skipped region (a splice junction), as indicated by the "N" operation in
the CIGAR string.

What is the expected output? What do you see instead?

I would expect a single .bed line with the start/end coordinates of the
spliced read.

Instead, the following error message is produced:

terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr
Aborted

What version of the product are you using? On what operating system?
Linux 64 bit, Bedtools version 2.0.1

Please provide any additional information below.
Thank you for developing this nice toolkit!

Original issue reported on code.google.com by tomsi...@gmail.com on 31 Jan 2010 at 2:34
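
For reference, the BED interval expected for the read above can be worked out from the POS and CIGAR fields alone. The following is a minimal Python sketch, not the bamToBed implementation; the function names `reference_span` and `sam_to_bed` are hypothetical. It assumes that the M, D, N, = and X operations advance the reference position (as in the SAM specification), that SAM POS is 1-based, and that BED intervals are 0-based with an exclusive end.

```python
import re

# CIGAR operations that advance the position on the reference (per the SAM spec).
REF_CONSUMING = set("MDN=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Return the number of reference bases covered by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def sam_to_bed(sam_line):
    """Convert one SAM record to a single BED6 line spanning the whole alignment."""
    fields = sam_line.rstrip("\n").split("\t")
    name, flag, chrom, pos, mapq, cigar = fields[:6]
    start = int(pos) - 1                      # SAM is 1-based, BED is 0-based
    end = start + reference_span(cigar)       # BED end is exclusive
    strand = "-" if int(flag) & 16 else "+"   # flag bit 0x10 marks the reverse strand
    return "\t".join([chrom, str(start), str(end), name, mapq, strand])


# First nine fields of the record from this report (SEQ, QUAL and tags omitted).
record = "\t".join([
    "HWI-EAS225:6:114:1253:343#0", "16", "10", "3185842", "255", "56M6158N19M",
    "*", "0", "0",
])
print(sam_to_bed(record))
```

Under these assumptions the read spans 56 + 6158 + 19 = 6233 reference bases, so the resulting interval on chromosome 10 is 3185841 to 3192074 on the minus strand.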

GoogleCodeExporter commented 9 years ago
Thanks for bringing this up.  It appears to be a potential misunderstanding of
the SAM spec for "N" edits.  The author of the BamTools API (which BEDTools
uses for BAM support) is looking into this.  Hopefully a resolution will be
available soon.

Thanks so much,
Aaron Quinlan

Original comment by aaronqui...@gmail.com on 31 Jan 2010 at 9:06

GoogleCodeExporter commented 9 years ago
This was indeed a misunderstanding related to the meaning of the "N" operation
in the CIGAR string.  Version 2.5.2 has been posted and should address your
problem.

Thanks again for pointing this out.
Aaron

Original comment by aaronqui...@gmail.com on 3 Feb 2010 at 2:27
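
To illustrate the meaning of the "N" operation discussed in this thread: in the SAM specification, N marks a region that is skipped on the reference (for example an intron in a spliced read) and contributes no bases from the read itself. The Python sketch below splits an alignment into its aligned reference blocks at each N. It is an illustration only and does not describe how BamTools or BEDTools was actually fixed; the function name `aligned_blocks` is hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 0-based, half-open (start, end) reference blocks, split at N operations."""
    blocks = []
    cursor = pos - 1          # convert 1-based SAM POS to a 0-based offset
    block_start = cursor
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Skipped reference region: close the current block, then jump the gap.
            if cursor > block_start:
                blocks.append((block_start, cursor))
            cursor += length
            block_start = cursor
        elif op in "MD=X":
            # These operations consume reference bases inside a block.
            cursor += length
        # I, S, H and P do not advance the reference position.
    if cursor > block_start:
        blocks.append((block_start, cursor))
    return blocks


print(aligned_blocks(3185842, "56M6158N19M"))
```

For the read in this report, the sketch yields two blocks, 3185841 to 3185897 and 3192055 to 3192074, separated by the 6158-base skipped region.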