hammerlab / biokepi

Bioinformatics Ketrew Pipelines
Apache License 2.0

Add support for cleaning a BAM via Picard's CleanSam #441

Closed armish closed 7 years ago

armish commented 7 years ago

Picard's CleanSam would be a useful workaround for the following error, which SamToFastq throws on badly constructed BAM files:

Exception in thread "main" htsjdk.samtools.SAMFormatException: SAM validation error: ERROR: Read name HWI-D00419_209:1:1102:1640:75904, Read CIGAR M operator maps off end of reference
    at htsjdk.samtools.SAMUtils.processValidationErrors(SAMUtils.java:439)
    at htsjdk.samtools.BAMRecord.getCigar(BAMRecord.java:247)
    at htsjdk.samtools.SAMRecord.getAlignmentEnd(SAMRecord.java:460)
    at htsjdk.samtools.SAMRecord.computeIndexingBin(SAMRecord.java:1235)
    at htsjdk.samtools.SAMRecord.isValid(SAMRecord.java:1643)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:642)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:628)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:598)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:515)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:489)
    at picard.sam.SamToFastq.doWork(SamToFastq.java:158)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:187)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:89)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:99)
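
For reference, CleanSam soft-clips alignments that extend past the end of the reference and sets MAPQ to 0 for unmapped reads, which is why it should get a BAM like this one past the validation error above. Below is a minimal sketch of the cleaning step as a standalone Python wrapper, not the Biokepi implementation. The function names, the `picard.jar` path, and the BAM file names are all assumptions made for illustration; only the `CleanSam` tool name and its `INPUT=`/`OUTPUT=` arguments (Picard's classic `KEY=VALUE` syntax) come from Picard itself.

```python
import subprocess
from typing import List


def clean_sam_command(input_bam: str, output_bam: str,
                      picard_jar: str = "picard.jar") -> List[str]:
    """Build the argument list for a Picard CleanSam run.

    The jar location is an assumption; point it at your own Picard install.
    """
    return [
        "java", "-jar", picard_jar, "CleanSam",
        f"INPUT={input_bam}",    # BAM that fails validation
        f"OUTPUT={output_bam}",  # cleaned BAM to pass on to SamToFastq
    ]


def clean_bam(input_bam: str, output_bam: str,
              picard_jar: str = "picard.jar") -> None:
    """Run CleanSam and raise CalledProcessError if Picard exits non-zero."""
    subprocess.run(clean_sam_command(input_bam, output_bam, picard_jar),
                   check=True)
```

Calling `clean_bam("sample.bam", "sample.clean.bam")` (hypothetical file names) would write the cleaned file, which could then be given to SamToFastq in place of the original. Newer Picard releases may prefer a different argument syntax, so check the version in use.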