Closed salpie closed 2 years ago
Thanks for the bug report. Can you try calling your data around this region (add -T chr1:9,713,165-9,721,858)? If you get the same error, would you be able to provide a BAM subsetted to this region?
Thanks for the swift response. Sadly, I get the same error. Here's an edited region (I've changed the bases, and it's in SAM rather than BAM format; hope that's okay): output.sam.zip
How was this data generated? The problem is caused by read pairs with duplicate names:
samtools view output.bam | grep "A00176:462:H23HNDMXY:2:2180:29324:1235"
A00176:462:H23HNDMXY:2:2180:29324:1235 97 chr1 9717513 60 49M chr12 107518128 0 AAGAAATGATTAAAAGGAAAAAAGAATTAATATGTGTAAATGTGGTAAA FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFFFF NM:i:0 MD:Z:49 MA:Z:49M AS:i:49 XS:i:19 RG:Z:sample1
A00176:462:H23HNDMXY:2:2180:29324:1235 97 chr1 9717513 60 49M chr2 31183704 0 AAGAAATGATTAAAAGGAAAAAAGAATTAATATGTGTAAATGTGGTAAA FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:49 MA:Z:49M AS:i:49 XS:i:19 RG:Z:sample1
Read names in an Illumina library should be unique.
Ah yes, duplicate read names were an issue we found in the FASTQs to begin with (likely a genome centre bcl2fastq error). We have been removing them, but it appears one escaped. Thank you for finding out what was wrong!
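For anyone hitting the same problem, here is a minimal sketch of how such records could be found before calling. It is not part of Octopus; the function name find_duplicate_mates and the demo records are hypothetical. It assumes tab-separated SAM text as produced by samtools view, and it assumes that a read name appearing more than once as the same mate among primary alignments indicates a duplicate name.

```python
from collections import Counter

# Bits of the SAM FLAG field (values from the SAM specification).
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def find_duplicate_mates(sam_lines):
    """Return {(read_name, mate): count} for primary records seen more than once.

    mate is 1 (first in pair), 2 (second in pair), or 0 (neither bit set).
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Secondary and supplementary alignments legitimately repeat a name.
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if flag & FIRST_IN_PAIR:
            mate = 1
        elif flag & SECOND_IN_PAIR:
            mate = 2
        else:
            mate = 0
        counts[(name, mate)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Demo with two shortened, made-up records that share a name and flag 97
# (paired, mate on the reverse strand, first in pair), as in the thread.
demo = [
    "\t".join(["readA", "97", "chr1", "9717513", "60", "4M", "chr12", "107518128", "0", "AAGA", "FFFF"]),
    "\t".join(["readA", "97", "chr1", "9717513", "60", "4M", "chr2", "31183704", "0", "AAGA", "FFFF"]),
]
for (name, mate), n in find_duplicate_mates(demo).items():
    print(name, "mate", mate, "seen", n, "times")
```

To run it on real data, the function can be given any iterable of SAM lines, for example an open file containing the text output of samtools view.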
Hi there,
I ran Octopus in the cancer only mode and got this error:
I'm not sure what's happening, as I aligned and called mutations with the same reference, and the chr prefix is present in the BAM file (both in the header and in the reads) and throughout the .fa file.
bam file header snippet:
the offending region:
I'm using version 0.7.4, and this was the command:
The reference used was GRCh38/ucsc/hg38.fa