Open CooperStansbury opened 5 years ago
These are cases when reads are not aligned or listed as supplementary.
With regard to the homework, I said not to worry about read quality, fully expecting these results. However, if one were inclined to worry about them, the code for process_bam would look like this:
def process_bam(filename, sample_name, genome_positions=None):
    with bs.AlignmentFile(filename) as bam:
        for read in bam:
            # Filter out unmapped (4) and supplementary (2048) reads; 4 + 2048 = 2052:
            # https://broadinstitute.github.io/picard/explain-flags.html
            if not read.flag & 2052:
                # do your stuff here
                pass
    return genome_positions
Again, I am leaving this out for simplicity's sake and not requiring any QC, but the above code would alleviate the issue you brought up.
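For reference, the mask 2052 in the filter above is the sum of two SAM flag bits: 4 (read unmapped) and 2048 (supplementary alignment). Here is a minimal sketch of the same test on bare integers, with no BAM library involved. The names `SKIP_MASK` and `keep_read` are hypothetical and used only for illustration:

```python
# SAM flag bits as defined in the SAM specification.
FLAG_UNMAPPED = 0x4          # 4: read is unmapped
FLAG_SUPPLEMENTARY = 0x800   # 2048: supplementary alignment

# Combined mask used in the filter (4 + 2048 = 2052).
SKIP_MASK = FLAG_UNMAPPED | FLAG_SUPPLEMENTARY


def keep_read(flag):
    """Return True when neither the unmapped nor the supplementary bit is set.

    Hypothetical helper, not part of any BAM library.
    """
    return not flag & SKIP_MASK


# 99   = paired + proper pair + mate reverse + first in pair
# 4    = unmapped
# 2064 = supplementary + reverse strand
for flag in (99, 4, 2064):
    print(flag, keep_read(flag))
```

Only the first flag passes the filter; the other two each have one of the masked bits set.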
@betteridiot Cool beans. I was just curious. This can be resolved.
Additionally, one could just do a hard check for read.pos >= 0, but there are many definitions of read position and of what a read position is associated with.
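On where a -1 can come from: the BAM format stores the position as a 0-based signed 32-bit integer, so a read with no coordinate (SAM POS of 0, as is typical for unmapped reads) is stored as -1. Below is a small sketch of that decoding using only the standard library. The four bytes are made up for illustration, `has_coordinate` is a hypothetical helper, and this is not the bam.py code itself:

```python
import struct

# Four bytes of a little-endian signed 32-bit integer with every bit set.
# Illustrative bytes only, not taken from tumor.bam.
raw_pos = b"\xff\xff\xff\xff"

# "<i" = little-endian int32, the integer type BAM uses for the position field.
(pos,) = struct.unpack("<i", raw_pos)
print(pos)


def has_coordinate(pos):
    """Hypothetical helper: the hard check on position (pos >= 0)."""
    return pos >= 0


print(has_coordinate(pos))  # position decoded from the bytes above
print(has_coordinate(0))    # first base of a reference (0-based)
```

A check like this treats position 0 as valid, since 0-based coordinates start at 0; only negative values are rejected.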
I'm noticing that some of the reads from the tumor.bam file have read.pos == -1. As far as I can tell, read.pos is defined when bam.py unpacks data (exact line HERE). But I don't know enough to understand when this would return a -1 value. What conditions give rise to -1 here?
Here's a test to reproduce this: