Open colindaven opened 3 years ago
Quantified:
(base) rcug@hpc-rc09:/ngsssd1/rcug/wochenende_test/pe38/2x38$ samtools view 9_S73_2_R1.trm.ns.fix.s.dup.mm.mq30.calmd.bam | grep NM:i:1 | wc -l
70861
(base) rcug@hpc-rc09:/ngsssd1/rcug/wochenende_test/pe38/2x38$ samtools view 9_S73_2_R1.trm.ns.fix.s.dup.mm.mq30.calmd.bam | grep NM:i:0 | wc -l
508919
(base) rcug@hpc-rc09:/ngsssd1/rcug/wochenende_test/pe38/2x38$ samtools view 9_S73_2_R1.trm.ns.fix.s.dup.mm.mq30.calmd.bam | grep NM:i:2 | wc -l
97
(base) rcug@hpc-rc09:/ngsssd1/rcug/wochenende_test/pe38/2x38$ samtools view 9_S73_2_R1.trm.ns.fix.s.dup.mm.mq30.calmd.bam | grep NM:i:3 | wc -l
30
Very few reads are affected, but some do slip through the bamtools filter despite having more mismatches in the NM field than allowed, e.g. 1 read with 5 mismatches from 1x76bp data.
With 2x38bp data the problem is worse. Yet the total number of such reads is so low (generally <100) that it has no effect on results.
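
One caveat on the counts above: `grep NM:i:1` is a substring match, so it would also count `NM:i:10` to `NM:i:19` if any such reads were present. A minimal sketch of an exact per-value tally is below. The function name `nm_histogram` is made up for illustration, and it assumes plain SAM text on stdin (as produced by `samtools view`) with at most one `NM` tag per read.

```shell
# Tally reads by exact NM value; reads SAM records on stdin.
# nm_histogram is a hypothetical helper name, not part of samtools or bamtools.
nm_histogram() {
    awk -F'\t' '
        !/^@/ {
            # Optional tags start at column 12 of a SAM record.
            for (i = 12; i <= NF; i++) {
                # Match the whole field so NM:i:1 is not confused with NM:i:10.
                if ($i ~ /^NM:i:[0-9]+$/) {
                    split($i, parts, ":")
                    counts[parts[3]]++
                    break
                }
            }
        }
        END {
            # One line per NM value: value, tab, number of reads.
            for (nm in counts) print nm "\t" counts[nm]
        }
    ' | sort -n
}

# Usage (file name taken from the commands above):
# samtools view 9_S73_2_R1.trm.ns.fix.s.dup.mm.mq30.calmd.bam | nm_histogram
```

This prints one line per NM value in a single pass over the BAM, so any read above the allowed mismatch threshold shows up directly instead of needing one `grep` per value.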