Open figueroakl opened 6 years ago
Looks like a bug in samtools.MPileupColumn(line)
It is used here: https://github.com/VDBWRAIR/ngs_mapper/blob/92f866592ed02c3a79620fc83624dbd0184db312/ngs_mapper/bqd.py#L46 (this is where qualdepth.json gets messed up) and here: https://github.com/VDBWRAIR/ngs_mapper/blob/master/ngs_mapper/base_caller.py#L384 (where the consensus and VCF get messed up).
samtools mpileup is only returning 67 rows, all with depth=1
The problem seems to be that almost all mappings are "anomalous read pairs" or "orphans". I'm not clear why that's happening with these samples and with multi-entry reference fastas. This isn't a bug, as far as I can tell. See #112 and #8.
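To make the "anomalous read pair" condition concrete, here is a minimal sketch of the check on SAM flag values. It assumes that mpileup treats a read as anomalous when it is flagged as paired (0x1) but not as a proper pair (0x2), which is my understanding of its default filter; the function names are hypothetical and not part of ngs_mapper.

```python
# SAM flag bits, as defined in the SAM specification.
FLAG_PAIRED = 0x1       # read is part of a pair
FLAG_PROPER_PAIR = 0x2  # aligner marked both mates as properly aligned


def is_anomalous_pair(flag):
    """Return True when a read is paired but not flagged as a proper pair.

    Assumption: this mirrors the reads mpileup skips unless -A is given.
    """
    return bool(flag & FLAG_PAIRED) and not (flag & FLAG_PROPER_PAIR)


def count_anomalous(flags):
    """Count how many flag values in an iterable look anomalous."""
    return sum(1 for flag in flags if is_anomalous_pair(flag))
```

Running something like this over the FLAG column of `samtools view` output would show what fraction of reads would be dropped by default.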
I'm going to look over the issues you referenced and try to make sense of it.
Determined the reads are orphans; a little more info is in gitter. The solution is to make allowing orphans (-A in samtools) a configurable option.
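A rough sketch of what a configurable orphan option could look like when building the mpileup command line. The function names and the `count_orphans` config key are hypothetical and are not the existing ngs_mapper API; only the samtools flags (-A, -q, -Q, -r) are real.

```python
def build_mpileup_cmd(bam_path, region=None, min_mapq=0, min_baseq=0,
                      count_orphans=False):
    """Build the argument list for `samtools mpileup`.

    count_orphans=True adds -A so anomalous read pairs are not discarded.
    """
    cmd = ['samtools', 'mpileup', '-q', str(min_mapq), '-Q', str(min_baseq)]
    if count_orphans:
        cmd.append('-A')
    if region:
        cmd += ['-r', region]
    cmd.append(bam_path)
    return cmd


def cmd_from_config(bam_path, config):
    """Read the (hypothetical) 'count_orphans' key from a config dict."""
    return build_mpileup_cmd(
        bam_path, count_orphans=bool(config.get('count_orphans', False)))
```

Defaulting the option to off would keep current behavior for existing runs, and the returned list can be passed directly to `subprocess.Popen`.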
Have to rework some functions, update where they are called, and pass the config around more...
- base_caller: samtools.mpileup
- graphsample: samtools.nogap_mpileup
- bam_to_qualdepth: samtools.nogap_mpileup; bam.get_refstats; bam.alignment_info
- tagreads: samtools.view (x2) -- can just add A=True
- stats_at_refpos: samtools.mpileup
- bqd is clear of system calls
- tagreads: gets config, sort of
- base_caller: gets config
- graphsample: does not get config (?)
flagstats?
sent the directory I tested this on to gitter
MapvsUnmap reads does not seem to show the mapped reads percentage for some samples when running against a concatenated reference file. Could it be a sequential issue, such that the reads were matched against the first reference in the file and the other references were then ignored?
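One way to check whether only the first reference is being counted is to look at per-reference read counts. The sketch below parses the tab-separated text that `samtools idxstats` prints (reference name, length, mapped reads, unmapped reads) and reports each reference's share of all reads. The function name is hypothetical and this is not how ngs_mapper computes its statistics.

```python
def mapped_percent_by_reference(idxstats_text):
    """Return (per_reference_percent, overall_mapped_percent).

    Expects `samtools idxstats` output: name, length, mapped, unmapped,
    separated by tabs, with a final '*' row for reads with no reference.
    """
    mapped_counts = {}
    total_reads = 0
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, unmapped = line.split('\t')
        total_reads += int(mapped) + int(unmapped)
        if name != '*':
            mapped_counts[name] = int(mapped)
    if total_reads == 0:
        return {}, 0.0
    per_ref = {name: 100.0 * count / total_reads
               for name, count in mapped_counts.items()}
    overall = 100.0 * sum(mapped_counts.values()) / total_reads
    return per_ref, overall
```

If every reference after the first shows zero mapped reads here as well, the problem is upstream in the mapping step rather than in how the percentage is reported.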