Open andreas-wilm opened 3 years ago
The insertion T>TAG at position 782755 is predicted by LF2.15 on the BAM files processed by both versions:
$ zgrep 78275 *vcf.gz
NC_000912_Mpneumoniae_comb.lf215.lf215.vcf.gz:NC_000912.1 782755 . T TAG 589 PASS DP=100;AF=0.200000;SB=3;DP4=41,40,8,12;INDEL;HRUN=1
NC_000912_Mpneumoniae_comb.lf3.lf215.vcf.gz:NC_000912.1 782755 . T TAG 589 PASS DP=100;AF=0.200000;SB=3;DP4=41,40,8,12;INDEL;HRUN=1
No difference in BI values between the two BAM files:
diff -u <(samtools view NC_000912_Mpneumoniae_comb.lf3.bam NC_000912.1:782754-782755 | awk '$6 ~ /[DI]/' | grep -o 'BI:Z:[^[:space:]]*') <(samtools view NC_000912_Mpneumoniae_comb.lf215.bam NC_000912.1:782754-782755 | awk '$6 ~ /[DI]/' | grep -o 'BI:Z:[^[:space:]]*')
But the BI value there is actually zero, i.e. "!":
samtools view NC_000912_Mpneumoniae_comb.lf3.bam NC_000912.1:782754-782755 | awk '$6 ~ /[DI]/' | grep -o 'BI:Z:[^[:space:]]*'
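For reference, BI:Z strings are Phred+33 encoded, so each character maps to a quality via its ASCII code minus 33, which is why "!" decodes to 0. A minimal Python sketch of that decoding follows; the function name `decode_phred33` and the sample string are illustrative only and are not taken from the BAM files above.

```python
def decode_phred33(qual_string):
    """Convert a Phred+33 encoded string into a list of integer qualities."""
    return [ord(ch) - 33 for ch in qual_string]


# Made-up example string, not real data from this issue.
print(decode_phred33("!!M"))  # prints [0, 0, 44]
```

The output of the `grep -o 'BI:Z:[^[:space:]]*'` command above could be fed through such a helper, after stripping the `BI:Z:` prefix, to see the numeric values directly.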
LF2.15 pileup doesn't show those values:
lofreq plpsummary -f $REFFA NC_000912_Mpneumoniae_comb.lf3.bam -r NC_000912.1:782754-782755 --call-indels
+AG IQ = 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44 44
+AG MQ = 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60 60
+AG AQ = 36 41 42 41 42 40 41 42 12 39 30 38 40 36 36 42 42 31 20 39
Checking which values LF2.15 reads from ai:Z shows that the zeros are placeholders for the actual insertion, and that the insertion quality is the value one position before. Why does this work for deletions then?
This looks like an off-by-one error that also affects deletions. It is unclear why it worked there. This needs full testing against LF2.15 on simulated data, but see also https://github.com/andreas-wilm/lofreq3/issues/40
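The suspected off-by-one can be pictured with a toy model. This is only a sketch under assumptions: the list layout, the helper names, and the numbers are hypothetical and do not reproduce LoFreq's actual code or the real tag contents. It assumes a per-base quality list in which the entry at the insertion's own index is a placeholder 0 and the real quality sits one entry earlier, as described above.

```python
# Hypothetical per-base insertion qualities for one read; values are made up.
# Index 3 is assumed to be the placeholder slot for the inserted bases.
ins_quals = [45, 45, 44, 0, 45]


def qual_at_index(quals, idx):
    """Read the quality at the insertion index itself (the suspected wrong lookup)."""
    return quals[idx]


def qual_before_index(quals, idx):
    """Read the quality one position earlier (the lookup the issue describes)."""
    if idx == 0:
        raise IndexError("no preceding position for index 0")
    return quals[idx - 1]


print(qual_at_index(ins_quals, 3))      # prints 0
print(qual_before_index(ins_quals, 3))  # prints 44
```

If this model holds, a lookup at the insertion index itself would report a quality of 0, while a lookup shifted by one would report the real value.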
Deletions are working fine, e.g. NC_000912.1:782755 T>TAG
See /data/out/NC_000912_Mpneumoniae
The indel quality shows up as 0:
lofreq call -f $REFFA -b NC_000912_Mpneumoniae_comb.lf3.bam -r NC_000912.1:782754-782755 --loglevel 3 -p -P