Open Rohit-Satyam opened 2 years ago
It would be helpful to have a small test case to reproduce the problem and improve the code.
In the meantime, the indel-revamp branch (https://github.com/samtools/bcftools/tree/indel-revamp) has a new mpileup option, --indels-2.0, which is a testing ground for improved handling of indels. Maybe you could try it on your data to see if it solves the problem. Note that this is an EXPERIMENTAL feature. If you decide to test it, please let us know the results.
I can share the BAM file that can serve as a test case. Note: This is ISeq data. I will try the recommendations you made. 16_S16_L001.dedup.zip
OK, I checked and can confirm that the read counts are not correct with any version. I'll add this to the test suite to investigate later.
By the way, it's strange that the reads have 20 bases soft clipped even though they match the reference genome perfectly.
I tried variant calling with bcftools using the recommended settings for SARS-CoV-2, i.e. --ignore-overlaps --min-ireads 10, on my ISeq Illumina data. However, I can see that a deletion supported by 30 reads is being ignored (I reran without the above flags and compared the VCF files). For some reason IDV=9. Attached below are the coverage of the deletion and the coverage plot.
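A quick way to check whether the reported IDV on its own would put the record under the --min-ireads value is to parse the INFO column and compare the two numbers. The Python sketch below is only an illustration: the INFO string is a made-up example (not taken from the attached data), the helper names are hypothetical, and it assumes the threshold is compared against the same count that ends up in IDV, which this thread does not confirm.

```python
def parse_info(info: str) -> dict:
    """Parse a VCF INFO column: semicolon-separated KEY=VALUE pairs or bare flags."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        # Bare flags such as INDEL carry no value; store them as True.
        fields[key] = value if sep else True
    return fields


def indel_below_threshold(info: str, min_ireads: int) -> bool:
    """Return True if this is an INDEL record whose IDV is below min_ireads."""
    fields = parse_info(info)
    if "INDEL" not in fields or "IDV" not in fields:
        return False
    return int(fields["IDV"]) < min_ireads


# Hypothetical INFO string, for illustration only.
example_info = "INDEL;IDV=9;IMF=0.3;DP=30"

print(indel_below_threshold(example_info, 10))  # threshold used in the run above
print(indel_below_threshold(example_info, 2))   # a lower, arbitrary threshold
```

If the first check is True and the second is False, the threshold would be enough to explain the difference between the two runs, but it would not explain why IDV is 9 when about 30 reads appear to support the deletion.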