tseemann / snippy

:scissors: :zap: Rapid haploid variant calling and core genome alignment
GNU General Public License v2.0

more SNPs with higher mincov #306

Closed Jgruetzke closed 5 years ago

Jgruetzke commented 5 years ago

Hello,

I tested different --mincov values (from 1 to 9) on two samples. Bizarrely, I detected more SNPs at higher values, using the same input data and identical settings apart from --mincov. Is there an explanation for this? I would expect fewer SNPs as --mincov increases, but I see more SNPs at --mincov 3 and 4 than at --mincov 1. Here is what I got:

| Sample | Mincov 1 | Mincov 2 | Mincov 3 | Mincov 4 | Mincov 5 | Mincov 7 | Mincov 9 |
|---|---|---|---|---|---|---|---|
| sample1 | 159 | 177 | 179 | 177 | 148 | 68 | 17 |
| sample2 | 159 | 183 | 190 | 190 | 163 | 78 | 26 |
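To make the unexpected pattern explicit, here is a minimal Python sketch that flags every step where raising --mincov increases the SNP count. The numbers are copied from the table above; the helper name `increases` is made up for this illustration and is not part of Snippy.

```python
# SNP counts copied from the table above, keyed by --mincov value.
counts = {
    "sample1": {1: 159, 2: 177, 3: 179, 4: 177, 5: 148, 7: 68, 9: 17},
    "sample2": {1: 159, 2: 183, 3: 190, 4: 190, 5: 163, 7: 78, 9: 26},
}


def increases(by_mincov):
    """Return (lower_mincov, higher_mincov, gain) for each step where the count rises."""
    covs = sorted(by_mincov)
    return [
        (lo, hi, by_mincov[hi] - by_mincov[lo])
        for lo, hi in zip(covs, covs[1:])
        if by_mincov[hi] > by_mincov[lo]
    ]


for sample, by_mincov in counts.items():
    print(sample, increases(by_mincov))
```

For both samples the count rises from --mincov 1 to 2 and again from 2 to 3, then stays level or falls from 3 onwards.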

Thank you for your explanation!

tseemann commented 5 years ago

I think --mincov 1 is a very unusual edge case, and this is probably a bug related to rounding to integers (I assume --minfrac 0.9?) or weird behaviour in freebayes.
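To illustrate why integer rounding makes very low coverage a fragile edge case, here is a purely hypothetical arithmetic sketch. It is not Snippy's or freebayes' code, and nothing in this thread confirms that either tool computes a threshold this way. The function `min_alt_reads` and the idea of deriving a minimum number of supporting reads from mincov × minfrac are assumptions made only for the example.

```python
import math


def min_alt_reads(mincov, minfrac, to_int):
    """Hypothetical: turn mincov * minfrac into a whole number of reads.

    `to_int` is the integer conversion under test (e.g. int or math.ceil).
    """
    return to_int(mincov * minfrac)


MINFRAC = 0.9  # assumed value, as guessed in the comment above

for mincov in (1, 2, 3):
    truncated = min_alt_reads(mincov, MINFRAC, int)
    rounded_up = min_alt_reads(mincov, MINFRAC, math.ceil)
    print(f"mincov={mincov}: truncated={truncated}, rounded up={rounded_up}")
```

Under these assumptions, truncation at mincov 1 would require zero supporting reads while rounding up would require one, so the behaviour at the lowest settings depends entirely on which conversion is used.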

If you want to use very low coverages, I don't think Snippy is the right tool. Best to just use bcftools or varscan rather than a statistical model like the one in freebayes.

Example: http://thegenomefactory.blogspot.com/2018/10/a-unix-one-liner-to-call-bacterial.html