Open nreid opened 4 years ago
The program assumes that all samples come from the same population. To enforce a population structure (or, in the extreme case, to treat all samples individually), check the bcftools call --group-samples
option. Note this requires that the mpileup step is run with bcftools mpileup -a FMT/AD
option.
Hello, I'm working with a mixture of low (5-10x) and very low (0.5x) coverage WGS sequences. I've got about 900 of them from a diverse set of populations, and variants are discovered every 10-20bp. I noticed that bcftools (1.9) is calling many heterozygous genotypes with no evidence for both alleles in low coverage samples:
format: GT:PL:DP:AD. genotype: 0/1:0,3,37:1:1,0
In this case there's a single reference read, but the genotype is called as heterozygous, even though the most likely genotype according to the PL is 0/0. I'm guessing this is due to some kind of prior on allele frequencies, but I don't see any options to turn that off.
I'm doing some basic population genetics, and I think this is really mucking things up. Is there any way to deal with this? I think using maximum likelihood genotypes without any kind of population prior would probably work far better in my situation.
I have used freebayes for analyses of this kind of data in the past and there HWE priors can be turned off, but this time around I'm working with ~900 samples with tons of genetic diversity and it has slowed to a crawl. bcftools runs quickly though, so I was hoping it would be suitable.
My commands are:
Thanks,