freebayes / freebayes

Bayesian haplotype-based genetic polymorphism discovery and genotyping.
http://arxiv.org/abs/1207.3907
MIT License

Genotype calls are not consistent with RO and AO counts #768

Open suncpark opened 1 year ago

suncpark commented 1 year ago

I have observed inconsistent genotype calls within a VCF file. Although most genotypes make sense given the AO/RO counts, I found many instances where an individual with similar RO and AO counts was genotyped as homozygous reference (0/0) rather than heterozygous (0/1). In some extreme cases, where RO was much greater than AO (e.g. 800 RO, 300 AO), the genotype was called as homozygous alternate (1/1).

I am unsure whether this inconsistency is due to a computational error or whether these calls follow from the Bayesian probability model. I would be interested to know if others have observed similar results and if there is a standard approach for handling these discrepancies in genotype calling.
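To make the discrepancy concrete, here is a minimal Python sketch that lists calls whose genotype disagrees with the alternate-allele fraction AO / (RO + AO). It is only a screening heuristic, not a reimplementation of how freebayes decides a genotype. The function name `flag_suspicious_genotypes`, the minimum depth of 20, and the 0.3 to 0.7 heterozygous band are all assumptions chosen for illustration, and multi-allelic sites are skipped for simplicity.

```python
def flag_suspicious_genotypes(vcf_lines, min_depth=20, het_band=(0.3, 0.7)):
    """Yield (chrom, pos, sample, gt, ro, ao) for calls that disagree with
    the observed alternate-allele fraction. Thresholds are illustrative."""
    lo, hi = het_band
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10 or "," in fields[4]:
            continue  # no sample columns, or multi-allelic site (skipped here)
        keys = fields[8].split(":")
        if not {"GT", "RO", "AO"} <= set(keys):
            continue
        for name, column in zip(samples, fields[9:]):
            values = dict(zip(keys, column.split(":")))
            gt = values.get("GT", ".").replace("|", "/")
            try:
                ro, ao = int(values["RO"]), int(values["AO"])
            except (KeyError, ValueError):
                continue  # missing values such as "."
            depth = ro + ao
            if depth < min_depth:
                continue  # too few observations to judge
            alt_frac = ao / depth
            suspicious = (
                (gt == "0/0" and alt_frac >= lo)
                or (gt == "1/1" and alt_frac <= hi)
                or (gt in ("0/1", "1/0") and not lo <= alt_frac <= hi)
            )
            if suspicious:
                yield fields[0], int(fields[1]), name, gt, ro, ao
```

Passing an open, uncompressed VCF file handle to the function should work, since it only needs an iterable of text lines. The flagged records can then be inspected by hand, together with their QR, QA and GL values.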

IgnasiLucas commented 1 year ago

I cannot know whether this is the cause of your problem, but I want to share something I just realized: freebayes (version 1.3.7) interprets the absence of the base quality string in BAM files as a literal lack of quality. When I made the mistake of feeding freebayes BAM files without base quality strings, the sums of qualities for reference (QR) and alternate (QA) observations were all set to zero. Apparently that was why all the genotypes and their likelihoods were wrong in my VCF. It was quite scary.
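In case it helps anyone check for the same situation, here is a minimal Python sketch that counts alignments with no stored base qualities. It assumes the BAM has been converted to SAM text first (for example with `samtools view`), where the QUAL column, the 11th field, is written as `*` when qualities are absent. The function name `count_missing_base_qualities` is just something I made up for illustration.

```python
def count_missing_base_qualities(sam_lines):
    """Return (total_records, records_without_qualities) for SAM text lines."""
    total = missing = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        total += 1
        if fields[10] == "*":  # QUAL is "*" when base qualities are not stored
            missing += 1
    return total, missing
```

Calling it with `sys.stdin` while piping in the output of `samtools view` on the BAM file should be enough. If most or all records come back as missing, the zero QR and QA values in the VCF would be consistent with what I described above.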