Open dianacornejo opened 1 year ago
I think there are two primary consequences:
As for the GQ issue with homozygous alt calls: I have no idea! I assume it has something to do with how the GQ field was calibrated when running DeepVariant, but I am unsure. I originally posted the issue on the old UKBB forums, where it was confirmed, but as far as I am aware nothing was ever done about it. In general, homozygous alt calls are of high quality because of how the genotyper works; the majority of issues are with heterozygous genotypes.
In general I think you can follow the filtering approaches outlined in this applet and have relatively high-quality data.
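As a rough illustration of the kind of genotype-level filter being discussed here (this is a sketch, not the applet's actual code), the idea is to apply a depth filter to every call but apply the GQ filter only to genotypes that are not homozygous alt. The function name `keep_genotype` and the thresholds `MIN_DEPTH` and `MIN_GQ` are hypothetical placeholders, not values taken from the applet or from UKB documentation.

```python
from typing import Optional

# Hypothetical placeholder thresholds; NOT taken from the applet or UKB docs.
MIN_DEPTH = 7
MIN_GQ = 20


def keep_genotype(gt: str, dp: Optional[int], gq: Optional[int],
                  min_depth: int = MIN_DEPTH, min_gq: int = MIN_GQ) -> bool:
    """Return True if one sample's genotype call passes this sketch filter.

    gt is a VCF-style GT string such as "0/1" or "1|1";
    dp and gq are the per-sample DP and GQ values (None if missing).
    """
    # Treat phased and unphased genotypes the same way.
    alleles = gt.replace("|", "/").split("/")

    # Drop missing or partially missing calls.
    if "." in alleles:
        return False

    # Depth filter applies to every genotype class.
    if dp is None or dp < min_depth:
        return False

    # Homozygous alt: all alleles identical and not the reference allele.
    is_hom_alt = len(set(alleles)) == 1 and alleles[0] != "0"
    if is_hom_alt:
        # GQ is deliberately not applied here, because GQ appears
        # miscalibrated for homozygous alt calls.
        return True

    # Homozygous ref and heterozygous calls must also pass GQ.
    return gq is not None and gq >= min_gq
```

For example, `keep_genotype("1/1", 20, 3)` keeps the call despite the low GQ, whereas `keep_genotype("0/1", 20, 3)` drops it.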
@eugenegardner thanks for the response! I'm filtering with an approach very similar to yours. A follow-up question: do you know whether the AAscore is only available for the WGS data, or whether there's something similar for the WES data? The authors mention it in this paper
AAscore is just a variant quality value similar to something like VQSR from GATK. The team that generated the WES data used Google DeepVariant according to their best practices. I am unfamiliar with exactly how filtering is done by DeepVariant (and what quality scores it does / does not generate), but at any rate the only score included in the pVCFs released to users of UKBiobank is AQ. I am not familiar with how AQ is calculated.
Hi @eugenegardner, I'm just wondering what the consequences of not applying any filter to the ALT/ALT genotypes would be. Would it very likely mean including false positive calls? Also, do you have any idea why GQ differs depending on the genotype, and has this been discussed by the UKB people? What would be the best approach here?
Thank you
Diana