Closed by teng-gao 3 years ago
Hi, sorry for the delayed answer. This is a very interesting question. Genotype callers can indeed be misled in this setting, as they may mistake allelic imbalance for genotyping errors (imputation should alleviate the problem, but imputing HETs is always challenging). Unfortunately, I think proper modelling would be needed to account for this, and I am not aware of any method that can easily do so. What is the coverage of your samples?
Thanks for the reply. The coverage of my samples is about 20x, but ideally the genotype confidence should reflect different coverage depths. The error model used by BCFtools seems to be too conservative. Is it possible to supply our own genotype probabilities (e.g. 0.8 het, 0.2 homozygous) to Glimpse?
It would be possible, yes. In that case you would need to provide an input VCF file with your genotype likelihoods in the FORMAT/PL field (remember that they are normalized and Phred-scaled) or in the FORMAT/GL field (using the --inputGL option of glimpse phase, v1.1.1).
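As a rough illustration of the conversion mentioned in the reply above, here is a minimal Python sketch that turns a triple of genotype probabilities into normalized Phred-scaled PL values and log10-scaled GL values. The helper names `probs_to_pl` and `probs_to_gl`, the cap of 255 and the floor of 1e-10 are assumptions made for this example and are not part of GLIMPSE or BCFtools. It also assumes the probabilities are ordered 0/0, 0/1, 1/1, that the "0.2 homozygous" in the question means homozygous reference, and that it is acceptable to pass them as likelihoods.

```python
import math

def probs_to_pl(probs, max_pl=255):
    """Convert genotype probabilities (0/0, 0/1, 1/1) to normalized Phred-scaled PLs.

    The most likely genotype gets PL 0; max_pl is an assumed cap for
    probabilities that are zero or vanishingly small.
    """
    best = max(probs)
    if best <= 0:
        raise ValueError("at least one genotype probability must be positive")
    pls = []
    for p in probs:
        if p <= 0:
            pls.append(max_pl)  # log10(0) is undefined, so use the cap
        else:
            pls.append(min(max_pl, round(-10 * math.log10(p / best))))
    return pls

def probs_to_gl(probs, floor=1e-10):
    """Convert the same probabilities to log10-scaled values for FORMAT/GL.

    floor is an assumed lower bound that avoids taking log10(0).
    """
    return [round(math.log10(max(p, floor)), 3) for p in probs]

# 0.2 hom-ref, 0.8 het, 0.0 hom-alt (the example from the question above)
print(probs_to_pl([0.2, 0.8, 0.0]))  # [6, 0, 255]
print(probs_to_gl([0.2, 0.8, 0.0]))  # [-0.699, -0.097, -10.0]
```

The resulting PL triple would go into the sample column as `6,0,255`, in the order 0/0, 0/1, 1/1 for a biallelic site.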
From reading the GLIMPSE code, I was under the impression that GLIMPSE discards GTs and only uses GLs. If that is truly the case, then although the GT call may be '0/0', the GL difference between 0/1 and 0/0 should be small, given the non-negligible amount of evidence for the ALT allele (or the REF allele for 1/1).
From reading the GLIMPSE code I was under the impression that GLIMPSE discards GTs and only uses GLs
This is correct. The GT field is not read from the input target file. Only FORMAT/PL is used (or FORMAT/GL with the --inputGL option).
Hi authors,
I am applying your method to cancer sequencing samples, where large regions of the chromosome may be affected by sub-clonal loss of heterozygosity (LOH). This results in allelic imbalance at heterozygous SNPs, whose allelic ratio (AR=AD/DP) deviates from the expected ratio (AR=0.5 in the plot below). Glimpse (or maybe the bcftools genotyping step) seems to mistake heterozygous SNPs for homozygous ones when the AR deviates too much from 0.5 (for example, in the <0.2 and >0.8 range, as shown by the purple and orange dots below). Is there a way to avoid this issue, for example by tuning parameters for the genotype likelihood step?
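To make the effect described above concrete, here is a minimal Python sketch of genotype likelihoods under a simple binomial read-count model. This is not the error model used by BCFtools or GLIMPSE. The function `genotype_pls`, the 1% error rate and the `het_alt_fraction` parameter are all assumptions introduced for illustration. The sketch only shows why a standard model that expects AR=0.5 at heterozygous sites can prefer a homozygous call when the AR is low, and how an imbalance-aware expectation would shift the likelihoods.

```python
import math

def genotype_pls(ref_reads, alt_reads, het_alt_fraction=0.5, error=0.01, max_pl=255):
    """Normalized Phred-scaled PLs for (0/0, 0/1, 1/1) from read counts.

    Each genotype is modelled by its expected ALT-read fraction. The binomial
    coefficient is the same for all genotypes and cancels on normalization.
    het_alt_fraction is a hypothetical knob: 0.5 for a balanced het, lower or
    higher in a region with allelic imbalance.
    """
    fractions = (error, het_alt_fraction, 1.0 - error)
    log10_liks = [
        alt_reads * math.log10(f) + ref_reads * math.log10(1.0 - f)
        for f in fractions
    ]
    best = max(log10_liks)
    return [min(max_pl, round(-10 * (ll - best))) for ll in log10_liks]

# 20x coverage with 2 ALT reads (AR = 0.1)
print(genotype_pls(18, 2))                        # [0, 19, 255]: 0/0 is preferred
print(genotype_pls(18, 2, het_alt_fraction=0.1))  # [13, 0, 255]: 0/1 is preferred
```

In practice the expected ALT fraction in an LOH region would have to be estimated separately, for example per segment, and the resulting likelihoods supplied through FORMAT/PL as suggested in the replies.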