Closed TonyLupara closed 2 years ago
@TonyLupara,
Sorry for the late reply and thanks for bringing this to our attention.
Yes, it's possible this is an incorrect call as we've noticed some rare inconsistencies in calling the rs2279343 (41515263~A>G) SNP. The other indication is that the QUAL score is below 200.
By design, Graphtyper (the variant caller StellarPGx uses) realigns reads to a pangenome structure that contains common variants and/or variants of interest (for StellarPGx, all core variants are included to aid in their genotype assignment). This is the reason why the record in the VCF might be inconsistent/different from the observation in your IGV screenshot that's based on alignment to a 'linear' reference genome.
It is possible that if an individual has variations in CYP2B7, some CYP2B7 reads might misalign to CYP2B6 during the realignment process. Another explanation might be that the individual has a novel CYP2B6 suballele which is altering alignment penalties involving CYP2B6 and CYP2B7.
One way around this would be to increase the QUAL threshold to ≥200 to filter out such erroneous variant calls. However, we've also seen some cases with validated variants having QUAL scores below 100.
@twesigomwedavid,
Thank you for detailed response. Sorry for late reply. I have figured out this problem with graphtyper option --bamshrink_max_fraglen = 100. I don't know how it really works, but it leaves much more reads for downstream analysis.
But now I have another problem. I've tried to set program to analyse only exones, and it works well until now. I have returned all setting to default, but StellarPGx call cyp2d6 very strange. It started around 10 days ago. For example for NA21781, where cyp2d6 should be 2x2/68+4 (CN 4), it return 2/*4x2 (CN 3). Reinstall didn't help. I have an idea that it can be due to BQSR step. I will test it.
@TonyLupara,
Thanks. Kindly note that StellarPGx is not set up to analyse exome data as of now. All the thresholds for structural variant detection and star allele assignment assume that you're using high-coverage WGS data. We strongly advise against using StellarPGx for exome data without testing/validation on our end, especially for CYP2D6.
Describe the bug On 34x WGS the call for CYP2B6 is 1/4, but the right call is 1/1.
Initially computed CN = 2 Sample core variants: 41515263~A>G~0/1 Candidate alleles: ['1.v1_4.v1'] Result: 1/4
To Reproduce NA21781, WGS. Reference is taken from GeT-RM consensus and also was inspected manually in IGV.
Expected behavior There is no ALT snps on core SNP 41515263~A>G.
Output core variant for this call:
19 41515263 19:41515263:SG A G 181 PASS ABHet=0.4815;ABHom=-1;AC=1;AF=0.5;AN=2;CR=0;LOGF=3.308e-10;MQ=56;MaxAAS=13;MaxAASR=0.4815;NHet=1;NHomAlt=0;NHomRef=0;PASS_AC=1;PASS_AN=2;PASS_ratio=1;QD=18.1;RefLen=1;SB=0.6296;SBAlt=0.6923;SBF=8,9;SBF1=4,5;SBF2=4,4;SBR=6,4;SBR1=3,4;SBR2=3,0;SeqDepth=27;VarType=SG GT:AD:MD:DP:GQ:PL 0/1:14,13:0:27:99:181,0,217
There is 43 reads on that position and all have REF (A)
Screenshots