DecodeGenetics / graphtyper

Population-scale genotyping using pangenome graphs
http://dx.doi.org/10.1038/ng.3964
MIT License

FILTER=PASS but FT=FAIL1 #64

Open jjfarrell opened 3 years ago

jjfarrell commented 3 years ago

Why is FILTER set to PASS for these 3 variants when FT=FAIL1? Also, what is the best software for comparing these calls with the GIAB benchmark? The REF alleles are very different from those found in the GIAB VCF. Is there software available to make that comparison?

chr10   185506  chr10:185506:DG.2       N       <DEL:SVSIZE=57:AGGREGATED>      24      PASS    ABHet=0.325;ABHom=-1;AC=1;AF=0.5;AN=2;CR=0;END=185580;LOGF=2.214e-12;MaxAAS=13;MaxAASR=0.325;MaxAltPP=0;NHet=1;NHomAlt=0;NHomRef=0;PASS_AC=0;PASS_AN=0;PASS_ratio=0;QD=2.4;RefLen=1;SB=1;SBAlt=-1;SBF=4,0;SBF1=2,0;SBF2=2,0;SBR=0,0;SBR1=0,0;SBR2=0,0;SEQ=AGCACTTTGGGAGGCTG;SVLEN=57;SVMODEL=AGGREGATED;SVSIZE=57;SVTYPE=DEL;SV_ID=140;SeqDepth=40;VarType=DG    GT:FT:AD:MD:DP:RA:PP:GQ:PL      0/1:FAIL1:27,13:0:40:0,0:0:24:24,0,136
chr10   185506  chr10:185506:DG.5       N       <DEL:SVSIZE=74:AGGREGATED>      24      PASS    ABHet=0.325;ABHom=-1;AC=1;AF=0.5;AN=2;CR=0;END=185580;LOGF=2.214e-12;MaxAAS=13;MaxAASR=0.325;MaxAltPP=0;NHet=1;NHomAlt=0;NHomRef=0;PASS_AC=0;PASS_AN=0;PASS_ratio=0;QD=2.4;RefLen=1;SB=1;SBAlt=-1;SBF=4,0;SBF1=2,0;SBF2=2,0;SBR=0,0;SBR1=0,0;SBR2=0,0;SVLEN=74;SVMODEL=AGGREGATED;SVSIZE=74;SVTYPE=DEL;SV_ID=139;SeqDepth=40;VarType=DG  GT:FT:AD:MD:DP:RA:PP:GQ:PL      0/1:FAIL1:27,13:0:40:0,0:0:24:24,0,136
chr10   264506  chr10:264506:DG N       <DEL:SVSIZE=118:AGGREGATED>     33      PASS    ABHet=0.3171;ABHom=-1;AC=1;AF=0.5;AN=2;CR=0;END=264624;LOGF=3.776e-12;MaxAAS=13;MaxAASR=0.3171;MaxAltPP=0;NHet=1;NHomAlt=0;NHomRef=0;PASS_AC=0;PASS_AN=0;PASS_ratio=0;QD=3.3;RefLen=1;SB=0.2143;SBAlt=0.5;SBF=2,1;SBF1=0,1;SBF2=2,0;SBR=10,1;SBR1=6,0;SBR2=4,1;SVLEN=118;SVMODEL=AGGREGATED;SVSIZE=118;SVTYPE=DEL;SV_ID=187;SeqDepth=41;VarType=DG       GT:FT:AD:MD:DP:RA:PP:GQ:PL      0/1:FAIL1:28,13:0:41:0,0:0:33:33,0,213

hannespetur commented 3 years ago

Hi, thanks. I can see that it seems odd for FILTER to be PASS when no genotype call has PASS. I will add a check for this in the future. I don't know if GIAB has a recommended tool for benchmarking.
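
Until such a check exists, the inconsistent records can be found on the user side. Below is a minimal Python sketch, not GraphTyper's own logic: it assumes an uncompressed VCF read line by line, and the function name `pass_without_passing_genotype` is made up for illustration. It reports every record whose FILTER column is PASS while no sample has FT=PASS.

```python
from typing import Iterable, Iterator, Tuple


def pass_without_passing_genotype(
    lines: Iterable[str],
) -> Iterator[Tuple[str, str, str]]:
    """Yield (CHROM, POS, ID) for records with FILTER=PASS but no sample FT=PASS."""
    for line in lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # VCF is tab-delimited; splitting on any whitespace also handles
        # records pasted with spaces, since VCF fields contain no whitespace.
        fields = line.split()
        if len(fields) < 10:
            continue  # no FORMAT/sample columns to inspect
        chrom, pos, variant_id = fields[0], fields[1], fields[2]
        site_filter, format_keys = fields[6], fields[8].split(":")
        if site_filter != "PASS" or "FT" not in format_keys:
            continue
        ft_index = format_keys.index("FT")
        ft_values = []
        for sample in fields[9:]:
            parts = sample.split(":")
            # Trailing FORMAT fields may be dropped; treat that as missing.
            ft_values.append(parts[ft_index] if ft_index < len(parts) else ".")
        if not any(value == "PASS" for value in ft_values):
            yield chrom, pos, variant_id
```

Passing an open file handle (for example `open("calls.vcf")`, a hypothetical path) as `lines` would list the affected sites; the three records quoted above would all be reported.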

jjfarrell commented 3 years ago

@hannespetur When a variant has FILTER=PASS with, let's say, a 95% PASS_ratio, how should the FT=FAIL genotypes be handled in an association analysis? Should they be set to missing or kept in the analysis?

hannespetur commented 3 years ago

Hello, I would suggest keeping them, and also using the PHRED-scaled genotype likelihood field (PL) in the VCF (if your association pipeline supports it).

Best, Hannes
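
One common way to use PL in an association test is to convert it into genotype probabilities and an expected alternate-allele dosage. The sketch below is only an illustration under stated assumptions: a diploid, biallelic site with PL ordered as 0/0, 0/1, 1/1, a flat prior over the three genotypes, and a downstream tool that accepts dosages. The function names `pl_to_probs` and `alt_dosage` are hypothetical and not part of GraphTyper.

```python
from typing import List, Sequence


def pl_to_probs(pl: Sequence[float]) -> List[float]:
    """Convert PHRED-scaled likelihoods to normalized genotype probabilities."""
    # PL = -10 * log10(likelihood), so likelihood = 10 ** (-PL / 10).
    likelihoods = [10 ** (-value / 10) for value in pl]
    total = sum(likelihoods)
    # Normalizing assumes a flat prior over genotypes (an assumption here).
    return [likelihood / total for likelihood in likelihoods]


def alt_dosage(pl: Sequence[float]) -> float:
    """Expected number of ALT alleles for a diploid, biallelic site."""
    p_ref, p_het, p_alt = pl_to_probs(pl)
    return p_het + 2 * p_alt


# PL from the first record above (GT 0/1, FT=FAIL1).
dosage = alt_dosage([24, 0, 136])
```

For that genotype the dosage stays close to 1, so the uncertainty encoded in PL is carried into the analysis instead of the call being discarded as missing.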