Confused with the interpretation when having TRUTH.FN and QUERY.FP

Manuel-DominguezCBG commented 1 year ago

I have found this line in the output

21 44480616 . G A . . BS=44480616;Regions=CONF,TS_contained GT:BD:BK:QQ:BI:BVT:BLT 1|0:FN:am:.:ti:SNP:het 1/1:FP:am:564.05:ti:SNP:homalt

What does this really mean? Is this a mismatch because the variant was called as het in the TRUTH callset and as hom in the QUERY callset?

Sorry if this question is not a real issue.

Thanks!

M

giovannabloise commented 1 year ago

Hello Manuel!

Yes, when the genotypes differ between query and truth, hap.py treats the calls as different, hence the mismatch: the truth call is marked FN and the query call is marked FP at the same position. I recommend checking the BAM file (if available) to confirm which call is correct. You can find more information here: https://github.com/Illumina/hap.py/blob/master/doc/happy.md#haplotype-comparison-parameters

Hap.py will report counts of:

true-positives (TP) : variants/genotypes that match in truth and query.

false-positives (FP) : variants that have mismatching genotypes or alt alleles, as well as query variant calls in regions a truth set would call confident hom-ref regions.

false-negatives (FN) : variants present in the truth set, but missed in the query.

non-assessed calls (UNK) : variants outside the truth set regions
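
To make a record like the one in the question easier to read, here is a minimal Python sketch that pairs each FORMAT key with its TRUTH and QUERY value. It assumes the sample columns appear in the order TRUTH, QUERY (as in the record above) and that columns are separated by whitespace. The function names `parse_record` and `is_genotype_mismatch` are hypothetical helpers for illustration and are not part of hap.py. As far as I understand the documentation, BK=am means the alleles match but the genotypes do not, which is consistent with this case.

```python
# Minimal sketch: split one hap.py output record into per-sample fields.
# Assumption: column 9 is FORMAT, column 10 is TRUTH, column 11 is QUERY.

record = (
    "21 44480616 . G A . . BS=44480616;Regions=CONF,TS_contained "
    "GT:BD:BK:QQ:BI:BVT:BLT "
    "1|0:FN:am:.:ti:SNP:het "
    "1/1:FP:am:564.05:ti:SNP:homalt"
)


def parse_record(line):
    """Return (truth, query) dicts mapping FORMAT keys to sample values."""
    cols = line.split()  # works for tab- or space-separated columns
    keys = cols[8].split(":")
    truth = dict(zip(keys, cols[9].split(":")))
    query = dict(zip(keys, cols[10].split(":")))
    return truth, query


def is_genotype_mismatch(truth, query):
    """Hypothetical check: truth is FN, query is FP, and the genotype
    classes (BLT) differ, as in the het vs. homalt record above."""
    return (
        truth["BD"] == "FN"
        and query["BD"] == "FP"
        and truth["BLT"] != query["BLT"]
    )


truth, query = parse_record(record)
print("TRUTH:", truth["GT"], truth["BD"], truth["BLT"])
print("QUERY:", query["GT"], query["BD"], query["BLT"])
print("Genotype mismatch:", is_genotype_mismatch(truth, query))
```

For this record the truth genotype is 1|0 (het) and the query genotype is 1/1 (homalt), so the same G>A SNP is counted once as FN and once as FP.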

Manuel-DominguezCBG commented 1 year ago

Thanks for your reply 👍

Illumina / hap.py

Confused with the interpretation when having TRUTH.FN and QUERY.FP #168