Illumina / hap.py

Haplotype VCF comparison tools

TRUTH.TOTAL differs despite having used the same TRUTH SET #165

Open robertzeibich opened 1 year ago

robertzeibich commented 1 year ago

Do you know why TRUTH.TOTAL differs even though the same truth set was used?

[Image: hap.py output]

shinlin77 commented 1 year ago

Yes, I have that same question.

opplatek commented 4 months ago

Same here. I tried:

1) Running the full GiaB HG002 (AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38) as both the truth and the sample.
2) Running the full GiaB HG002 (AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38) as the truth and an HG002 subset (1000 variants) as the sample.

I got different TRUTH.TOTAL counts in the two summaries. However, the number of annotated variants in the hap.py annotated output VCF is the same in both runs and equals the count from the second test (full sample as the truth, subset as the sample).

However, this seems to apply only to the vcfeval engine. With xcmp, hap.py seems to report the same numbers in both tests.
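For anyone who wants to reproduce the comparison, here is a minimal sketch of how the TRUTH.TOTAL values from two runs could be diffed. It assumes the summary CSV has `Type`, `Filter` and `TRUTH.TOTAL` columns; the helper names and the inline CSV values are made up for illustration and are not real hap.py results.

```python
import csv
import io


def truth_totals(summary_csv_text):
    """Map (Type, Filter) -> TRUTH.TOTAL from the text of a summary CSV."""
    reader = csv.DictReader(io.StringIO(summary_csv_text))
    return {
        (row["Type"], row["Filter"]): int(row["TRUTH.TOTAL"])
        for row in reader
    }


def diff_truth_totals(text_a, text_b):
    """Return the rows whose TRUTH.TOTAL differs between two runs."""
    a, b = truth_totals(text_a), truth_totals(text_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Placeholder values, invented for illustration only (not real output).
RUN_FULL = """Type,Filter,TRUTH.TOTAL
INDEL,ALL,100
SNP,ALL,1000
"""
RUN_SUBSET = """Type,Filter,TRUTH.TOTAL
INDEL,ALL,100
SNP,ALL,990
"""

if __name__ == "__main__":
    for key, (full, subset) in diff_truth_totals(RUN_FULL, RUN_SUBSET).items():
        print(key, full, subset)
```

In practice you would read the two summary CSV files from disk and pass their contents to `diff_truth_totals`, once for the vcfeval runs and once for the xcmp runs.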