Illumina / hap.py

Haplotype VCF comparison tools

Query.FP does not equal het + homalt or BD=FP #133

Open sr-bentley opened 3 years ago

sr-bentley commented 3 years ago

Hi all,

This may be a silly question, but I could not find the reasoning here or in the benchmarking paper.

I've run hap.py with high-confidence regions, using:

singularity exec --contain --bind /tmp:/tmp,${truth_path}:/data,${samplepath}:/sample_data,${ref_path}:/ref,${output_path}:/output docker://pkrusche/hap.py:v0.3.9 /opt/hap.py/bin/hap.py /data/HG001_GRCh38.vcf.gz /sample_data/${samplename} -f /data/HG001_GRCh38_highconf.bed -r /ref/genome.fa --target-regions /data/HG001_GRCh38_highconf.bed -o /output/happy/${prefix} --write-vcf --write-counts --gender=female --verbose

The relevant columns of ${prefix}.extended.csv are:

Type  Subtype  Subset        Filter  FP.gt  FP.al  QUERY.FP  QUERY.FP.het  QUERY.FP.homalt
SNP   *        TS_contained  ALL     2871   647    11139     9840          1295

First, why does the total FP count (QUERY.FP) not equal QUERY.FP.het + QUERY.FP.homalt? Are there other FPs in the mix? I've checked for hetalt calls in the VCF, but there were none.
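To make the size of the gap explicit, here is a minimal Python sketch of the arithmetic. It uses only the three counts from the SNP / TS_contained / ALL row quoted above; the variable names are mine and are not hap.py identifiers.

```python
# Counts copied from the SNP / TS_contained / ALL row of the extended.csv excerpt.
query_fp = 11139   # QUERY.FP
fp_het = 9840      # QUERY.FP.het
fp_homalt = 1295   # QUERY.FP.homalt

# FPs that are neither counted as het nor as homalt.
het_plus_homalt = fp_het + fp_homalt
unexplained = query_fp - het_plus_homalt

print(f"het + homalt = {het_plus_homalt}")        # het + homalt = 11135
print(f"QUERY.FP - (het + homalt) = {unexplained}")  # QUERY.FP - (het + homalt) = 4
```

So four of the QUERY.FP calls are not covered by the het and homalt columns.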

Then when assessing FPs in the VCF file using the following command:

bcftools view -H -i 'FMT/BD="FP" & FMT/BVT="SNP" & INFO/Regions="CONF,TS_contained"' -s QUERY ${prefix}.vcf.gz | wc -l

I get 11168, which is 29 more than QUERY.FP (11139), so now there are even more unaccounted-for FPs.
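One way to narrow this down might be to tally the QUERY genotypes of the FP SNP records, to see whether anything other than het or homalt shows up. The following is only a sketch under assumptions: it assumes the output VCF has a QUERY sample column and that GT, BD and BVT are FORMAT keys (as the bcftools filter above implies), and it does not apply the INFO/Regions filter. The function names `classify_gt` and `tally_fp_genotypes` and the genotype class labels are hypothetical, not part of hap.py.

```python
import re
from collections import Counter


def classify_gt(gt):
    """Map a GT string such as '0/1' or '1|1' to a coarse genotype class."""
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:
        return "nocall_or_halfcall"   # '.', './.', './1', ...
    if len(alleles) == 1:
        return "haploid"              # single-allele call such as '1'
    distinct = set(alleles)
    if len(distinct) == 1:
        return "homref" if "0" in distinct else "homalt"
    return "het" if "0" in distinct else "hetalt"


def tally_fp_genotypes(lines, sample="QUERY"):
    """Count genotype classes of records where the sample has BD=FP and BVT=SNP.

    `lines` is any iterable of uncompressed VCF text lines, header included.
    """
    counts = Counter()
    col = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            col = fields.index(sample)   # locate the sample column by name
            continue
        if col is None:
            raise ValueError("no #CHROM header line seen before records")
        keys = fields[8].split(":")      # FORMAT column
        values = dict(zip(keys, fields[col].split(":")))
        if values.get("BD") == "FP" and values.get("BVT") == "SNP":
            counts[classify_gt(values.get("GT", "."))] += 1
    return counts
```

It could be run on the hap.py output by opening the file with `gzip.open(path, "rt")` and passing the handle to `tally_fp_genotypes`; the returned counter would then show how many FP records fall into each genotype class.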

Thanks for your help