Hi @vibansal, I have a question about the `calculate_haplotype_statistics.py` script. I noticed that the `phased count` and `num snps max blk` values reported by the script differ from those in the BLOCK headers of my .hap file. For instance, if I sum the total number of phased SNVs and check the number of SNVs in the largest block in the .hap file, I get slightly different counts compared to the script output.
If I sum the `phased` field for all blocks, I get 189701. My largest block header is as follows:
However, the output from `calculate_haplotype_statistics.py` gives the following numbers with `-i` on:

I wonder if there is some kind of filter implemented in the script that causes this?
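For reference, here is a minimal sketch of how a tally like the one described above can be made from the BLOCK headers. It assumes header lines of the HapCUT2 form `BLOCK: offset: ... len: ... phased: ... SPAN: ... fragments ...`; the function name `tally_block_headers` is hypothetical, and treating the "largest block" as the one with the highest `phased` value is also an assumption, not necessarily how the script defines it.

```python
import re
from typing import Iterable, Tuple

# Assumed header layout (HapCUT2-style), for example:
#   BLOCK: offset: 6 len: 4 phased: 2 SPAN: 23 fragments 1
PHASED_RE = re.compile(r"^BLOCK:.*?\bphased:\s*(\d+)")


def tally_block_headers(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sum of phased counts, largest phased count) over BLOCK headers."""
    total = 0
    largest = 0
    for line in lines:
        match = PHASED_RE.match(line)
        if match is None:
            continue  # skip variant rows and block separator lines
        phased = int(match.group(1))
        total += phased
        largest = max(largest, phased)
    return total, largest


# Usage (the file name is a placeholder):
#   with open("sample.hap") as handle:
#       print(tally_block_headers(handle))
```

This only reads the header fields, so it counts every SNV that a header reports as phased, regardless of what is in the individual variant rows.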
Best, Mikhail