FrickTobias / BLR

calculate_haplotype_statistics fails with HapCUT2 1.3.2 #223

Closed by pontushojer 4 years ago

pontushojer commented 4 years ago

Error from tool:

SETTINGS FOR: calculate_haplotype_statistics (version: 0.1.2.dev58+gbce7b21.d20200723)
 vcf1: final.phased.vcf
 vcf2: ground_truth.phased.vcf
 indels: False
 output: final.phasing_stats.txt
calculate_haplotype_statistics - INFO: Starting analysis
Traceback (most recent call last):
  File "/Users/pontus.hojer/miniconda3/envs/blr-latest/bin/blr", line 33, in <module>
    sys.exit(load_entry_point('blr', 'console_scripts', 'blr')())
  File "/Users/pontus.hojer/projects/BLR-private/src/blr/__main__.py", line 45, in main
    module.main(args)
  File "/Users/pontus.hojer/projects/BLR-private/src/blr/cli/calculate_haplotype_statistics.py", line 22, in main
    stats = vcf_vcf_error_rate(args.vcf1, args.vcf2, args.indels)
  File "/Users/pontus.hojer/projects/BLR-private/src/blr/cli/calculate_haplotype_statistics.py", line 402, in vcf_vcf_error_rate
    chrom_a_blocklist = parse_vcf_phase(assembled_vcf_file, indels)
  File "/Users/pontus.hojer/projects/BLR-private/src/blr/cli/calculate_haplotype_statistics.py", line 61, in parse_vcf_phase
    assert (PS_index == i)

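This is caused by a change in the phased VCF format written by HapCUT2 1.3.2: non-phased entries now also carry the PS tag in the FORMAT column, with "." as the value in the sample column. Compare the same non-phased record from the two versions: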

HapCUT2 < 1.3.2

chrA    458 .   T   C   50  PASS    ... GT:DP:ADALL:AD:GQ   1/1:1052:0,347:28,510:99

HapCUT2 = 1.3.2

chrA    458 .   T   C   50  PASS    ... GT:DP:ADALL:AD:GQ:PS    1/1:1052:0,347:28,510:99:.
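One possible way to handle both formats is to look up PS by name in the FORMAT column and treat a "." value as unphased, rather than relying on the position of the tag. The sketch below is only an illustration under that assumption; `get_phase_set` is a hypothetical helper, not the existing `parse_vcf_phase` code, and the phased record in the last call is made up for the example.

```python
def get_phase_set(format_field, sample_field):
    """Return the PS value of a sample column, or None if the entry is unphased.

    Hypothetical helper for illustration; not part of BLR or HapCUT2.
    """
    keys = format_field.split(":")
    values = sample_field.split(":")

    # HapCUT2 < 1.3.2: non-phased entries have no PS key at all.
    if "PS" not in keys:
        return None

    ps_index = keys.index("PS")

    # Guard against a sample column that is shorter than the FORMAT column.
    if ps_index >= len(values):
        return None

    # HapCUT2 1.3.2: non-phased entries carry PS with the missing value ".".
    ps = values[ps_index]
    return None if ps == "." else ps


# The two non-phased records shown above both yield None.
old_style = get_phase_set("GT:DP:ADALL:AD:GQ", "1/1:1052:0,347:28,510:99")
new_style = get_phase_set("GT:DP:ADALL:AD:GQ:PS", "1/1:1052:0,347:28,510:99:.")

# Made-up phased record, only to show the non-missing case.
phased = get_phase_set("GT:PS", "0|1:458")
```

With a lookup like this, a record counts as phased only when a PS key is present and its value is not ".", which holds for the output of both HapCUT2 versions shown here.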