Illumina / hap.py

Haplotype VCF comparison tools

Incorrect number of FORMAT/AD values on scmp-distance engine #181

Open jerrywongzy opened 10 months ago

jerrywongzy commented 10 months ago

Hi there, I am running into an issue while trying to run hap.py using the scmp-distance engine. The relevant output is below:

2023-09-05 06:43:47,418 INFO     bcftools index -f -t /tmp/query.ppYfi9iI.vcf.gz
2023-09-05 06:43:50,097 INFO     preprocess for /tmp/query.ppYfi9iI.vcf.gz -- time taken 479.19
2023-09-05 06:43:50,098 INFO     bcftools merge --force-samples /tmp/truth.pp5XHAG2.vcf.gz /tmp/query.ppYfi9iI.vcf.gz -o /tmp/tmp45liHG
2023-09-05 06:43:50,289 ERROR    Exception when running scmp: Command line bcftools merge --force-samples /tmp/truth.pp5XHAG2.vcf.gz /tmp/query.ppYfi9iI.vcf.gz -o /tmp/tmp45liHG got return code 255.
STDOUT:
STDERR: Incorrect number of FORMAT/AD values at chr1:13613, cannot merge. The tag is defined as Number=A, but found
2 values and 2 alleles. See also http://samtools.github.io/bcftools/howtos/FAQ.html#incorrect-nfields

2023-09-05 06:43:50,289 ERROR    ------------------------------------------------------------
2023-09-05 06:43:50,289 ERROR    Traceback (most recent call last):
2023-09-05 06:43:50,289 ERROR      File "/home/ubuntu/.conda/envs/happy/lib/python27/Haplo/scmp.py", line 45, in runSCmp
2023-09-05 06:43:50,289 ERROR        runBcftools(*vargs)
2023-09-05 06:43:50,289 ERROR      File "/home/ubuntu/.conda/envs/happy/lib/python27/Tools/bcftools.py", line 64, in runBcftools
2023-09-05 06:43:50,289 ERROR        return runShellCommand('bcftools', *args)
2023-09-05 06:43:50,290 ERROR      File "/home/ubuntu/.conda/envs/happy/lib/python27/Tools/bcftools.py", line 52, in runShellCommand
2023-09-05 06:43:50,290 ERROR        raise Exception("Command line {} got return code {}.\nSTDOUT: {}\nSTDERR: {}".format(cmd_line, return_code, stdout, stderr))
2023-09-05 06:43:50,290 ERROR    Exception: Command line bcftools merge --force-samples /tmp/truth.pp5XHAG2.vcf.gz /tmp/query.ppYfi9iI.vcf.gz -o /tmp/tmp45liHG got return code 255.STDOUT: STDERR: Incorrect number of FORMAT/AD values at chr1:13613, cannot merge. The tag is defined as Number=A, but found2 values and 2 alleles. See also http://samtools.github.io/bcftools/howtos/FAQ.html#incorrect-nfields
2023-09-05 06:43:50,290 ERROR    ------------------------------------------------------------
2023-09-05 06:43:50,332 ERROR    Command line bcftools merge --force-samples /tmp/truth.pp5XHAG2.vcf.gz /tmp/query.ppYfi9iI.vcf.gz -o /tmp/tmp45liHG got return code 255.
STDOUT:
STDERR: Incorrect number of FORMAT/AD values at chr1:13613, cannot merge. The tag is defined as Number=A, but found
2 values and 2 alleles. See also http://samtools.github.io/bcftools/howtos/FAQ.html#incorrect-nfields

2023-09-05 06:43:50,332 ERROR    Traceback (most recent call last):
2023-09-05 06:43:50,332 ERROR      File "/home/ubuntu/.conda/envs/happy/bin/hap.py", line 540, in <module>
2023-09-05 06:43:50,332 ERROR        main()
2023-09-05 06:43:50,332 ERROR      File "/home/ubuntu/.conda/envs/happy/bin/hap.py", line 488, in main
2023-09-05 06:43:50,332 ERROR        tempfiles += Haplo.scmp.runSCmp(args.vcf1, args.vcf2, output_name, args)
2023-09-05 06:43:50,332 ERROR      File "/home/ubuntu/.conda/envs/happy/lib/python27/Haplo/scmp.py", line 45, in runSCmp
2023-09-05 06:43:50,332 ERROR        runBcftools(*vargs)
2023-09-05 06:43:50,333 ERROR      File "/home/ubuntu/.conda/envs/happy/lib/python27/Tools/bcftools.py", line 64, in runBcftools
2023-09-05 06:43:50,333 ERROR        return runShellCommand('bcftools', *args)
2023-09-05 06:43:50,333 ERROR      File "/home/ubuntu/.conda/envs/happy/lib/python27/Tools/bcftools.py", line 52, in runShellCommand
2023-09-05 06:43:50,333 ERROR        raise Exception("Command line {} got return code {}.\nSTDOUT: {}\nSTDERR: {}".format(cmd_line, return_code, stdout, stderr))
2023-09-05 06:43:50,333 ERROR    Exception: Command line bcftools merge --force-samples /tmp/truth.pp5XHAG2.vcf.gz /tmp/query.ppYfi9iI.vcf.gz -o /tmp/tmp45liHG got return code 255.STDOUT: STDERR: Incorrect number of FORMAT/AD values at chr1:13613, cannot merge. The tag is defined as Number=A, but found2 values and 2 alleles. See also http://samtools.github.io/bcftools/howtos/FAQ.html#incorrect-nfields

I have tried removing the AD and GQ fields from both my query VCF and my truth VCF. I have also tried declaring AD as both "Number=R" and "Number=." in its field definition. However, none of these steps worked, and it seems that it's the intermediate files (the preprocessed /tmp/truth.pp*.vcf.gz and /tmp/query.pp*.vcf.gz passed to bcftools merge) that need to be fixed. Does anyone know of a workaround for this?
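
For anyone trying to reproduce this: the error reports 2 AD values at a site with 2 alleles (REF plus one ALT), which is what Number=R would expect, while the header seen by bcftools merge declares Number=A (one value per ALT allele). Below is a minimal diagnostic sketch, not a fix, for listing records whose AD count disagrees with the header declaration. It assumes the VCF has been decompressed to plain text lines, that the header line starts with `##FORMAT=<ID=AD,Number=...`, and that samples are diploid for Number=G. The function names `expected_count` and `find_tag_mismatches` are made up for illustration and are not part of hap.py or bcftools.

```python
import re


def expected_count(number, n_alt):
    """Values expected per sample for a VCF Number code; None means unconstrained."""
    if number == "A":   # one value per ALT allele
        return n_alt
    if number == "R":   # one value per allele, REF included
        return n_alt + 1
    if number == "G":   # one value per genotype (assumes diploid samples)
        n_alleles = n_alt + 1
        return n_alleles * (n_alleles + 1) // 2
    if number == ".":   # unknown / variable length
        return None
    return int(number)  # fixed count


def find_tag_mismatches(lines, tag="AD"):
    """Return (chrom, pos, declared_number, values_found, n_alleles) per bad record."""
    header_re = re.compile(r"^##FORMAT=<ID=%s,Number=([^,>]+)" % re.escape(tag))
    number = None
    mismatches = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = header_re.match(line)
            if match:
                number = match.group(1)
            continue
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 10 or number is None:
            continue  # no sample columns, or tag not declared in the header
        keys = fields[8].split(":")
        if tag not in keys:
            continue
        idx = keys.index(tag)
        n_alt = 0 if fields[4] == "." else len(fields[4].split(","))
        want = expected_count(number, n_alt)
        if want is None:
            continue
        for sample in fields[9:]:
            values = sample.split(":")
            if idx >= len(values) or values[idx] == ".":
                continue  # tag missing for this sample
            found = len(values[idx].split(","))
            if found != want:
                mismatches.append((fields[0], int(fields[1]), number, found, n_alt + 1))
                break  # report each record once
    return mismatches
```

Running `find_tag_mismatches(open("query.pp.vcf"))` on a decompressed copy of one of the intermediate files (the file name here is a placeholder) would show whether the mismatch comes from the header declaration or from the per-record values.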