Illumina / hap.py

Haplotype VCF comparison tools

CalledProcessError: With Octopus VCF #92

Open 24natasya opened 5 years ago

24natasya commented 5 years ago

Hi, I seem to have a problem running hap.py on Octopus VCF files, using either the legacy or the non-legacy file.

I tried using --no-leftshift --no-decompose --gender=none

but it does not work:

cts/2019_MLVarCaller/01_BAMs/hs37d5.fa -f 0 -n 16768 --expand-hapblocks 30 --window 50 --no-hapcmp 0 --qq QUAL' returned non-zero exit status 1
2019-08-23 02:11:07,800 ERROR ------------------------------------------------------------
2019-08-23 02:11:07,953 WARNING Filter 'q10,LBQ' is not in the VCF header. This will break VCF writing, at 11:114675927
2019-08-23 02:11:07,953 ERROR Exception when running <function xcmpWrapper at 0x7f7481530c80>:
2019-08-23 02:11:07,954 ERROR ------------------------------------------------------------
2019-08-23 02:11:07,954 ERROR Traceback (most recent call last):
2019-08-23 02:11:07,954 ERROR File "/opt/hap.py/lib/python27/Tools/parallel.py", line 72, in parMapper
2019-08-23 02:11:07,954 ERROR return arg[1]['fun'](arg[0], *arg[1]['args'], **arg[1]['kwargs'])
2019-08-23 02:11:07,954 ERROR File "/opt/hap.py/lib/python27/Haplo/xcmp.py", line 70, in xcmpWrapper
2019-08-23 02:11:07,954 ERROR subprocess.check_call(to_run, shell=True, stdout=tfo, stderr=tfe)
2019-08-23 02:11:07,954 ERROR File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
2019-08-23 02:11:07,955 ERROR raise CalledProcessError(retcode, cmd)
2019-08-23 02:11:07,955 ERROR CalledProcessError: Command 'xcmp /tmp/truth.ppt6vrQz.vcf.gz /tmp/query.pppjiYy9.vcf.gz -l 11:113424659-119192669 -o /tmp/result.11:113424659-1191926694qid1J.bcf -r /export/Projects/2019_MLVarCaller/01_BAMs/hs37d5.fa -f 0 -n 16768 --expand-hapblocks 30 --window 50 --no-hapcmp 0 --qq QUAL' returned non-zero exit status 1
2019-08-23 02:11:07,955 ERROR ------------------------------------------------------------
2019-08-23 02:11:07,994 WARNING Filter 'q10,LBQ' is not in the VCF header. This will break VCF writing, at 11:44333284
2019-08-23 02:11:07,994 ERROR Exception when running <function xcmpWrapper at 0x7f7481530c80>:
2019-08-23 02:11:07,994 ERROR ------------------------------------------------------------
2019-08-23 02:11:07,994 ERROR Traceback (most recent call last):
2019-08-23 02:11:07,994 ERROR File "/opt/hap.py/lib/python27/Tools/parallel.py", line 72, in parMapper
2019-08-23 02:11:07,995 ERROR return arg[1]['fun'](arg[0], *arg[1]['args'], **arg[1]['kwargs'])
2019-08-23 02:11:07,995 ERROR File "/opt/hap.py/lib/python27/Haplo/xcmp.py", line 70, in xcmpWrapper
2019-08-23 02:11:07,995 ERROR subprocess.check_call(to_run, shell=True, stdout=tfo, stderr=tfe)
2019-08-23 02:11:07,995 ERROR File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
2019-08-23 02:11:07,995 ERROR raise CalledProcessError(retcode, cmd)
2019-08-23 02:11:07,995 ERROR CalledProcessError: Command 'xcmp /tmp/truth.ppt6vrQz.vcf.gz /tmp/query.pppjiYy9.vcf.gz -l 11:44171557-54821230 -o /tmp/result.11:44171557-54821230Z5XuBO.bcf -r /export/Projects/

DBS-Max commented 4 years ago

same issue

Lenbok commented 4 years ago

@DBS-Max The error message looks to be complaining about filter value "q10,LBQ". Multiple filter values should be separated by semicolon, not comma. Check whether this malformed filter value is in your original input VCF (and if so, you can probably fix it with sed), or whether hap.py has introduced the problem itself.
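
To check the input VCF for this, something like the following might help. It is only a rough sketch: the function name `find_comma_filters` is made up for illustration and is not part of hap.py, and it assumes a plain tab-separated VCF in which FILTER is the seventh column.

```python
import gzip
import sys


def find_comma_filters(vcf_lines):
    """Yield (chrom, pos, filter_value) for records whose FILTER column contains a comma.

    Hypothetical helper for illustration; not part of hap.py.
    """
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue
        chrom, pos, filt = fields[0], fields[1], fields[6]
        if "," in filt:
            yield chrom, pos, filt


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    # Read bgzipped/gzipped or plain-text VCFs alike.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for chrom, pos, filt in find_comma_filters(handle):
            print(f"{chrom}:{pos}\t{filt}")
```

If this prints nothing for the original query VCF, the comma-separated value is probably coming from somewhere other than the FILTER column.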

oiiio commented 4 years ago

same issue, I don't see improper formatting in the FILTER column

Lenbok commented 4 years ago

If you attach or link to a small VCF that can reproduce the problem, there is some hope it will get fixed.

gkaur commented 4 years ago

I faced a similar issue before. For me, it was caused by:
a) extra alleles reported in the ALT field;
b) the 'AD' field description in the header sometimes not being 'Number=R';
c) fields used in the VCF file with no description in the header.
What was really useful for me while debugging was the '--verbose' option in hap.py: with it I could identify the exact command leading to the error (very helpful).
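
For point (c), a quick way to spot such fields might look like the sketch below. The function name `undeclared_fields` is hypothetical, and the parsing is deliberately simple: it assumes a tab-separated VCF whose header lines have the usual `##INFO=<ID=...` and `##FORMAT=<ID=...` shape.

```python
import re

# Matches the ID of an INFO or FORMAT declaration in the VCF header.
HEADER_ID = re.compile(r"^##(INFO|FORMAT)=<ID=([^,>]+)")


def undeclared_fields(vcf_lines):
    """Return sorted (kind, key) pairs used in records but missing from the header.

    Hypothetical helper for illustration; not part of hap.py or bcftools.
    """
    declared = {"INFO": set(), "FORMAT": set()}
    missing = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            match = HEADER_ID.match(line)
            if match:
                declared[match.group(1)].add(match.group(2))
            continue
        if line.startswith("#") or not line:
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            continue
        # INFO column: semicolon-separated KEY=VALUE pairs or bare flags.
        if fields[7] != ".":
            for entry in fields[7].split(";"):
                key = entry.split("=", 1)[0]
                if key and key not in declared["INFO"]:
                    missing.add(("INFO", key))
        # FORMAT column: colon-separated keys.
        if len(fields) > 8:
            for key in fields[8].split(":"):
                if key not in declared["FORMAT"]:
                    missing.add(("FORMAT", key))
    return sorted(missing)
```

An empty result only means that every INFO and FORMAT key is declared; it does not check whether the declared Number or Type is correct.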

For now, before giving the files to hap.py, I run the truth and query VCF files through the commands below. No issues since :)

zcat vcf.gz | \
  sed 's/AD,Number=./AD,Number=R/g' | \
  bcftools view -c 1 -g ^miss -e 'GT="0/0"' --trim-alt-alleles | \
  bgzip -c > edited_vcf.gz

The first part with 'sed' tries to fix the AD description in the header, if it is broken. The second part with 'bcftools' removes lines with a reference (0/0) genotype. The '--trim-alt-alleles' option also removes alleles not seen in the genotype fields from the ALT column.
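
The 'sed' step applies to every line and, because '.' matches any character, it rewrites any existing Number value for AD. A narrower variant that only touches the AD declaration in the header could be sketched in Python as below. The function name `fix_ad_header` is made up for illustration, and the sketch assumes the header line starts with `##FORMAT=<ID=AD,Number=`.

```python
import re

# Captures everything up to the Number value of the AD FORMAT declaration.
AD_HEADER = re.compile(r"^(##FORMAT=<ID=AD,Number=)[^,]+")


def fix_ad_header(vcf_lines):
    """Yield VCF lines, rewriting only the AD header declaration to Number=R.

    Hypothetical helper for illustration; record lines pass through unchanged.
    """
    for line in vcf_lines:
        if line.startswith("##FORMAT=<ID=AD,"):
            line = AD_HEADER.sub(r"\g<1>R", line)
        yield line
```

The output would still need to go through bgzip (and the bcftools step, if wanted) before being passed to hap.py.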

Hope this helps!

fsandron commented 3 years ago

Hello, I also had the same issue with octopus 0.7.4. For me, it is not an issue coming from the traditional FILTER field but from the sample level filter FT in the FORMAT field. As a workaround, I removed it with bcftools annotate before using hap.py.

bcftools annotate -x FMT/FT -o OC_noFT.vcf OC.vcf
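
Before stripping the field, it may be worth confirming that FT is really where the odd filter value comes from. The sketch below tallies the distinct per-sample FT values; the function name `sample_ft_values` is hypothetical, and it assumes a tab-separated VCF with FORMAT in the ninth column and samples from the tenth onward.

```python
from collections import Counter


def sample_ft_values(vcf_lines):
    """Count the distinct FT values found in the sample columns.

    Hypothetical helper for illustration; not part of hap.py, bcftools, or Octopus.
    """
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue
        keys = fields[8].split(":")
        if "FT" not in keys:
            continue
        ft_index = keys.index("FT")
        for sample in fields[9:]:
            values = sample.split(":")
            # Trailing FORMAT fields may be omitted for a sample.
            if ft_index < len(values):
                counts[values[ft_index]] += 1
    return counts
```

If a value such as 'q10,LBQ' shows up in the tally, that would be consistent with the warning in the original log.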