artic-network / fieldbioinformatics

The ARTIC field bioinformatics pipeline
MIT License

artic_vcf_merge Fails #119

Closed Rohit-Satyam closed 1 year ago

Rohit-Satyam commented 1 year ago

Dear Developers,

I was trying to run the artic pipeline with the command below, but I get an error at the artic_vcf_merge step. The command and the error are as follows:

artic minion --read-file results/01_read_filtering/005_barcode49.fastq.gz --scheme-directory tools/artic-ncov2019/primer_schemes --medaka --medaka-model r941_min_fast_g303 nCoV-2019/V4.1 0055219_barcode49 

The error is:

Running: artic_vcf_merge 005_barcode49 tools/artic-ncov2019/primer_schemes/nCoV-2019/V4.1/nCoV-2019.scheme.bed 2> 005_barcode49.primersitereport.txt 1:005_barcode49.1.vcf 2:005_barcode49.2.vcf
Command failed:artic_vcf_merge 005_barcode49 tools/artic-ncov2019/primer_schemes/nCoV-2019/V4.1/nCoV-2019.scheme.bed 2> 005_barcode49.primersitereport.txt 1:005_barcode49.1.vcf 2:005_barcode49.2.vcf

Version

artic 1.2.1
medaka 1.0.3

I also tried running artic_vcf_merge on its own, and I get the following error:

artic_vcf_merge 005_barcode49 tools/artic-ncov2019/primer_schemes/nCoV-2019/V4.1/nCoV-2019.scheme.bed 1:005_barcode49.1.vcf 2:005_barcode49.2.vcf 

Traceback (most recent call last):
  File "/home/subudhak/miniconda3/envs/artic/bin/artic_vcf_merge", line 10, in <module>
    sys.exit(main())
  File "/home/subudhak/miniconda3/envs/artic/lib/python3.6/site-packages/artic/vcf_merge.py", line 55, in main
    vcf_merge(args)
  File "/home/subudhak/miniconda3/envs/artic/lib/python3.6/site-packages/artic/vcf_merge.py", line 26, in vcf_merge
    vcf_reader.infos["Pool"] = vcf.parser._Format("Pool", 1, "String", "The pool name")
TypeError: __new__() missing 1 required positional argument: 'type_code'

##################################
artic_vcf_merge 005_barcode49 tools/artic-ncov2019/primer_schemes/nCoV-2019/V4.1/nCoV-2019.scheme.bed 005_barcode49.1.vcf 005_barcode49.2.vcf
Traceback (most recent call last):
  File "/home/subudhak/miniconda3/envs/artic/bin/artic_vcf_merge", line 10, in <module>
    sys.exit(main())
  File "/home/subudhak/miniconda3/envs/artic/lib/python3.6/site-packages/artic/vcf_merge.py", line 55, in main
    vcf_merge(args)
  File "/home/subudhak/miniconda3/envs/artic/lib/python3.6/site-packages/artic/vcf_merge.py", line 20, in vcf_merge
    pool_name, file_name = param.split(":")
ValueError: not enough values to unpack (expected 2, got 1)

BioWilko commented 1 year ago

Hi, I'm struggling to figure out exactly what is going on here. Would you be able to share the VCF files with me (005_barcode49.1.vcf and 005_barcode49.2.vcf), as well as 005_barcode49.primersitereport.txt?

BioWilko commented 1 year ago

I eventually tracked this bug down to a dependency chain (arrows mean "depends on"): artic -> medaka -> whatshap -> pyfaidx -> pyvcf3. pyvcf3 is a fork of pyvcf that is still maintained (unlike pyvcf). It installs under the same import name as pyvcf (import vcf), so in some circumstances it was imported instead of the explicitly required pyvcf.
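To check whether an environment is affected by this kind of name clash, one option is to ask the packaging metadata which installed distributions provide the top-level vcf module. The sketch below is only an illustration: providers_of and its mapping parameter are hypothetical names, and importlib.metadata.packages_distributions() needs Python 3.10 or newer, so it would not run in the Python 3.6 environment shown in the traceback.

```python
from importlib import metadata


def providers_of(module_name, mapping=None):
    """Return the sorted names of distributions that install `module_name`.

    `mapping` is a hypothetical override used for illustration and testing;
    by default the mapping is read from the current environment
    (requires Python 3.10+).
    """
    if mapping is None:
        mapping = metadata.packages_distributions()
    # A module name can map to several distributions when two packages
    # install files under the same top-level import name.
    return sorted(set(mapping.get(module_name, [])))
```

Calling providers_of("vcf") in the affected environment and getting more than one distribution name back would suggest that two packages are competing for the same import name.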

This went unnoticed, because the two packages behaved identically, until the following commit added the type_code argument to the _Format namedtuple object: https://github.com/dridk/PyVCF3/commit/ec232530978e6ec4717f3e2be833d684f94d0561. The call in vcf_merge.py passes only four arguments to _Format, which leads to the TypeError above. The error also only manifested when dependencies were installed in a particular order that led to pyvcf3 being imported over pyvcf.
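The failure mode can be reproduced without either library. The sketch below uses two stand-in namedtuples, FormatOld and FormatNew, which are hypothetical and not the real pyvcf or pyvcf3 definitions (the field names other than type_code, and their order, are assumptions). The call uses the same four arguments as the one in the traceback.

```python
from collections import namedtuple

# Hypothetical stand-ins, NOT the real library classes: the "old" tuple
# has four fields, the "new" one adds a required type_code field.
FormatOld = namedtuple("Format", ["id", "num", "type", "desc"])
FormatNew = namedtuple("Format", ["id", "num", "type", "desc", "type_code"])


def try_build(format_cls):
    """Build a Pool entry with four positional arguments.

    Returns the namedtuple on success, or the TypeError message as a
    string when the class requires more fields than were supplied.
    """
    try:
        return format_cls("Pool", 1, "String", "The pool name")
    except TypeError as exc:
        return str(exc)
```

Here try_build(FormatOld) returns a tuple, while try_build(FormatNew) returns a TypeError message that names the missing type_code argument, which matches the shape of the error reported above.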

Version 1.2.3 fixes this by pinning a version of pyfaidx that doesn't depend on pyvcf3 at all. The conda package recipe is currently awaiting review before going live.