artic-network / fieldbioinformatics

The ARTIC field bioinformatics pipeline
MIT License

Custom bed file with "primer gaps" causes workflow to crash at artic-tools vcf_checker step #131

Closed DABAKER165 closed 7 months ago

DABAKER165 commented 7 months ago

We are using custom primers with a custom bed file. Not all of the resulting amplicons overlap, so the scheme contains a gap, even though the alternate primer is spaced far enough out to give an overlap. The artic-tools vcf_checker detects the gap, throws an error, and does not continue. The downstream effect is that no consensus can be made and everything is masked.

Bed file lines with a "gap":

MN908947.3 3367 3397 SARS-CoV-2_24_LEFT 2 + ACTGACAATGTATACATTAAAAATGCAGAC
MN908947.3 3399 3425 SARS-CoV-2_23_RIGHT 1 - TGTGGAAGAAGCTAAAAAGGTAAAAC
MN908947.3 3390 3412 SARS-CoV-2_23_RIGHT_alt1 1 - TGCAGACATTGTGGAAGAAGCT

The artic-tools vcf_checker throws an error in the 22.vcfcheck.log:

[15:10:12] [artic-tools::check_vcf] starting VCF checker
[15:10:12] [artic-tools::check_vcf] reading scheme
error--> gap found in primer scheme - 3390-3397
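For anyone trying to see where the 3390-3397 interval might come from, here is a rough Python sketch. It is not the artic-tools implementation; it only assumes (my guess, based on the numbers in the log) that alternate primers are merged into their canonical primer by taking the smallest start and largest end, and that a "gap" is reported when the primer-free insert of one amplicon ends before the insert of the next amplicon begins. The function names `merge_primers` and `find_gap` are hypothetical.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# The three bed lines quoted in this issue.
BED_LINES = """\
MN908947.3 3367 3397 SARS-CoV-2_24_LEFT 2 + ACTGACAATGTATACATTAAAAATGCAGAC
MN908947.3 3399 3425 SARS-CoV-2_23_RIGHT 1 - TGTGGAAGAAGCTAAAAAGGTAAAAC
MN908947.3 3390 3412 SARS-CoV-2_23_RIGHT_alt1 1 - TGCAGACATTGTGGAAGAAGCT
"""

# Matches "<anything>_<amplicon number>_<LEFT|RIGHT>[_altN]" at the end of a primer name.
NAME_RE = re.compile(r"_(\d+)_(LEFT|RIGHT)(?:_alt\d+)?$")

PrimerKey = Tuple[int, str]   # (amplicon number, "LEFT" or "RIGHT")
Span = Tuple[int, int]        # (start, end)


def merge_primers(lines: Iterable[str]) -> Dict[PrimerKey, Span]:
    """Merge alt primers into their canonical primer (assumed rule: min start, max end)."""
    merged: Dict[PrimerKey, Span] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        match = NAME_RE.search(name)
        if match is None:
            continue  # name does not follow the assumed convention
        key = (int(match.group(1)), match.group(2))
        if key in merged:
            old_start, old_end = merged[key]
            merged[key] = (min(old_start, start), max(old_end, end))
        else:
            merged[key] = (start, end)
    return merged


def find_gap(primers: Dict[PrimerKey, Span], amplicon: int) -> Optional[Span]:
    """Return the uncovered interval between amplicon N and N+1, or None.

    Assumed rule: the insert of amplicon N ends where its (merged) RIGHT primer
    starts, and the insert of amplicon N+1 begins where its LEFT primer ends.
    """
    right = primers.get((amplicon, "RIGHT"))
    next_left = primers.get((amplicon + 1, "LEFT"))
    if right is None or next_left is None:
        return None
    insert_end, next_insert_start = right[0], next_left[1]
    if next_insert_start > insert_end:
        return (insert_end, next_insert_start)
    return None


primers = merge_primers(BED_LINES.splitlines())
gap = find_gap(primers, 23)
print(gap)
```

Under those assumptions the sketch reports the same interval as the log for amplicons 23 and 24. Whether artic-tools actually computes it this way would need to be confirmed against its source.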

Here is the error log of the entire workflow:

Running: artic-tools check_vcf --summaryOut 20.vcfreport.txt 20.merged.vcf.gz ./SARS-CoV-2/qia/v1/SARS-CoV-2.scheme.bed 2> 20.vcfcheck.log
Command failed:artic-tools check_vcf --summaryOut 20.vcfreport.txt 20.merged.vcf.gz ./SARS-CoV-2/qia/v1/SARS-CoV-2.scheme.bed 2> 20.vcfcheck.log
Mapped/Unmapped/Short/Masked/Skipped(all matches masked): 40307/0/0/0/0
[23:00:11 - root] Processing region MN908947.3:0-29900
Mapped/Unmapped/Short/Masked/Skipped(all matches masked): 41086/0/0/0/0
[23:00:17 - root] Processing region MN908947.3:0-29900

Is there a way to skip this step, or to keep it from throwing an error? Bed files are allowed to have gaps in them.

BioWilko commented 7 months ago

check_vcf is only run if the --strict flag is set when running artic minion. If you don't set that flag, the bed file gap will be ignored.