A Nextflow pipeline for NGS variant calling on SARS-CoV-2. From FASTQ files to normalized and annotated VCF files from GATK, BCFtools, LoFreq and iVar.
When running the pipeline with the default references, phasing was skipped because the default GFF was not properly initialized. Using a custom GFF worked.
As reported in #36, when two overlapping indels are phased together, the resulting mutation representation is wrong. The reported case had two overlapping deletions, one with a relatively high VAF (0.6) and a second one with a low VAF; these two should never have been phased in the first place. Now, when the annotation vafator_af is available, only mutations with FILTER=PASS and VAF >= 0.8 are phased, whereas before only FILTER=PASS was taken into account. Even so, should such a situation arise, two indels overlapping the same amino acid are never merged and a warning message is written.
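The phasing rule described above can be sketched roughly as follows. This is an illustration only, not the pipeline's actual code: the `Variant` class, the function names and the fallback when `vafator_af` is missing are assumptions made for the example. Only the FILTER=PASS requirement, the 0.8 threshold and the "never merge two indels" rule come from the notes above.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("phasing_sketch")

# Threshold stated in the release notes; the constant name is hypothetical.
MIN_PHASING_VAF = 0.8


@dataclass
class Variant:
    """Hypothetical, minimal stand-in for a VCF record."""
    pos: int
    ref: str
    alt: str
    filter_status: str = "PASS"
    vaf: Optional[float] = None  # value of vafator_af, None if not annotated

    @property
    def is_indel(self) -> bool:
        return len(self.ref) != len(self.alt)


def is_phasing_candidate(variant: Variant) -> bool:
    """Return True if the variant may take part in phasing."""
    if variant.filter_status != "PASS":
        return False
    if variant.vaf is None:
        # Assumption: without vafator_af only FILTER=PASS is checked,
        # which matches the behaviour described for earlier versions.
        return True
    return variant.vaf >= MIN_PHASING_VAF


def can_merge_in_same_codon(first: Variant, second: Variant) -> bool:
    """Decide whether two variants hitting the same amino acid may be merged.

    Assumes the caller has already established that both variants
    affect the same amino acid.
    """
    if first.is_indel and second.is_indel:
        logger.warning(
            "Not merging indels at positions %d and %d: both overlap "
            "the same amino acid", first.pos, second.pos
        )
        return False
    return True
```

With the case reported in #36, a deletion at VAF 0.6 now fails `is_phasing_candidate`, so the two overlapping deletions are no longer considered together. The indel check in `can_merge_in_same_codon` is a second safeguard for pairs that both pass the threshold.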
Also, took care of several small things:
FASTA derived from VCFs is now in the output
Python unit test failures are now captured in CI #34
VCF normalization runs before bcftools consensus #33
A couple of issues appeared with phasing.