TRON-Bioinformatics / covigator-ngs-pipeline

A Nextflow pipeline for NGS variant calling on SARS-CoV-2. From FASTQ files to normalized and annotated VCF files from GATK, BCFtools, LoFreq and iVar.
MIT License

VCF isn't created when running pipeline #1

Closed: mokrobial closed this issue 2 years ago

mokrobial commented 2 years ago

Running covigator-ngs-pipeline with two FASTQ files returns an error message and does not create a VCF file. It does create these 5 files as expected:

- 4014963632_S9.coverage.tsv
- 4014963632_S9.deduplication_metrics.txt
- 4014963632_S9.depth.tsv
- 4014963632_S9.fastp_stats.html
- 4014963632_S9.fastp_stats.json

0. All files were added to the covigator-ngs-pipeline directory created by the git clone (the reference files NC_045512.fa, NC_045512.sorted.gff3, and NC_045512.dict, plus the index files previously generated with bwa and samtools).

Steps to reproduce the behavior:

1. Clone the repository, initialize the conda environment, and activate it:

```
$ git clone https://github.com/TRON-Bioinformatics/covigator-ngs-pipeline.git
$ cd covigator-ngs-pipeline
$ nextflow main.nf -profile conda --initialize
$ conda info --envs
$ conda activate /Users/mokrobial/covigator-ngs-pipeline/work/conda/covigator-pipeline-7bbe89a37e1b4f3ad6e1a1a7fb6a53e5
```

2. Re-ran the indexing commands for the reference FASTA just in case:

```
$ bwa index NC_045512.fa
$ samtools faidx NC_045512.fa
```

3. Ran the pipeline:

```
$ nextflow run tron-bioinformatics/covigator-ngs-pipeline \
    --fastq1 4014963632_S9_L001_R1_001.fastq \
    --fastq2 4014963632_S9_L001_R2_001.fastq \
    --name 4014963632_S9 \
    --output covigator-output \
    --reference NC_045512.fa \
    --gff NC_045512.sorted.gff3
```

Result: the error message shown below, even though the NC_045512.fa.fai file is present in the directory.

Error executing process > 'variantNormalization (4014963632_S9)'

Caused by:
  Process `variantNormalization (4014963632_S9)` terminated with an error exit status (255)

Command executed:

```
# initial sort of the VCF
bcftools sort 4014963632_S9.bcftools.bcf |
# checks reference genome, decompose multiallelics, trim and left align indels
bcftools norm --multiallelics -any --check-ref e --fasta-ref NC_045512.fa --old-rec-tag OLD_CLUMPED - |
# decompose complex variants
vt decompose_blocksub -a -p - |
# remove duplicates after normalisation
bcftools norm --rm-dup exact -o 4014963632_S9.bcftools.normalized.vcf -
```

Command exit status: 255

Command output: (empty)

Command error:

```
Writing to /tmp/bcftools-sort.zWnXi8
Merging 1 temporary files
Cleaning
Done
[E::fai_build3_core] Failed to open the file NC_045512.fa
Failed to load the fai index: NC_045512.fa
decompose_blocksub v0.5

options:     input VCF file              -
         [o] output VCF file            -
         [a] align/aggressive mode      true
         [p] output phased genotypes    true

[bcf_ordered_reader.cpp:71 BCFOrderedReader] Not a VCF/BCF file: -
Failed to read from standard input: unknown file type
```
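What the log shows: `bcftools norm` could not open NC_045512.fa from inside the task's work directory, so nothing was written to the pipe, and the downstream `vt decompose_blocksub` then saw an empty stream, hence "Not a VCF/BCF file: -". A minimal sketch of how one might inspect the failing task in place; the work-directory hash below is a placeholder, use the path Nextflow prints for the failed process:

```
# change into the failing task's work directory (hash path printed by Nextflow)
$ cd work/xx/xxxxxxxx
# check whether the reference FASTA and its .fai index were actually staged here
$ ls -l NC_045512.fa NC_045512.fa.fai
# re-run the task script in place to reproduce the error interactively
$ bash .command.sh
```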

Environment:

priesgo commented 2 years ago

Hi @mokrobial, thanks for the detailed bug report. I will try to get your environment to work.

I suggest you first try using the default reference genome and annotations; that would allow us to skip steps 0, 1 and 2.

Then, in step 3, you are missing the parameter `-profile conda`. You will need to run:

```
$ nextflow run tron-bioinformatics/covigator-ngs-pipeline -profile conda \
    --fastq1 4014963632_S9_L001_R1_001.fastq \
    --fastq2 4014963632_S9_L001_R2_001.fastq \
    --name 4014963632_S9 \
    --output covigator-output
```

If this works, then to use a different reference you can store those files wherever you want (there is no need to put them in any particular folder), but refer to them by their absolute paths.
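For illustration, a sketch of such a call; the /Users/mokrobial/refs and /Users/mokrobial/data locations below are hypothetical placeholders, not required paths:

```
$ nextflow run tron-bioinformatics/covigator-ngs-pipeline -profile conda \
    --fastq1 /Users/mokrobial/data/4014963632_S9_L001_R1_001.fastq \
    --fastq2 /Users/mokrobial/data/4014963632_S9_L001_R2_001.fastq \
    --name 4014963632_S9 \
    --output covigator-output \
    --reference /Users/mokrobial/refs/NC_045512.fa \
    --gff /Users/mokrobial/refs/NC_045512.sorted.gff3
```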

As you can see, you do not need to clone the repository; Nextflow does that for you behind the scenes.
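To make that concrete: assuming a default setup, Nextflow caches remote pipelines under ~/.nextflow/assets, and you can fetch or update the cached copy explicitly:

```
# fetch/update the cached clone of the pipeline
$ nextflow pull tron-bioinformatics/covigator-ngs-pipeline
# inspect where the cached clone lives
$ ls ~/.nextflow/assets/tron-bioinformatics/covigator-ngs-pipeline
```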

mokrobial commented 2 years ago

Awesome! It works as expected :)

And just in case anyone else runs into this: I initially had an issue with the conda environment (miniconda3 vs /opt/anaconda3). I was in a conda env I had created under miniconda3 and could not run the pipeline at all, since Nextflow was installed under /opt (cloning the repository was my workaround).
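For anyone debugging a similar mix-up, a quick sanity check of which installations the shell actually resolves (nothing pipeline-specific assumed here):

```
# show which conda and nextflow binaries are on PATH
$ which conda nextflow
# list all environments known to the active conda installation
$ conda info --envs
```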

I appreciate the troubleshooting help!

priesgo commented 2 years ago

Great! Let us know if you stumble upon anything else unexpected. Any other suggestions to improve the pipeline are welcome.

Closing this issue now.