nf-core / rnavar

gatk4 RNA variant calling pipeline
https://nf-co.re/rnavar
MIT License

ENSEMBLVEP fails after module update #84

Open bounlu opened 1 year ago

bounlu commented 1 year ago

Description of the bug

After the recent module update (bea3ca9), the ENSEMBLVEP process fails:

-[nf-core/rnavar] Pipeline completed with errors-
Error executing process > 'NFCORE_RNAVAR:RNAVAR:ANNOTATE:MERGE_ANNOTATE:ENSEMBLVEP (sample1)'

Caused by:
  Essential container in task exited

Command executed:

  mkdir sample1

  vep \
      -i sample1.gz \
      -o sample1.ann.vcf \
      --everything --filter_common --per_gene --total_length --offline \
      --assembly GRCh38 \
      --species homo_sapiens \
      --cache \
      --cache_version '108' \
      --dir_cache /.vep \
      --fork 6 \
      --vcf \
      --stats_file sample1.summary.html

  rm -rf sample1

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNAVAR:RNAVAR:ANNOTATE:MERGE_ANNOTATE:ENSEMBLVEP":
      ensemblvep: $( echo $(vep --help 2>&1) | sed 's/^.*Versions:.*ensembl-vep : //;s/ .*$//')
  END_VERSIONS

Command exit status:
  2

Command output:
  (empty)

Command error:
  -------------------- EXCEPTION --------------------
  MSG: ERROR: Cache directory /.vep/homo_sapiens not found
  STACK Bio::EnsEMBL::VEP::CacheDir::dir /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:305
  STACK Bio::EnsEMBL::VEP::CacheDir::init /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:219
  STACK Bio::EnsEMBL::VEP::CacheDir::new /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/CacheDir.pm:111
  STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all_from_cache /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:116
  STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:92
  STACK Bio::EnsEMBL::VEP::BaseRunner::get_all_AnnotationSources /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/BaseRunner.pm:170
  STACK Bio::EnsEMBL::VEP::Runner::init /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/Runner.pm:128
  STACK Bio::EnsEMBL::VEP::Runner::run /usr/local/share/ensembl-vep-108.2-0/modules/Bio/EnsEMBL/VEP/Runner.pm:199
  STACK toplevel /usr/local/bin/vep:240
  Date (localtime)    = Wed Feb  8 02:58:15 2023
  Ensembl API version = 108
  ---------------------------------------------------

Work dir:
  s3://omeran/nextflow/rnavar/work/3c/49923e4312cb7325c865ed2d9c9218

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Command used and terminal output

#!/bin/bash

nextflow run nf-core/rnavar \
-latest \
-profile docker \
--genome GRCh38 \
--read_length 150 \
--fasta 's3://reference/rnaseq-genome/combined.fa' \
--gtf 's3://reference/rnaseq-genome/combined.gtf' \
--star_index 's3://reference/rnaseq-genome/STAR_100/' \
--dbsnp 's3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/dbsnp_146.hg38.vcf.gz' \
--dbsnp_tbi 's3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/dbsnp_146.hg38.vcf.gz.tbi' \
--known_indels 's3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz' \
--known_indels_tbi 's3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi' \
--annotate_tools merge \
--snpeff_db 'GRCh38.99' \
--vep_species 'homo_sapiens' \
--vep_genome 'GRCh38' \
--vep_cache_version \'108\' \
--max_cpus 512 \
--max_memory '2048.GB' \
--input 'samplesheet_rnaseq.csv' \
--outdir 's3://nextflow/rnavar/results/' \
-bucket-dir 's3://nextflow/rnavar/work/' \
-work-dir 's3://nextflow/rnavar/work/' \
-c 'custom.config' \
-r dev \
-resume

Relevant files

No response

System information

  N E X T F L O W
  version 22.10.6 build 5843
  created 23-01-2023 23:20 UTC (24-01-2023 07:20 SGST)
  cite doi:10.1038/nbt.3820
  http://nextflow.io
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:    22.04
Codename:   jammy

maxulysse commented 1 year ago

Hi @bounlu, this pipeline is due for some refactoring. I'm updating the ENSEMBLVEP/SNPEFF modules and subworkflows in Sarek. Once that's done, I'll refactor that part here as well.

bounlu commented 1 year ago

Hi @maxulysse

Yes, updating the container version from 104.3 to 108.2, either in the line below or in a custom config file, fixed it:

        container   = { params.genome ? "nfcore/vep:104.3.${params.genome}" : "nfcore/vep:104.3.${params.vep_genome}" }

I believe the cache version should not be hardcoded.
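
For anyone hitting the same error, the custom-config workaround could look roughly like the sketch below. It assumes the process can still be selected with `withName: 'ENSEMBLVEP'` and that the `108.2` image tags follow the same naming pattern as the `104.3` ones quoted above. Neither is confirmed here, so check both against your pipeline revision.

```groovy
// custom.config (passed with `-c custom.config`)
// Sketch only: the selector name and the 108.2 tag pattern are assumptions
// based on the container line quoted in this issue.
process {
    withName: 'ENSEMBLVEP' {
        // Same expression as the pipeline's current line, with 104.3 replaced by 108.2
        container = { params.genome ? "nfcore/vep:108.2.${params.genome}" : "nfcore/vep:108.2.${params.vep_genome}" }
    }
}
```

The version is still written out by hand here, so it has to be kept in step with `--vep_cache_version` manually.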