bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Error running VEP #3540

Closed. DolapoA closed this issue 3 years ago.

DolapoA commented 3 years ago

Version info

To Reproduce: exact bcbio command you have used:

bcbio_nextgen.py ../config/vep_test.yaml -n 4

Your sample configuration file:

details:
- algorithm:
    aligner: false
    background: /SAN/colcc/sarcoma_bloodcancer_raw/archive/Vortex/10_Vortex_trial/Targeted_sequencing/Fastq/combined/config/panel_of_normals/pon.vcf.gz
    effects: vep
    ensemble:
      numpass: 2
      use_filtered: true
    exclude_regions:
    - lcr
    - polyx
    - altcontigs
    - highdepth
    min_allele_fraction: 15
    tools_on:
    - damage_filter
    variant_regions: /SAN/colcc/sarcoma_bloodcancer_raw/archive/Vortex/10_Vortex_trial/Targeted_sequencing/Fastq/vep_test/config/Vortex_3167021_Covered.bed
    variantcaller: mutect2
  analysis: variant2
  description: VOR100_1_FFPE_S39
  files:
  - /SAN/colcc/sarcoma_bloodcancer_raw/archive/Vortex/10_Vortex_trial/Targeted_sequencing/Fastq/vep_test/input/bams/VOR100_1_FFPE_S39-sort-recal.bam
  genome_build: hg19
  metadata:
    batch: VOR100
    phenotype: tumor
    samplename: VOR100_1
- algorithm:
    aligner: false
    background: /SAN/colcc/sarcoma_bloodcancer_raw/archive/Vortex/10_Vortex_trial/Targeted_sequencing/Fastq/combined/config/panel_of_normals/pon.vcf.gz
    effects: vep
    ensemble:
      numpass: 2
      use_filtered: true
    exclude_regions:
    - lcr
    - polyx
    - altcontigs
    - highdepth
    min_allele_fraction: 15
    tools_on:
    - damage_filter
    variant_regions: /SAN/colcc/sarcoma_bloodcancer_raw/archive/Vortex/10_Vortex_trial/Targeted_sequencing/Fastq/vep_test/config/Vortex_3167021_Covered.bed
    variantcaller: mutect2
  analysis: variant2
  description: VOR100_2_S39
  files:
  - /SAN/colcc/sarcoma_bloodcancer_raw/archive/Vortex/10_Vortex_trial/Targeted_sequencing/Fastq/vep_test/input/bams/VOR100_2_S39-sort-recal.bam
  genome_build: hg19
  metadata:
    batch: VOR100
    phenotype: normal
    samplename: VOR100_1
fc_name: vep_test
upload:
  dir: ../final

Observed behavior: error message or bcbio output:

Traceback (most recent call last):
  File "/SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; unset PERL5LIB && export PATH=/SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/bin:$
Possible precedence issue with control flow operator at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/lib/site_perl/5.26.2/Bio/DB/Indexed$
Smartmatch is experimental at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/de_novo_donor.pl line 175.
Smartmatch is experimental at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/de_novo_donor.pl line 214.
Smartmatch is experimental at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/splice_site_scan.pl line 191.
Smartmatch is experimental at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/splice_site_scan.pl line 194.
Smartmatch is experimental at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/splice_site_scan.pl line 238.
Smartmatch is experimental at /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/splice_site_scan.pl line 241.
-------------------- EXCEPTION --------------------
MSG: ERROR: No cache found for homo_sapiens_merged, version 100
STACK Bio::EnsEMBL::VEP::CacheDir::dir /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/modules/Bio/EnsEMBL/VEP/Ca$
STACK Bio::EnsEMBL::VEP::CacheDir::init /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/modules/Bio/EnsEMBL/VEP/C$
STACK Bio::EnsEMBL::VEP::CacheDir::new /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/modules/Bio/EnsEMBL/VEP/Ca$
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all_from_cache /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.$
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/modules$
STACK Bio::EnsEMBL::VEP::BaseRunner::get_all_AnnotationSources /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/mo$
STACK Bio::EnsEMBL::VEP::Runner::init /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/modules/Bio/EnsEMBL/VEP/Run$
STACK Bio::EnsEMBL::VEP::Runner::run /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/share/ensembl-vep-100.4-0/modules/Bio/EnsEMBL/VEP/Runn$
STACK toplevel /SAN/colcc/pillaylab-software/bcbio-pipeline/anaconda/bin/vep:227
Date (localtime)    = Thu Oct 14 15:00:58 2021
Ensembl API version = 100
---------------------------------------------------
' returned non-zero exit status 25.

Expected behavior: annotation of VCFs using VEP.

naumenko-sa commented 3 years ago

Hi @DolapoA

VEP requires an additional data installation. You can install it with:

bcbio_nextgen.py upgrade -u skip --datatarget vep --genomes hg19

That installs the VEP cache into bcbio/genomes/Hsapiens/hg19/vep. Let me know if that works for you.

In our installation the hg19 cache is just linked from GRCh37/vep.

Sergey
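
A quick way to confirm that the cache landed where VEP looks for it is to list what is under the vep directory. The sketch below is not part of bcbio; `find_vep_caches` and `has_cache` are hypothetical helper names, and it assumes the usual VEP cache layout of `<vep_dir>/<species>/<version>_<assembly>/` (for example `homo_sapiens_merged/100_GRCh37`). Because `os.path.isdir` follows symlinks, a linked hg19 cache like the one described above should be detected as well.

```python
import os


def find_vep_caches(vep_dir):
    """List 'species/version_assembly' entries found under vep_dir.

    Assumes the layout <vep_dir>/<species>/<version>_<assembly>/.
    Returns an empty list if vep_dir does not exist.
    """
    found = []
    if not os.path.isdir(vep_dir):
        return found
    for species in sorted(os.listdir(vep_dir)):
        species_dir = os.path.join(vep_dir, species)
        if not os.path.isdir(species_dir):
            continue
        for release in sorted(os.listdir(species_dir)):
            # isdir follows symlinks, so linked caches are counted too
            if os.path.isdir(os.path.join(species_dir, release)):
                found.append(f"{species}/{release}")
    return found


def has_cache(vep_dir, species, version):
    """Return True if a cache for this species and version exists."""
    prefix = f"{species}/{version}_"
    return any(entry.startswith(prefix) for entry in find_vep_caches(vep_dir))
```

Calling `has_cache` with your own bcbio genomes path ending in `Hsapiens/hg19/vep`, the species `"homo_sapiens_merged"`, and the version `100` should return `True` once the data installation has finished. If it returns `False`, VEP is likely to report the same "No cache found" error as in the traceback above.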

DolapoA commented 3 years ago

Thank you @naumenko-sa, this worked!