nf-core / rnavar

gatk4 RNA variant calling pipeline
https://nf-co.re/rnavar
MIT License

SNPEFF_SNPEFF fails because it cannot find the snpEffectPredictor.bin file #121

Closed by nschcolnicov 7 months ago

nschcolnicov commented 7 months ago

Description of the bug

Running the pipeline with snpEff selected as an annotation tool fails with this error:


Caused by:
  Process `NFCORE_RNAVAR:RNAVAR:VCF_ANNOTATE_ALL:VCF_ANNOTATE_SNPEFF:SNPEFF_SNPEFF (sample)` terminated with an error exit status (255)

Command executed:

  snpEff \
      -Xmx88473M \
      GRCh38.105 \
      -nodownload -canon -v \
      -csvStats sample.haplotypecaller.filtered_snpEff.csv \
      -dataDir ${PWD}/snpeff_cache \
      sample.haplotypecaller.filtered.vcf.gz \
      > sample.haplotypecaller.filtered_snpEff.ann.vcf

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNAVAR:RNAVAR:VCF_ANNOTATE_ALL:VCF_ANNOTATE_SNPEFF:SNPEFF_SNPEFF":
      snpeff: $(echo $(snpEff -version 2>&1) | cut -f 2 -d ' ')
  END_VERSIONS

Command exit status:
  255

Command output:
  (empty)

Command error:
  00:00:00 SnpEff version SnpEff 5.1d (build 2022-04-19 15:49), by Pablo Cingolani
  00:00:00 Command: 'ann'
  00:00:00 Reading configuration file 'snpEff.config'. Genome: 'GRCh38.105'
  00:00:00 Reading config file: snpEff.config
  00:00:00 Reading config file: /usr/local/share/snpeff-5.1-2/snpEff.config
  00:00:01 done
  00:00:01 Reading database for genome version 'GRCh38.105' from file 'snpeff_cache/GRCh38.105/snpEffectPredictor.bin' (this might take a while)
  java.lang.RuntimeException:   ERROR: Cannot read file 'snpeff_cache/GRCh38.105/snpEffectPredictor.bin'.
        You can try to download the database by running the following command:
                java -jar snpEff.jar download GRCh38.105

        at org.snpeff.snpEffect.SnpEffectPredictor.load(SnpEffectPredictor.java:48)
        at org.snpeff.snpEffect.Config.loadSnpEffectPredictor(Config.java:680)
        at org.snpeff.SnpEff.loadDb(SnpEff.java:499)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:890)
        at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:875)
        at org.snpeff.SnpEff.run(SnpEff.java:1141)
        at org.snpeff.SnpEff.main(SnpEff.java:160)
  00:00:01 Logging
  00:00:03 Checking for updates...
  00:00:05 Done.

The file does exist in the work directory, but not where snpEff looks for it. snpEff expects snpeff_cache/GRCh38.105/snpEffectPredictor.bin, whereas the file is actually at snpeff_cache/GRCh38.105/GRCh38.105/snpEffectPredictor.bin, one directory level deeper.
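To make the mismatch easy to confirm from inside a task work directory, here is a minimal sketch of a check. The function name check_snpeff_cache is hypothetical and not part of the pipeline or of snpEff. It only assumes what the log above shows: snpEff reads the database from <dataDir>/<genome>/snpEffectPredictor.bin.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/rnavar or snpEff): report whether
# snpEffectPredictor.bin sits where snpEff looks for it, given the directory
# passed as -dataDir and the genome name.
check_snpeff_cache() {
    local data_dir="$1" genome="$2"
    # Path taken from the error log: <dataDir>/<genome>/snpEffectPredictor.bin
    local expected="${data_dir}/${genome}/snpEffectPredictor.bin"

    if [ -f "$expected" ]; then
        echo "ok: $expected"
        return 0
    fi

    # Search a few levels down for a misplaced copy. -L follows symlinks,
    # since staged inputs in a work directory are often symlinked.
    local found
    found=$(find -L "$data_dir" -maxdepth 4 -name snpEffectPredictor.bin 2>/dev/null | head -n 1)
    if [ -n "$found" ]; then
        echo "misplaced: $found (snpEff expects $expected)"
        return 1
    fi

    echo "missing: no snpEffectPredictor.bin under $data_dir"
    return 2
}

# Example call from the task work directory:
#   check_snpeff_cache "${PWD}/snpeff_cache" GRCh38.105
```

With the layout reported here, the call should print a "misplaced" line pointing at the nested GRCh38.105/GRCh38.105 directory. Given that layout, pointing -dataDir at the inner snpeff_cache/GRCh38.105 directory would match the path snpEff expects. Whether the eventual fix takes that route is not stated in this thread.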

Command used and terminal output

nextflow run ../../main.nf -profile bi,cluster --input ../input.csv --genome GRCh38 --outdir . -c ../config.config --annotate_tools vep -resume --vep_cache 's3://annotation-cache/vep_cache/110_GRCh38/'

Relevant files

The config.config file defines the fasta, fasta_fai, dict, gtf, dbsnp and dbsnp_tbi files that I'm using. These work without any issues when snpEff is not used.

System information

dev branch

nschcolnicov commented 7 months ago

Created a PR for this: https://github.com/nf-core/rnavar/pull/124

nschcolnicov commented 7 months ago

The PR fixed this issue, so we can close it.