nf-core / rnasplice

rnasplice is a bioinformatics pipeline for RNA-seq alternative splicing analysis
https://nf-co.re/rnasplice
MIT License

Exception: Event ENSG00000004961 not found in pickled directory index. Are you sure this is the right directory for the event? #154

Closed: mdozmorov closed this issue 2 weeks ago

mdozmorov commented 3 weeks ago

Description of the bug

This error has been reported in various forms in other issues, but I could not find a workable solution there. It occurs when running the pipeline with --genome GRCh38 or with manually provided --fasta and --gtf (I use the files from 10x Genomics).

-[nf-core/rnasplice] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (1)'

Caused by:
  Process `NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI (1)` terminated with an error exit status (1)

Command executed:

  sashimi_plot --plot-event ENSG00000004961 index miso_settings.txt --output-dir sashimi

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASPLICE:RNASPLICE:VISUALISE_MISO:MISO_SASHIMI":
      python: $(python --version | sed "s/Python //g")
      misopy: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('misopy').version)")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  INFO:    Converting SIF file to temporary sandbox...
  /usr/local/lib/python2.7/site-packages/matplotlib/cbook/deprecation.py:107: MatplotlibDeprecationWarning: The mpl_toolkits.axes_grid module was deprecated in version 2.1. Use mpl_toolkits.axes_grid1 and mpl_toolkits.axisartist provies the same functionality instead.
    warnings.warn(message, mplDeprecation, stacklevel=1)
  Traceback (most recent call last):
    File "/usr/local/bin/sashimi_plot", line 11, in <module>
      sys.exit(main())
    File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 276, in main
      plot_label=plot_label)
    File "/usr/local/lib/python2.7/site-packages/misopy/sashimi_plot/sashimi_plot.py", line 142, in plot_event
      %(event_name, pickle_dir)
  Exception: Event ENSG00000004961 not found in pickled directory index. Are you sure this is the right directory for the event?
  INFO:    Cleaning up image...

Work dir:
  /lustre/home/mdozmorov/data/WorkData/Tony/Krista/RPE1_GFP_MYCN_DAC_72hrs/work/fe/1da2e608fc7aac0f8f63541f62c08b

A workaround is to set --sashimi_plot false. But is there a better solution?

Command used and terminal output

export NXF_OPTS='-Xms8g -Xmx24g'
export NXF_SINGULARITY_CACHEDIR=/lustre/home/mdozmorov/singularity_cache
DIRIN=/lustre/home/mdozmorov/data/WorkData/
INPUT=${DIRIN}/samplesheet.csv
DIROUT=${DIRIN}/OUT_full
GENOME=/lustre/home/mdozmorov/data/ExtData/10x/refdata-gex-GRCh38-2024-A/fasta/genome.fa
GTF=/lustre/home/mdozmorov/data/ExtData/10x/refdata-gex-GRCh38-2024-A/genes/genes.gtf

nextflow run nf-core/rnasplice \
   --input ${INPUT} \
   --contrasts contrastsheet.csv \
   --fasta ${GENOME} \
   --gtf ${GTF} \
   --outdir ${DIROUT} \
   -profile singularity \
   --sashimi_plot false \
   -resume

Relevant files

No response

System information

nextflow version 23.10.0.5891, run on HPC in local mode, Singularity, Rocky Linux 9.4, nf-core/rnasplice: 1.0.4

KTMD-plant commented 3 weeks ago

It also occurs when running with TAIR10. There has been some discussion of it on Slack, but I have not seen a solution yet.

tud03125 commented 2 weeks ago

I was one of the people discussing it. The conversation started on Slack and then moved to a GitHub discussion. It is a long thread, but here it is: https://github.com/nf-core/rnasplice/issues/151.

jma1991 commented 2 weeks ago

Hi @mdozmorov ,

It looks like the genome annotation you're using might be Ensembl-based, which includes gene version identifiers. If that's the case, you'll need to specify the gene identifiers along with the version number. You can refer to my discussion with @tud03125 for an example of a working solution that addresses this.

Hope this helps!

mdozmorov commented 2 weeks ago

Thanks, @jma1991, it worked. I hope this will help others:

  1. Look inside the GTF file and find an Ensembl ID of interest, like gene_id "ENSG00000186092"; gene_version "7";. Or, select any Ensembl ID.
  2. Add it, with the version number, as an argument to the pipeline, --miso_genes ENSG00000186092.7. This should prevent the error and the pipeline will complete.
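
For anyone who wants to look up the versioned ID without reading the GTF by eye, here is a minimal Python sketch of step 1. It assumes an Ensembl-style GTF where gene lines carry gene_id and gene_version attributes, as in the example above. The function name versioned_gene_id is made up for illustration and is not part of the pipeline or of MISO.

```python
import re

# Matches GTF attribute pairs of the form: key "value"
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def versioned_gene_id(gtf_line):
    """Return 'gene_id.gene_version' for a GTF gene line, else None.

    Hypothetical helper: assumes Ensembl-style attributes in column 9.
    Falls back to the bare gene_id when no gene_version is present.
    """
    if gtf_line.startswith("#"):
        return None  # header or comment line
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) < 9 or fields[2] != "gene":
        return None  # only look at gene-level records
    attrs = dict(_ATTR_RE.findall(fields[8]))
    gene_id = attrs.get("gene_id")
    if gene_id is None:
        return None
    version = attrs.get("gene_version")
    return f"{gene_id}.{version}" if version else gene_id


# Example gene line (coordinates are illustrative only)
example = (
    "chr1\tHAVANA\tgene\t65419\t71585\t.\t+\t.\t"
    'gene_id "ENSG00000186092"; gene_version "7"; gene_name "OR4F5";'
)
print(versioned_gene_id(example))
```

The printed value is what would go after --miso_genes in step 2. If the GTF has no gene_version attribute, the sketch returns the bare gene_id. Whether the bare ID works in that case is not covered in this thread.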

It appears that --miso_genes would be a good candidate for a required setting. The error message from the pipeline is not very informative.