nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

RSEQC_JUNCTIONANNOTATION error: too many values to unpack #1064

SamMod1 closed this issue 1 year ago

SamMod1 commented 1 year ago

Description of the bug

I am trying to run nf-core/rnaseq, but it keeps failing at the same step: NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION. The process exits with status 1, but the main log file shows no error message other than this warning: WARNING: Skipping mount /var/apptainer/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

I'm running this on our Slurm cluster with the singularity profile.

Command used and terminal output

Main command to launch rnaseq:

sbatch << EOF
#!/bin/bash -e
#SBATCH --job-name=SAM_RNA
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=23:59:59

module load nextflow/22.10.4

#nextflow pull nf-core/rnaseq
nextflow run nf-core/rnaseq --input $SAMPLESHEET --outdir $OUTDIR --fasta $INPUT_GENOME --gff $GENOME_ANNOTATION --clip_r1=1 --clip_r2=1 -profile singularity
#nextflow run nf-core/rnaseq -profile test,singularity --outdir $OUTDIR

EOF

_________________________________________________________________________________________
Error message in the log file (repeated at several points):

-[nf-core/rnaseq] Pipeline completed with errors-
Error executing process > 'NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION (142_SEX-FEMALE_STAGE-5)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION (142_SEX-FEMALE_STAGE-5)` terminated with an error exit status (1)

Command executed:

  junction_annotation.py \
      -i 142_SEX-FEMALE_STAGE-5.markdup.sorted.bam \
      -r fixed_new_annotation.bed \
      -o 142_SEX-FEMALE_STAGE-5 \
       \
      2> 142_SEX-FEMALE_STAGE-5.junction_annotation.log

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION":
      rseqc: $(junction_annotation.py --version | sed -e "s/junction_annotation.py //g")
  END_VERSIONS

Command exit status:
  1

Command output:
  total = 13227417

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
  WARNING: Skipping mount /var/apptainer/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

Work dir:
  /powerplant/workspace/cfnsjm/my_files/C_auratus/transcriptomics/rnaseq/work/85/84ceed6f2e2bb7af85ac1608b77ad9

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Relevant files

slurm-1729803.txt nextflow.log

System information

Nextflow: nextflow/22.10.4

rnaseq: nf-core/rnaseq v3.12.0-g3bec233

Hardware: Slurm Cluster

Profile: Singularity

OS: "CentOS Linux" VERSION="7 (Core)"

SamMod1 commented 1 year ago

Update: I have found an error message in the junction_annotation.log file in the work process directory. It says the following:

Reading reference bed file: fixed_new_annotation.bed ... Done
Load BAM file ... Done

===================================================================
Total splicing Events: 11244137
Known Splicing Events: 10461617
Partial Novel Splicing Events: 226475
Novel Splicing Events: 543000
Filtered Splicing Events: 13045
Traceback (most recent call last):
  File "/usr/local/bin/junction_annotation.py", line 171, in <module>
    main()
  File "/usr/local/bin/junction_annotation.py", line 149, in main
    obj.annotate_junction(outfile=options.output_prefix,refgene=options.ref_gene_model,min_intron=options.min_intron, q_cut = options.map_qual)
  File "/usr/local/lib/python3.7/site-packages/qcmodule/SAM.py", line 3832, in annotate_junction
    (chrom, i_st, i_end) = i.split(":")
ValueError: too many values to unpack (expected 3)
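
For anyone hitting the same traceback, here is a minimal sketch of why that unpacking line fails. It assumes the junction keys have the form chrom:start:end, which is what the `(chrom, i_st, i_end) = i.split(":")` line suggests. The function names and the example sequence name `scaffold:12` are hypothetical, and the `rsplit` variant is only an illustration of a more tolerant parse, not something RSeQC does.

```python
def parse_junction_key(key):
    """Mirror the unpacking shown in the traceback (assumed key form: chrom:start:end)."""
    chrom, i_st, i_end = key.split(":")
    return chrom, int(i_st), int(i_end)


def parse_junction_key_tolerant(key):
    """Hypothetical alternative: split from the right, at most twice,
    so any extra colons stay inside the chromosome name."""
    chrom, i_st, i_end = key.rsplit(":", 2)
    return chrom, int(i_st), int(i_end)


if __name__ == "__main__":
    # A sequence name without colons unpacks into exactly three fields.
    print(parse_junction_key("chr1:100:200"))

    # A sequence name that itself contains ':' yields four fields,
    # which raises the same kind of ValueError as in the log.
    try:
        parse_junction_key("scaffold:12:100:200")
    except ValueError as err:
        print("ValueError:", err)

    # Splitting from the right keeps the colon inside the name.
    print(parse_junction_key_tolerant("scaffold:12:100:200"))
```

Since the tool itself splits from the left, the practical workaround is to keep colons out of the sequence names in the annotation.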

SamMod1 commented 1 year ago

Issue fixed. junction_annotation.py fails if the sequence ID column (column 1) of the GFF contains colons (':'), because it splits each junction key on ':' and expects exactly three fields. I fixed the issue by replacing every ':' in the sequence ID column with an underscore.
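
The fix described above could be scripted roughly as follows. This is a sketch under assumptions, not the exact command used in this thread: the function names are hypothetical, the GFF is assumed to be tab-delimited with the sequence ID in column 1, and comment or directive lines starting with '#' are passed through unchanged (so a `##sequence-region` directive would not be renamed). The FASTA helper reflects an assumption that the genome passed via --fasta needs the same renaming so that sequence names still match the annotation.

```python
def sanitize_gff_line(line, replacement="_"):
    """Replace ':' in the sequence ID (column 1) of a GFF feature line.
    Comment, directive, and blank lines are returned unchanged."""
    if line.startswith("#") or not line.strip():
        return line
    fields = line.split("\t")
    fields[0] = fields[0].replace(":", replacement)
    return "\t".join(fields)


def sanitize_fasta_line(line, replacement="_"):
    """Replace ':' in the sequence name of a FASTA header line.
    Only the first space-delimited token after '>' is changed."""
    if not line.startswith(">"):
        return line
    name, sep, rest = line[1:].partition(" ")
    return ">" + name.replace(":", replacement) + sep + rest


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    gff = ["##gff-version 3\n",
           "scaf:1\tsrc\tgene\t1\t10\t.\t+\t.\tID=g:1\n"]
    fasta = [">scaf:1 some description\n", "ACGT\n"]
    print("".join(sanitize_gff_line(l) for l in gff), end="")
    print("".join(sanitize_fasta_line(l) for l in fasta), end="")
```

Only the sequence ID is touched; colons in the attributes column (for example `ID=g:1`) are left alone, since the traceback only implicates the chromosome name.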