nf-core / circrna

circRNA quantification, differential expression analysis and miRNA target prediction of RNA-Seq data
https://nf-co.re/circrna
MIT License

GTF file not found with CIRIquant #155

Open ZabalaAitor opened 4 months ago

ZabalaAitor commented 4 months ago

Description of the bug

Hello,

I recently encountered an error while running CIRIquant when providing a specific GTF file. Other tools, such as circRNA_finder, run without any issues and perform the annotation correctly.

Thank you very much for your time and assistance in resolving this issue.

Best regards,

Aitor Zabala

Command used and terminal output

nextflow pull nf-core/circrna

nextflow run nf-core/circRNA \
    -r dev \
    -profile apptainer \
    --input /data/azabala/validation_GTF/data/samplesheet_eGenomes.csv \
    --phenotype /data/azabala/validation_GTF/data/phenotype_eGenomes.csv \
    --module circrna_discovery \
    --outdir /scratch/azabala/validation_GTF/eGenomes \
    --tool ciriquant,circrna_finder \
    --max_cpus 24 \
    --max_memory 256GB \
    -w /scratch/azabala/validation_GTF/eGenomes/work_eGenomes \
    --genome GRCh38 \
    --gtf /data/azabala/gtf/eGenomes/genes.gtf \
    --save_reference false \
    -resume

Core Nextflow options
  revision       : dev
  runName        : evil_mclean
  containerEngine: apptainer
  launchDir      : /scratch/azabala/validation_GTF
  workDir        : /scratch/azabala/validation_GTF/eGenomes/work_eGenomes
  projectDir     : /home/azabala/.nextflow/assets/nf-core/circRNA
  userName       : azabala
  profile        : apptainer
  configFiles    : 

Input/output options
  input          : /data/azabala/validation_GTF/data/samplesheet_eGenomes.csv
  outdir         : /scratch/azabala/validation_GTF/eGenomes
  phenotype      : /data/azabala/validation_GTF/data/phenotype_eGenomes.csv

Pipeline Options
  tool           : ciriquant,circrna_finder

Reference genome options
  save_reference : false
  genome         : GRCh38
  fasta          : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa
  gtf            : /data/azabala/gtf/eGenomes/genes.gtf
  mature         : s3://ngi-igenomes/igenomes//Homo_sapiens/NCBI/GRCh38/Annotation/SmallRNA/mature.fa

Max job request options
  max_cpus       : 24
  max_memory     : 256GB

[41/575925] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_AGGT_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
[30/e51473] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
[8a/b210b2] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (introns_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
[b2/dfdd4b] NOTE: Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (introns_AGGT_eGenomes)` terminated with an error exit status (1) -- Execution is retried (2)
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_AGGT_eGenomes)'

Caused by:
  Process `NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT (exons_AGGT_eGenomes)` terminated with an error exit status (1)

Command executed:

  CIRIquant \
      -t 24 \
      -1 exons_AGGT_eGenomes_1_val_1.fq.gz \
      -2 exons_AGGT_eGenomes_2_val_2.fq.gz \
      --config travis.yml \
      --no-gene \
      -o exons_AGGT_eGenomes \
      -p exons_AGGT_eGenomes

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CIRCRNA:CIRCRNA:CIRCRNA_DISCOVERY:CIRIQUANT":
      bwa: $(echo $(bwa 2>&1) | sed 's/^.*Version: //; s/Contact:.*$//')
      ciriquant : $(echo $(CIRIquant --version 2>&1) | sed 's/CIRIquant //g' )
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
      stringtie: $(stringtie --version 2>&1)
      hisat2: 2.1.0
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/usr/local/bin/CIRIquant", line 10, in <module>
      sys.exit(main())
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
      config = check_config(check_file(args.config_file))
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 91, in check_config
      globals()[i.upper()] = check_file(config['reference'][i])
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
      raise ConfigError('File: {}, not found'.format(file_name))
  CIRIquant.utils.ConfigError: File: /data/azabala/gtf/eGenomes/genes.gtf, not found

Work dir:
  /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Relevant files

travis.yaml

  name: ciriquant
  tools:
    bwa: /usr/local/bin/bwa
    hisat2: /usr/local/bin/hisat2
    stringtie: /usr/local/bin/stringtie
    samtools: /usr/local/bin/samtools
  reference:
    fasta: /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/stage-5e8255a4-53a0-4b1c-8881-9481ba39060a/1c/522d8e9bfc6e1560dad67f3afd1164/genome.fa
    gtf: /data/azabala/gtf/eGenomes/genes.gtf
    bwa_index: /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/c3/0410f7c6a5f85b8d7b25e17307668d/bwa/genome
    hisat_index: /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/11/08674c9acbe0f7ebc38718c559e64a/hisat2/genome

command.log

  Traceback (most recent call last):
    File "/usr/local/bin/CIRIquant", line 10, in <module>
      sys.exit(main())
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
      config = check_config(check_file(args.config_file))
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 91, in check_config
      globals()[i.upper()] = check_file(config['reference'][i])
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
      raise ConfigError('File: {}, not found'.format(file_name))
  CIRIquant.utils.ConfigError: File: /data/azabala/gtf/eGenomes/genes.gtf, not found

System information

  Nextflow: 23.04.2
  Hardware: HPC
  Executor: slurm
  Container: Apptainer
  OS: Linux
  nf-core/circrna: dev

nictru commented 4 months ago

Hey, looks weird to me, I will need some help with debugging this.

Could you try switching to /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a and running bash .command.run?

It would be interesting to see whether the error also occurs when the script is executed this way.

ZabalaAitor commented 4 months ago

bash .command.run

  Traceback (most recent call last):
    File "/usr/local/bin/CIRIquant", line 10, in <module>
      sys.exit(main())
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/main.py", line 89, in main
      config = check_config(check_file(args.config_file))
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 91, in check_config
      globals()[i.upper()] = check_file(config['reference'][i])
    File "/usr/local/lib/python2.7/site-packages/CIRIquant/utils.py", line 49, in check_file
      raise ConfigError('File: {}, not found'.format(file_name))
  CIRIquant.utils.ConfigError: File: /data/azabala/gtf/eGenomes/genes.gtf, not found

nictru commented 4 months ago

This is really strange. If the file did not exist, the pipeline should already have failed at an earlier stage. CIRIquant's internal check for whether the file exists is implemented in a pretty standard way, so I would say a problem within the tool is unlikely. The only remaining explanation is that the GTF file is not properly mounted into the container at runtime.
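One way to make the mount hypothesis concrete is to compare the reference paths from the travis.yaml in the report against the work directory. The snippet below is only a rough sketch: `outside_root` is a hypothetical helper, not part of CIRIquant or the pipeline, and it assumes the work directory is the location the container can see, which this thread does not confirm.

```python
import os


def outside_root(root, paths):
    """Return the keys whose path does not live under root (hypothetical helper)."""
    root = os.path.normpath(root)
    return sorted(
        key
        for key, path in paths.items()
        if os.path.commonpath([root, os.path.normpath(path)]) != root
    )


# Paths copied from the travis.yaml in the report.
work_dir = "/scratch/azabala/validation_GTF/eGenomes/work_eGenomes"
reference = {
    "fasta": work_dir + "/stage-5e8255a4-53a0-4b1c-8881-9481ba39060a/1c/522d8e9bfc6e1560dad67f3afd1164/genome.fa",
    "gtf": "/data/azabala/gtf/eGenomes/genes.gtf",
    "bwa_index": work_dir + "/c3/0410f7c6a5f85b8d7b25e17307668d/bwa/genome",
    "hisat_index": work_dir + "/11/08674c9acbe0f7ebc38718c559e64a/hisat2/genome",
}

print(outside_root(work_dir, reference))  # prints ['gtf']
```

Of the four reference entries, only the GTF points outside the work directory, which fits the observation that it is the one file CIRIquant cannot find.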

Could you try the following:

  1. Copy the GTF file into the working directory /scratch/azabala/validation_GTF/eGenomes/work_eGenomes/a4/1ad695fb7ab3bc5077c78c8686692a
  2. Change the .run.sh, replacing the /data/azabala/gtf/eGenomes/genes.gtf with genes.gtf
  3. Run bash .command.run again
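Steps 1 and 2 above could be scripted roughly as follows. This is a sketch under assumptions: the thread mentions `.run.sh`, but the snippet assumes the absolute GTF path is stored in the `travis.yml` config inside the work directory, as the travis.yaml listing in the report suggests. `localize_reference` is a hypothetical helper name.

```python
import os
import shutil


def localize_reference(work_dir, config_name, abs_path):
    """Copy abs_path into work_dir and point the config at the local copy."""
    local_name = os.path.basename(abs_path)
    shutil.copy(abs_path, os.path.join(work_dir, local_name))

    config_path = os.path.join(work_dir, config_name)
    with open(config_path) as handle:
        text = handle.read()
    if abs_path not in text:
        raise ValueError("{} is not referenced in {}".format(abs_path, config_name))

    # Staged inputs are often symlinks; replace the link with a regular file
    # so the original staged config is left untouched.
    if os.path.islink(config_path):
        os.unlink(config_path)
    with open(config_path, "w") as handle:
        handle.write(text.replace(abs_path, local_name))
    return local_name
```

A call such as `localize_reference(work_dir, "travis.yml", "/data/azabala/gtf/eGenomes/genes.gtf")` would then be followed by step 3, running `bash .command.run` from the work directory.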

Please let me know what happens.

It would also be nice if you could attach the .command.run file here.

ZabalaAitor commented 4 months ago

Now it seems that CIRIquant is able to read the GTF file. I am attaching the following files for further inspection: #155.tar.gz

Please note that .command.run and .command.err may appear as hidden files.