nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

invalid literal for int() with base 10: '0.79;logic_name=cpg' #1215

Closed ZabalaAitor closed 5 months ago

ZabalaAitor commented 9 months ago

Description of the bug

I am running nf-core/rnaseq with the following command:

nextflow run nf-core/rnaseq \
  -r 3.14.0 \
  -profile docker \
  --input /media/unidad/Expansion/Raw_Data/RNAseq_MS/samplesheet.csv \
  --outdir /media/unidad/Expansion/analysis_MS_nfcore/results/rnaseq \
  --fasta /media/unidad/Expansion/analysis_MS_nfcore/database/genomes/GRCh38/genome.fasta \
  --gtf /media/unidad/Expansion/analysis_MS_nfcore/database/genomes/GRCh38/genes.gtf \
  --transcript_fasta /media/unidad/Expansion/analysis_MS_nfcore/database/genomes/GRCh38/transcript.fasta \
  --star_index /media/unidad/Expansion/analysis_MS_nfcore/database/indexes/GRCh38/STAR \
  --salmon_index /media/unidad/Expansion/analysis_MS_nfcore/database/indexes/GRCh38/SALMON \
  --aligner star_salmon \
  -w /mnt/MyBook/work \
  --max_cpus 6 \
  --max_memory 50.GB \
  --save_unaligned \
  -resume

and I obtained the following error:

-[nf-core/rnaseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION (HCMS01)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION (HCMS01)` terminated with an error exit status (1)

Command executed:

  junction_saturation.py \
      -i HCMS01.markdup.sorted.bam \
      -r genes.bed \
      -o HCMS01 \

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION":
      rseqc: $(junction_saturation.py --version | sed -e "s/junction_saturation.py //g")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Unable to find image 'quay.io/biocontainers/rseqc:5.0.3--py39hf95cd2a_0' locally
  5.0.3--py39hf95cd2a_0: Pulling from biocontainers/rseqc
  642efca944a0: Already exists
  bd9ddc54bea9: Already exists
  4778d18b334d: Pulling fs layer
  4778d18b334d: Download complete
  4778d18b334d: Pull complete
  Digest: sha256:9fc7027efc23a9dd2309ead1d285034da324eae64e6e940cb0760f4525fa28aa
  Status: Downloaded newer image for quay.io/biocontainers/rseqc:5.0.3--py39hf95cd2a_0
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  [E::idx_find_and_load] Could not retrieve index file for 'HCMS01.markdup.sorted.bam'
  reading reference bed file:  genes.bed  ...  Invalid bed line (skipped): 1    0   248956422   chromosome:1    .   .   GRCh38  chromosome  .   ID=chromosome:1;Alias=CM000663.2,chr1,NC_000001.11
   Traceback (most recent call last):
    File "/usr/local/bin/junction_saturation.py", line 97, in <module>
      main()
    File "/usr/local/bin/junction_saturation.py", line 83, in main
      obj.saturation_junction(outfile=options.output_prefix, refgene=options.refgene_bed, sample_start=options.percentile_low_bound,sample_end=options.percentile_up_bound,sample_step=options.percentile_step,min_intron=options.minimum_intron_size,recur=options.minimum_splice_read, q_cut = options.map_qual)
    File "/usr/local/lib/python3.9/site-packages/qcmodule/SAM.py", line 3928, in saturation_junction
      exon_starts = list(map( int, fields[11].rstrip( ',\n' ).split( ',' ) ))
  ValueError: invalid literal for int() with base 10: '0.79;logic_name=cpg'

Work dir:
  /mnt/MyBook/work/66/dcc3242a95a1e8cabbb89d3b6c34cf

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

The GTF file was obtained from Ensembl:

wget -L ftp://ftp.ensembl.org/pub/release-${latest_release}/gtf/homo_sapiens/Homo_sapiens.GRCh38.${latest_release}.gtf.gz

and the genes.bed file was created from the GTF file by the pipeline's GTF2BED process:

process GTF2BED {
    tag "$gtf"
    label 'process_low'

    conda "conda-forge::perl=5.26.2"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/perl:5.26.2' :
        'biocontainers/perl:5.26.2' }"

    input:
    path gtf

    output:
    path '*.bed'       , emit: bed
    path "versions.yml", emit: versions

    when:
    task.ext.when == null || task.ext.when

    script: // This script is bundled with the pipeline, in nf-core/rnaseq/bin/
    """
    gtf2bed \\
        $gtf \\
        > ${gtf.baseName}.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        perl: \$(echo \$(perl --version 2>&1) | sed 's/.*v\\(.*\\)) built.*/\\1/')
    END_VERSIONS
    """

}
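
The traceback above shows RSeQC parsing column 12 of genes.bed as a comma-separated list of integers, so any line whose twelfth column holds attribute text will raise the same ValueError. As a rough way to find such lines before rerunning the pipeline, here is a minimal sketch of a BED12 check. The function name `find_bad_bed12_lines` is hypothetical, and this is not RSeQC's own validation; it only assumes the standard BED12 layout (integer start and end in columns 2 and 3, comma-separated integer block sizes and block starts in columns 11 and 12).

```python
from typing import Iterable, List, Tuple


def find_bad_bed12_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for lines that do not parse as BED12.

    Hypothetical helper: a stricter check than RSeQC performs itself.
    """
    problems: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        # Skip blank lines and the usual BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 12:
            problems.append(
                (number, f"expected at least 12 columns, found {len(fields)}")
            )
            continue
        try:
            int(fields[1])  # chromStart
            int(fields[2])  # chromEnd
            # Block sizes and block starts: comma-separated integers,
            # usually with a trailing comma.
            for column in (fields[10], fields[11]):
                [int(value) for value in column.rstrip(",").split(",")]
        except ValueError as err:
            problems.append((number, str(err)))
    return problems
```

It could be run on the file in the process work directory, for example with `with open("genes.bed") as handle: print(find_bad_bed12_lines(handle)[:10])`, to list the first few offending lines.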

pinin4fjords commented 5 months ago

@ZabalaAitor I suspect an incompatibility between the FASTA and GTF. Could you confirm where the FASTA you're using came from, please?

ZabalaAitor commented 5 months ago

Yes, it was because of an incompatibility between the FASTA and GTF files.

Thanks!
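
For anyone landing here with the same error: the thread does not say what form the incompatibility took, but one quick check is whether the sequence names in the FASTA headers match the names in the first column of the GTF. Below is a minimal sketch under that assumption. The function names are hypothetical, and matching names do not prove both files come from the same assembly or release.

```python
import gzip
from typing import Dict, List, Set


def _open_text(path: str):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def fasta_seqnames(path: str) -> Set[str]:
    """Collect the first whitespace-delimited token of each FASTA header."""
    names: Set[str] = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    names.add(tokens[0])
    return names


def gtf_seqnames(path: str) -> Set[str]:
    """Collect the values in the first (seqname) column of a GTF file."""
    names: Set[str] = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def compare_seqnames(fasta_path: str, gtf_path: str) -> Dict[str, List[str]]:
    """Report which sequence names are shared and which occur in one file only."""
    fasta = fasta_seqnames(fasta_path)
    gtf = gtf_seqnames(gtf_path)
    return {
        "shared": sorted(fasta & gtf),
        "only_in_fasta": sorted(fasta - gtf),
        "only_in_gtf": sorted(gtf - fasta),
    }
```

Calling `compare_seqnames("genome.fasta", "genes.gtf")` and inspecting `only_in_gtf` would show annotation sequences that have no counterpart in the genome, for example a `chr1` versus `1` naming mismatch.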