nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

gtf not found issue #367

Closed gurpreet-bioinfo closed 4 years ago

gurpreet-bioinfo commented 4 years ago

Hi,

I am getting the following error from qualimap while running the pipeline. I also tried running without NXF_OPTS='-Xms1g -Xmx4g', but the same error comes up:

nf-core/rnaseq v1.4.2
Run Name: project_rnaseq
nf-core/rnaseq execution completed unsuccessfully!

The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'qualimap (project_rnaseq041A4_R1_mergedAligned.sortedByCoord.out)'

Caused by:
  Process `qualimap (project_rnaseq041A4_R1_mergedAligned.sortedByCoord.out)` terminated with an error exit status (1)

Command executed:

  unset DISPLAY
  qualimap --java-mem-size=16G rnaseq non-strand-specific -pe -s -bam project_rnaseq041A4_R1_mergedAligned.sortedByCoord.out.bam -gtf genes.gtf -outdir project_rnaseq041A4_R1_mergedAligned.sortedByCoord.out

Command exit status:
  1

Command output:
  Java memory size is set to 16G
  Launching application...

  QualiMap v.2.2.2-dev
  Built on 2018-12-03 16:04

  Selected tool: rnaseq

  ERROR: input region gtf file not found

  usage: qualimap rnaseq [-a <arg>] -bam <arg> -gtf <arg> [-oc <arg>] [-outdir
         <arg>] [-outfile <arg>] [-outformat <arg>] [-p <arg>] [-pe] [-s]
   -a,--algorithm <arg>             Counting algorithm:
                                    uniquely-mapped-reads(default) or
                                    proportional.
   -bam <arg>                       Input mapping file in BAM format.
   -gtf <arg>                       Annotations file in Ensembl GTF format.
   -oc <arg>                        Output file for computed counts. If only name
                                    of the file is provided, then the file will be
                                    saved in the output folder.
   -outdir <arg>                    Output folder for HTML report and raw data.
   -outfile <arg>                   Output file for PDF report (default value is
                                    report.pdf).
   -outformat <arg>                 Format of the output report (PDF, HTML or both
                                    PDF:HTML, default is HTML).
   -p,--sequencing-protocol <arg>   Sequencing library protocol:
                                    strand-specific-forward,
                                    strand-specific-reverse or non-strand-specific
                                    (default)
   -pe,--paired                     Setting this flag for paired-end experiments
                                    will result in counting fragments instead of
                                    reads
   -s,--sorted                      This flag indicates that the input file is
                                    already sorted by name. If not set, additional
                                    sorting by name will be performed. Only
                                    required for paired-end analysis.

Command wrapper:
  Java memory size is set to 16G
  Launching application...

  QualiMap v.2.2.2-dev
  Built on 2018-12-03 16:04

  Selected tool: rnaseq

  ERROR: input region gtf file not found

  usage: qualimap rnaseq [-a <arg>] -bam <arg> -gtf <arg> [-oc <arg>] [-outdir
         <arg>] [-outfile <arg>] [-outformat <arg>] [-p <arg>] [-pe] [-s]
   -a,--algorithm <arg>             Counting algorithm:
                                    uniquely-mapped-reads(default) or
                                    proportional.
   -bam <arg>                       Input mapping file in BAM format.
   -gtf <arg>                       Annotations file in Ensembl GTF format.
   -oc <arg>                        Output file for computed counts. If only name
                                    of the file is provided, then the file will be
                                    saved in the output folder.
   -outdir <arg>                    Output folder for HTML report and raw data.
   -outfile <arg>                   Output file for PDF report (default value is
                                    report.pdf).
   -outformat <arg>                 Format of the output report (PDF, HTML or both
                                    PDF:HTML, default is HTML).
   -p,--sequencing-protocol <arg>   Sequencing library protocol:
                                    strand-specific-forward,
                                    strand-specific-reverse or non-strand-specific
                                    (default)
   -pe,--paired                     Setting this flag for paired-end experiments
                                    will result in counting fragments instead of
                                    reads
   -s,--sorted                      This flag indicates that the input file is
                                    already sorted by name. If not set, additional
                                    sorting by name will be performed. Only
                                    required for paired-end analysis.

Work dir:
  /gk-project_rnaseq_rnaseq_hla_l-0/work/19/dc88664c75b6d1cc2e240f1b91f68d

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

The workflow was completed at 2020-01-18T16:18:17.033+01:00 (duration: 4h 25m 32s)

The command used to launch the workflow was as follows:

nextflow run nf-core/rnaseq -profile cfc --reads '/gk-project_rnaseq_hla_l-0/fastq/fastq_merged/*_R{1,2}_merged.fastq.gz' --genome GRCh38 --fc_group_features_type gene_id --saveReference -resume --email gurpreet.kaur@qbic.uni-tuebingen.de -name project_rnaseq

Please help. Thanks. Gurpreet

gurpreet-bioinfo commented 4 years ago

Given the above error, I used the --skipQualimap option and then got an error from RSeQC.

I then ran with --skipQualimap --skipRseQC, and now I am getting the following error:

nf-core/rnaseq execution completed unsuccessfully!

The exit status of the task that caused the workflow execution to fail was: 255.

The full error message was:

Error executing process > 'featureCounts (1_R1_mergedAlignedByCoord.out)'

Caused by:
  Process `featureCounts (1_R1_mergedAlignedByCoord.out)` terminated with an error exit status (255)

  || Load annotation file genes.gtf ...                                         ||
  Failed to open the annotation file genes.gtf, or its format is incorrect, or it contains no 'exon' features.

The workflow was completed at 2020-01-23T19:33:58.186+01:00 (duration: 5h 37s)

The command used to launch the workflow was as follows:

nextflow run nf-core/rnaseq -r 1.4.2 -profile cfc --reads '/gk-project_rnaseq_hla_l-0/fastq/fastq_merged/*_R{1,2}_merged.fastq.gz' --genome GRCh38 --fc_group_features_type gene_id --saveReference -resume --email gurpreet.kaur@qbic.uni-tuebingen.de -name project_rnaseq8 --skipQualimap --skipRseQC

Looking at the GTF files under the work dir, only the genes.gtf under STARindex is 268,363 KB; the copies in the other folders are 1 KB each.
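For anyone who wants to check the same thing on their own run, here is a minimal Python sketch that walks a work directory and reports, for every staged genes.gtf, whether it is a regular file, a working symlink, or a symlink whose target cannot be reached. Nextflow normally stages inputs into task directories as symlinks, so tiny genes.gtf entries may simply be links. That this is what happened here is an assumption, not something confirmed in this thread. The function name `inspect_staged_files` and the tuple layout are made up for illustration.

```python
import os


def inspect_staged_files(work_dir, name="genes.gtf"):
    """Return sorted (path, kind, size) tuples for each file called `name`.

    kind is "file", "symlink" or "broken symlink". size is the size in
    bytes of the file the path resolves to, or None if the link is dangling.
    """
    rows = []
    # os.walk lists symlinks to files (including dangling ones) in `files`
    for root, _dirs, files in os.walk(work_dir):
        if name not in files:
            continue
        path = os.path.join(root, name)
        # os.path.exists follows links, so it is False for a dangling link
        resolves = os.path.exists(path)
        if os.path.islink(path):
            kind = "symlink" if resolves else "broken symlink"
        else:
            kind = "file"
        size = os.path.getsize(path) if resolves else None
        rows.append((path, kind, size))
    return sorted(rows)
```

Calling `inspect_staged_files("work")` from the launch directory and printing the rows should show at a glance which task directories hold a link that does not resolve. Note that a link can resolve on the host and still be unreachable inside a container if the target path is not mounted there, which this sketch cannot detect.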

I have tried running the pipeline many times, but the errors keep coming up.

Any help would be appreciated. Thanks, Gurpreet

bc2zb commented 4 years ago

I'm also running into this issue. Something seems to be up with the GTF for GRCh38.
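One way to narrow this down, given that the featureCounts message above lists "contains no 'exon' features" as a possible cause, is to count the feature types in the GTF that actually reached the task. The following is only a rough sketch: `count_gtf_features` is a hypothetical helper, and it assumes the standard 9-column tab-separated GTF layout with the feature type in column 3.

```python
from collections import Counter


def count_gtf_features(lines):
    """Count feature types (column 3) across the records of a GTF.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Comment and blank lines are skipped. Records with fewer than nine
    tab-separated columns are tallied under "<malformed>".
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            counts["<malformed>"] += 1
            continue
        counts[fields[2]] += 1
    return counts
```

For example, `with open("genes.gtf") as handle: print(count_gtf_features(handle))` run inside the failing task directory. An empty result or a missing "exon" key would fit the featureCounts message, while a failure to open the file at all points back to staging rather than to the annotation content.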

bc2zb commented 4 years ago

Whoops, seems to be fixable using guidance from here