nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

Salmon quant not run after FastQ subsampling if index not provided #919

Closed freedog8 closed 1 year ago

freedog8 commented 1 year ago

Description of the bug

Only a few parts of the pipeline were run.

Command used and terminal output

$ nextflow run nf-core/rnaseq --input samplesheet.csv --outdir results --igenomes_ignore -resume --gencode --genome hg38

N E X T F L O W  ~  version 22.10.4
WARN: It appears you have never run this project before -- Option `-resume` is ignored
Launching `https://github.com/nf-core/rnaseq` [cheeky_hilbert] DSL2 - revision: adce7ce9ab [master]

------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/rnaseq v3.10-gadce7ce
------------------------------------------------------
Core Nextflow options
  revision       : master
  runName        : cheeky_hilbert
  launchDir      : /data/gpfs02/wliu/YaohuiHe/RNA-seq
  workDir        : /data/gpfs02/wliu/YaohuiHe/RNA-seq/work
  projectDir     : /data/gpfs01/wliu/.nextflow/assets/nf-core/rnaseq
  userName       : wliu
  profile        : standard
  configFiles    : /data/gpfs01/wliu/.nextflow/config, /data/gpfs01/wliu/.nextflow/assets/nf-core/rnaseq/nextflow.config, /data/gpfs02/wliu/YaohuiHe/RNA-seq/nextflow.config

Input/output options
  input          : samplesheet.csv
  outdir         : results

Reference genome options
  genome         : hg38
  fasta          : /data/gpfs01/wliu/genome/alias/hg38/fasta/default/hg38.fa
  gtf            : /data/gpfs01/wliu/genome/alias/hg38/gtf/gencode_v36/hg38.gtf
  gene_bed       : /data/gpfs01/wliu/genome/alias/hg38/bed12/gencode_v36/hg38.bed
  star_index     : /data/gpfs01/wliu/genome/alias/hg38/star/2.7.10b/.
  gencode        : true
  igenomes_ignore: true

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/rnaseq for your analysis please cite:

* The pipeline
  https://doi.org/10.5281/zenodo.1400710

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/rnaseq/blob/master/CITATIONS.md
------------------------------------------------------
executor >  lsf (36)
[ab/794090] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF_GENE_FILTER (hg38.fa)                                [100%] 1 of 1 ✔
[50/a83456] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:MAKE_TRANSCRIPTS_FASTA (rsem/hg38.fa)                    [100%] 1 of 1 ✔
[49/adde5d] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CUSTOM_GETCHROMSIZES (hg38.fa)                           [100%] 1 of 1 ✔
[28/0b58ac] process > NFCORE_RNASEQ:RNASEQ:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet.csv)                         [100%] 1 of 1 ✔
[-        ] process > NFCORE_RNASEQ:RNASEQ:CAT_FASTQ                                                               -
[4c/38d5bb] process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:FQ_SUBSAMPLE (S22082245)                      [100%] 30 of 30 ✔
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT                                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC                                 -
[-        ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE                             -
[-        ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC_TSV_FAIL_TRIMMED                                                -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN                                                   -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT                        -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_INDEX                       -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS    -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT -
[-        ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_QUANT                                       -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_TX2GENE                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_TXIMPORT                                    -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_GENE                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_GENE_LENGTH_SCALED                       -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_GENE_SCALED                              -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_SE_TRANSCRIPT                               -
[-        ] process > NFCORE_RNASEQ:RNASEQ:DESEQ2_QC_STAR_SALMON                                                   -
[-        ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC_TSV_FAIL_MAPPED                                                 -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:SAMTOOLS_INDEX                                -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS             -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT          -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS          -
[-        ] process > NFCORE_RNASEQ:RNASEQ:STRINGTIE_STRINGTIE                                                     -
[-        ] process > NFCORE_RNASEQ:RNASEQ:SUBREAD_FEATURECOUNTS                                                   -
[-        ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE                                                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDTOOLS_GENOMECOV                                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDCLIP                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDGRAPHTOBIGWIG         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDCLIP                  -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDGRAPHTOBIGWIG         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:QUALIMAP_RNASEQ                                                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:DUPRADAR                                                                -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_BAMSTAT                                                 -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INNERDISTANCE                                           -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INFEREXPERIMENT                                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION                                      -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDISTRIBUTION                                        -
[-        ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDUPLICATION                                         -
[-        ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC_TSV_STRAND_CHECK                                                -
[c2/b1e3da] process > NFCORE_RNASEQ:RNASEQ:CUSTOM_DUMPSOFTWAREVERSIONS (1)                                         [100%] 1 of 1 ✔
[96/13ca20] process > NFCORE_RNASEQ:RNASEQ:MULTIQC (1)                                                             [100%] 1 of 1 ✔
-[nf-core/rnaseq] Pipeline completed successfully-
Completed at: 03-Jan-2023 12:37:17
Duration    : 6m 22s
CPU hours   : 2.7
Succeeded   : 36

Relevant files

No response

System information

No response

drpatelh commented 1 year ago

Hi @freedog8! Thanks for reporting. Can you please provide the (redacted) .nextflow.log file for the run, any custom config files, and the samplesheet you used as input?

freedog8 commented 1 year ago

@drpatelh I cannot provide the log file as it has been replaced. But when I add the parameter "--pseudo-aligner 'salmon'" the pipeline goes further.

drpatelh commented 1 year ago

OK, no worries. Are you able to provide the samplesheet you used, as well as any custom config files? You can change the paths; I mainly need to see the format.

drpatelh commented 1 year ago

Actually, if you used -resume when you re-ran the pipeline, the original log file may still be there. The top-level log file will have been replaced, but there may be one with a numeric suffix for the old run, depending on how many times you have resumed, e.g.

$ cd /to/directory/where/you/ran/the/pipeline/
$ ls -la .nextflow.log*
-rw-r--r--   1 harshil harshil 348932 Dec 21 12:03 .nextflow.log    <--- top-level log file
-rw-r--r--   1 harshil harshil 118193 Dec 21 12:02 .nextflow.log.1  <--- old log files from previous runs
-rw-r--r--   1 harshil harshil  76967 Dec 21 12:00 .nextflow.log.2  <--- old log files from previous runs
-rw-r--r--   1 harshil harshil 298644 Dec 20 16:54 .nextflow.log.3  <--- old log files from previous runs

freedog8 commented 1 year ago

nextflow.log nextflow.log.1.log nextflow.log.2.log nextflow.log.3.log nextflow.log.4.log nextflow.log.5.log nextflow.log.6.log nextflow.config.txt samplesheet.csv

drpatelh commented 1 year ago

I was able to reproduce with the minimal parameters below using the latest code on dev:

Command

nextflow run . -profile docker -params-file ./params.yml

params.yml

input: '/home/harshil/profile_test/samplesheet_test.csv'
outdir: './results/'
fasta: '/home/harshil/profile_test/genome.fasta.gz'
gtf: '/home/harshil/profile_test/genes.gtf.gz'

This will be fixed in https://github.com/nf-core/rnaseq/pull/921

The subsampling subworkflow expects a Salmon index in order to run Salmon quant; however, the Salmon index is only created when --pseudo_aligner salmon is provided. Without this parameter, the index channel passed to the Salmon quant subworkflow is empty, so Salmon quant never runs and all of the downstream steps are skipped.
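
For readers less familiar with dataflow pipelines, here is a rough toy model in Python of why an empty index channel silently skips everything downstream. It is not Nextflow code and not the pipeline's implementation. The function names, sample IDs and the "salmon_index" string are hypothetical, and an empty channel is represented simply as None.

```python
from typing import List, Optional


def salmon_quant(samples: List[str], index: Optional[str]) -> List[str]:
    """Toy stand-in for a process that needs a read set AND a shared index per task."""
    if index is None:
        # Empty index channel: no complete set of inputs exists,
        # so no task is launched and no error is raised.
        return []
    return [f"{sample}.quant(index={index})" for sample in samples]


def downstream(quant_results: List[str]) -> List[str]:
    """Toy downstream step: one task per upstream result, so zero in means zero out."""
    return [f"qc({result})" for result in quant_results]


subsampled = ["S1", "S2", "S3"]  # hypothetical sample IDs after FastQ subsampling

# Index never built (no --pseudo_aligner salmon): nothing runs, nothing fails.
without_index = downstream(salmon_quant(subsampled, index=None))

# Index available: one quant task and one downstream task per sample.
with_index = downstream(salmon_quant(subsampled, index="salmon_index"))

print(len(without_index))  # 0
print(len(with_index))     # 3
```

This mirrors the log above: the run still reports "Pipeline completed successfully" because a step that never receives a complete set of inputs is skipped rather than failed.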