nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

--extra_star_align_args #1378

Closed by mtinti 2 months ago

mtinti commented 2 months ago

Description of the bug

Hi, I added the parameter --extra_star_align_args '--runThreadN 35' to my pipeline run, but it duplicates the parameter in the STAR command rather than replacing it. Here is the error I get from the pipeline:

ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN (PG1)'

Caused by: Process NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN (PG1) terminated with an error exit status (102)

Command executed:

STAR \
    --genomeDir star \
    --readFilesIn input1/PG1_trimmed_1_val_1.fq.gz input2/PG1_trimmed_2_val_2.fq.gz \
    --runThreadN 12 \
    --outFileNamePrefix PG1. \
    \
    --sjdbGTFfile hBosTaurus.filtered.gtf \
    --outSAMattrRGline 'ID:PG1' 'SM:PG1' \
    --quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand zcat --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend --outSAMstrandField intronMotif --outReadsUnmapped Fastx --runThreadN 35

if [ -f PG1.Unmapped.out.mate1 ]; then
    mv PG1.Unmapped.out.mate1 PG1.unmapped_1.fastq
    gzip PG1.unmapped_1.fastq
fi
if [ -f PG1.Unmapped.out.mate2 ]; then
    mv PG1.Unmapped.out.mate2 PG1.unmapped_2.fastq
    gzip PG1.unmapped_2.fastq
fi

cat <<-END_VERSIONS > versions.yml
"NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN":
    star: $(STAR --version | sed -e "s/STAR_//g")
    samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
    gawk: $(echo $(gawk --version 2>&1) | sed 's/^.*GNU Awk //; s/, .*$//')
END_VERSIONS

Command exit status: 102

Command output: (empty)

Command error:

EXITING: FATAL INPUT ERROR: duplicate parameter "runThreadN" in input "Command-Line"
SOLUTION: keep only one definition of input parameters in each input source

Sep 12 19:54:27 ...... FATAL ERROR, exiting
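
The failure can be seen without running STAR at all: the option name simply occurs twice in the assembled argument list. A minimal shell sketch of that check follows. find_duplicate_flags is a hypothetical helper written for this illustration; it is not part of STAR or of the pipeline.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print every "--option" name that occurs more than once
# in the given argument list. Only tokens starting with "--" count as options.
find_duplicate_flags() {
    printf '%s\n' "$@" | grep -E '^--' | sort | uniq -d
}

# Abbreviated version of the failing command line from the report above.
find_duplicate_flags --genomeDir star --runThreadN 12 --outFileNamePrefix PG1. \
    --quantMode TranscriptomeSAM --twopassMode Basic --runThreadN 35
# prints: --runThreadN
```

An empty result means no option is defined twice, which is the condition STAR enforces for each input source.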

I did not test it, but could the same thing happen with --extra_salmon_quant_args?

Thanks for your attention, Michele

Command used and terminal output

nextflow run nf-core/rnaseq -r '3.15.0' \
    --save_unaligned --max_cpus 40 \
    --max_memory 80.GB \
    --input samplesheet.csv \
    --outdir /my_out_dir \
    --gtf mygtf.gtf \
    --fasta myfasta.fa \
    -profile singularity \
    -w /tmp/work \
    --extra_salmon_quant_args '--threads 35' \
    --save_align_intermeds

Relevant files

No response

System information

No response

MatthiasZepper commented 2 months ago

Why did you close this issue as completed?

Before 3.15 there was indeed no parameter consolidation because, in my understanding, extra arguments were not meant to replace those specified in the configuration. That feature was introduced with #1248 in version 3.15, so it should have worked in your case...
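
As a rough illustration of what consolidation means here (a later definition of an option replaces an earlier one instead of being appended after it), the following shell sketch merges a default argument list with extra arguments. consolidate_args is a hypothetical helper written for this example. It is not the code from #1248, and the option values are only illustrative.

```shell
#!/usr/bin/env sh
# Hypothetical helper, not the pipeline's implementation: merge arguments so
# that a later "--option" replaces the values of an earlier one. Options keep
# the position at which they were first seen.
consolidate_args() {
    printf '%s\n' "$@" | awk '
        # A new option: remember its first position and reset its values.
        /^--/ { cur = $0; if (!(cur in val)) order[++n] = cur; val[cur] = ""; next }
        # Any other token is a value of the most recent option.
        cur != "" { val[cur] = val[cur] " " $0 }
        END {
            out = ""
            for (i = 1; i <= n; i++) out = out (i > 1 ? " " : "") order[i] val[order[i]]
            print out
        }'
}

# Defaults first, user-supplied extra arguments last.
consolidate_args --outFilterMultimapNmax 20 --quantMode TranscriptomeSAM \
    --outFilterMultimapNmax 50
# prints: --outFilterMultimapNmax 50 --quantMode TranscriptomeSAM
```

A merge like this can only act on the argument strings it is given, so an option written directly into a command template would not be affected by it.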

mtinti commented 2 months ago

Hi, thanks for looking at it. After reading a few closed issues, I thought I was meant to alter the config file:

process {
    withName: 'STAR_ALIGN' {
        // single job
        memory = '80.GB'
        cpus = 30
    }
}

rather than specifying --extra_salmon_quant_args '--threads 35'. If it should have worked, I'm happy to re-open it.

MatthiasZepper commented 2 months ago

Ah, yes, that behavior is indeed to be expected. It escaped my Friday afternoon brain that the argument in question was for setting the number of cores. Indeed, you need a custom config for this, since that parameter is hardcoded as --runThreadN $task.cpus in the module.

So the process uses all CPUs that are allotted to it, and to change that number you need a custom config. The consolidation only applies to the ext.args that are specified in the default config.
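
A minimal sketch of such a custom config, reusing the resource values from the config earlier in this thread. The file name custom.config is an arbitrary choice for this example, and the snippet only writes the file; it does not run the pipeline.

```shell
#!/usr/bin/env sh
# Write a custom Nextflow config that sets the resources for STAR_ALIGN.
# "custom.config" is an assumed file name; any path works.
cat > custom.config <<'EOF'
process {
    withName: 'STAR_ALIGN' {
        cpus   = 30
        memory = 80.GB
    }
}
EOF
```

Passing the file with Nextflow's -c option (for example, adding -c custom.config to the nextflow run command) should make STAR_ALIGN run with 30 CPUs, which the module then forwards to STAR as --runThreadN.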