nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

Handling of argument override for custom STAR options II #1046

Closed CHoeltermann closed 1 year ago

CHoeltermann commented 1 year ago

Description of the bug

Hi everyone,

I am confused because I am hitting a bug that seems to have been fixed in an earlier version of nf-core/rnaseq:

https://github.com/nf-core/rnaseq/issues/1002#issue-1677445038

But I encounter the same problem in version 3.12.0 ...

Any help is greatly appreciated!

Command used and terminal output

nextflow run -bg nf-core/rnaseq -r 3.12.0 --input ../data/samplesheet.csv --aligner star_salmon --extra_star_align_args '--outFilterMultimapNmax 100' --genome GRCh38 --outdir out -profile docker

EXITING: FATAL INPUT ERROR: duplicate parameter "outFilterMultimapNmax" in input "Command-Line"
SOLUTION: keep only one definition of input parameters in each input source

Relevant files

nextflow.log

System information

Version: 23.04.1 build 5866
Created: 15-04-2023 06:51 UTC
System: Linux 5.4.0-42-generic
Runtime: Groovy 3.0.16 on OpenJDK 64-Bit Server VM 11.0.15-internal+0-adhoc..src
Encoding: UTF-8 (UTF-8)
CPUs: 72 - Mem: 251.5 GB (44.1 GB) - Swap: 8 GB (6.7 GB)

Executed locally with a conda installation of Nextflow: nextflow 23.04.1 h2a3209d_3 bioconda

MatthiasZepper commented 1 year ago

This is a deliberate choice. My fix in #934 does not (and was never intended to) eliminate a parameter duplication with dissimilar specifications.

Providing --outFilterMultimapNmax 100 to --extra_star_align_args still causes a parameter clash, because the parameter value in the config is 20:

https://github.com/nf-core/rnaseq/blob/3bec2331cac2b5ff88a1dc71a21fab6529b57a0f/conf/modules.config#L566-L579

--extra_star_align_args is intended as a convenience parameter that should never take precedence over the actual module configuration. It seemed more error-prone to silently overwrite the module config with something that (as the parameter name suggests) is meant to be extra. --outFilterMultimapNmax therefore has to be changed via a custom module config.
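To illustrate the distinction between identical and dissimilar duplicates, here is a minimal Python sketch. It is not the pipeline's actual Groovy code. It assumes that the module defaults and the extra arguments are merged into one list and that only exactly identical strings are dropped, which matches the behaviour described above. The function names combine_args and duplicated_flags are hypothetical, and MODULE_ARGS is only a subset of the defaults.

```python
import re

# Subset of the module defaults discussed in this thread (assumed for illustration).
MODULE_ARGS = [
    "--quantMode TranscriptomeSAM",
    "--twopassMode Basic",
    "--outFilterMultimapNmax 20",
]


def combine_args(module_args, extra_args):
    """Merge defaults with user extras, dropping only exact duplicate strings."""
    # Split the user string into one "--flag value" chunk per flag.
    extras = re.split(r"\s(?=--)", extra_args.strip()) if extra_args else []
    combined = []
    for arg in list(module_args) + extras:
        if arg and arg not in combined:
            combined.append(arg)
    return combined


def duplicated_flags(args):
    """Return the flag names that still occur more than once."""
    names = [arg.split()[0] for arg in args]
    return sorted({name for name in names if names.count(name) > 1})


# Same flag and same value: the exact duplicate is removed, so no clash remains.
same = combine_args(MODULE_ARGS, "--outFilterMultimapNmax 20")
print(duplicated_flags(same))   # []

# Same flag, different value: both strings survive, and STAR would see the flag twice.
clash = combine_args(MODULE_ARGS, "--outFilterMultimapNmax 100")
print(duplicated_flags(clash))  # ['--outFilterMultimapNmax']
```

Under these assumptions, the second case is the one that produces the "duplicate parameter" error reported above.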

If you strongly feel this behaviour is inconsistent and unexpected from a user perspective, I am open to reconsidering. In that case, please take it to Slack, where we can get quicker and more comprehensive user feedback.

CHoeltermann commented 1 year ago

This is truly unexpected for me.

So the fix would be to provide a YAML file with the desired options via -params-file?

Thanks!

MatthiasZepper commented 1 year ago

No, that will not work either. An ext.args setting that is already configured in modules.config can only be altered by a custom tool config, just as you would have to do if the pipeline had no extra_star_align_args parameter in the first place.

CHoeltermann commented 1 year ago

Thank you so much for your help! I think I will take this issue to slack, since I was very surprised by this behaviour. I think the pipeline would be improved by being able to change these default options.

MatthiasZepper commented 1 year ago

I still prefer it the way it is, but as I said, I am open to changing my mind if people on Slack are clearly in favour of this change. I would be even more readily persuaded if you subsequently opened a pull request that implements it!

drpatelh commented 1 year ago

Hi @CHoeltermann! As @MatthiasZepper mentioned, if you need to overwrite an existing parameter specified in modules.config, then the only solution is to copy out the entire config block for that process and provide it to the pipeline via the -c option.

So in your case, the custom config would look something like this:

process {
    withName: '.*:ALIGN_STAR:STAR_ALIGN|.*:ALIGN_STAR:STAR_ALIGN_IGENOMES' {
        ext.args = [
            '--quantMode TranscriptomeSAM',
            '--twopassMode Basic',
            '--outSAMtype BAM Unsorted',
            '--readFilesCommand zcat',
            '--runRNGseed 0',
            '--outFilterMultimapNmax 100',
            '--alignSJDBoverhangMin 1',
            '--outSAMattributes NH HI AS NM MD',
            '--quantTranscriptomeBan Singleend',
            '--outSAMstrandField intronMotif'
        ].join(' ').trim()
    }
}

I don't think we are going to have a fix for this, at least until we transition to a new config format, so I will close this issue for now. Please feel free to join the nf-core Slack Workspace for any future questions/issues. We have an #rnaseq channel where you can get more real-time help.