nf-core / scrnaseq

A single-cell RNAseq pipeline for 10X genomics data
https://nf-co.re/scrnaseq
MIT License

Update to the latest simpleaf #312

Open DongzeHE opened 3 months ago

DongzeHE commented 3 months ago

Description of feature

Dear scrnaseq team,

Thank you very much for including simpleaf in scrnaseq.

Recently, we made major changes to simpleaf, including adding new features and fixing bugs.

We noticed that scrnaseq currently uses an old version of simpleaf, so we would like to discuss the possibility of upgrading to the latest version and exposing the new features it provides.

Tagging @rob-p here in case I missed anything.

Best, Dongze

grst commented 3 months ago

Hi @DongzeHE,

we are of course happy to support the latest version of simpleaf and would appreciate a PR. As usual, it would be great to first update the module in nf-core/modules.

We completely reorganized our simpleaf workflow module and provided pre-built workflow templates for analyzing data from CITE-seq, 10X feature barcoding, etc.

Do you envisage any additional pipeline-level parameters would be needed to support that? Or do you think the --protocol parameter we already have is enough?

Best, Gregor

DongzeHE commented 3 months ago

Hi @grst,

Thanks for the reply! For the parameters, I think there are two ways to go:

  1. We can discuss which new parameters we should include. IMO there are two:
    • --decoy-paths in simpleaf_index: We can expose this parameter directly, or add a parameter indicating whether the provided genome file should be used as the decoy. Because of the way we designed the decoy, the quant step can ignore the decoy part of the index, even if the index was built with a decoy, by setting the --no-poison flag (maybe expose --no-poison as well?).
    • --no-piscem: As we support both piscem (the default) and salmon as the indexer/mapper, it would also be great if we could expose a switch between them.

Tagging @rob-p here in case I missed anything.

  2. We can expose all simpleaf options in a subsection of params, and give each of them the default value it has in simpleaf.

It would be great if you could provide some advice on which way we should go, exposing all options or only the most essential ones. Once we figure this out, I am very happy to work on this and submit a PR.

Best, Dongze

grst commented 3 months ago

Sounds good. I think we should only expose the most frequently used options at the pipeline level (and those that require an additional input file). Users can still set arbitrary tool options via a config file, e.g.:

process {
    withName: SIMPLEAF {
        ext.args = "--no-piscem"
    }
}
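
For comparison, here is a rough sketch of how a pipeline-level switch (the first option discussed above) could feed the same ext.args mechanism. It is only an illustration: the parameter name simpleaf_no_piscem is hypothetical and not an existing scrnaseq parameter, the SIMPLEAF selector is copied from the snippet above and may not match the actual process names, and which simpleaf steps accept --no-piscem should be checked against the simpleaf documentation.

```groovy
// Hypothetical pipeline-level switch; NOT an existing scrnaseq parameter.
params.simpleaf_no_piscem = false

process {
    // Selector copied from the example above; the real process names may differ.
    withName: SIMPLEAF {
        // The closure is evaluated when the task runs, so a value given on the
        // command line or in another config file overrides the default above.
        ext.args = { params.simpleaf_no_piscem ? '--no-piscem' : '' }
    }
}
```

With a file like this passed via -c, the switch could then be turned on with --simpleaf_no_piscem on the command line, while less common options would still go through a hand-written ext.args as shown earlier.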