break-through-cancer / btc-spatial-pipelines

Nextflow spatial pipelines (for Spatial Transcriptomics, IMC, etc.)
MIT License

Initialize nf-core Template, Add BayesTME Module #1

Closed jeffquinn-msk closed 1 year ago

oandrefonseca commented 1 year ago

From a quick look:

  1. Importing a subworkflow from the modules folder seems a bit messy. I would also keep an actual copy in the pipeline. I guess you could also add it to the BTC Catalog repository.

  2. On that note, as Atul said, I would bring the modules and everything else properly into the pipeline, with modules and subworkflows in their respective folders.

  3. I would write a few Nextflow profiles based on distinct HPC settings.

jeffquinn-msk commented 1 year ago

Updated so that I only import modules from the modules folder.

I think 3. is out of scope for this PR, but do you have an example of what that would look like from the RNA pipeline?

In terms of just running on SLURM or LSF, I feel like nf-core handles all that already? I have tagged all my processes with the relevant standard nf-core resource labels, which should allow resources to be assigned appropriately.
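For context, here is a minimal sketch of how nf-core style resource labels are usually resolved in a pipeline's base configuration. The file name and the numbers are illustrative assumptions, not values taken from this PR; a process opts in by declaring something like `label 'process_medium'`.

```groovy
// conf/base.config (illustrative sketch; resource values are placeholders)
process {
    // Fallback resources for processes that carry no label
    cpus   = 1
    memory = 6.GB
    time   = 4.h

    // Each selector applies to processes declaring the matching label
    withLabel: process_low {
        cpus   = 2
        memory = 12.GB
        time   = 4.h
    }
    withLabel: process_medium {
        cpus   = 6
        memory = 36.GB
        time   = 8.h
    }
    withLabel: process_high {
        cpus   = 12
        memory = 72.GB
        time   = 16.h
    }
}
```

An institution-specific profile can then override any of these selectors without touching the modules themselves.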

oandrefonseca commented 1 year ago

Hi Jeff,

As for the third point, I do have an example on my dev branch: https://github.com/WangLab-ComputationalBiology/btc-scrna-pipeline/blob/dev/conf/mdanderson.config.

I created this profile to ensure that the queue requirements match the task settings. Why? I increased the task settings (CPU and memory) to speed up the runs on the HPC. Since this is an HPC-specific profile, I am using as many resources as possible.

On that note, my understanding is that the settings in the nf-core modules are more of a "reference" than reality. I say that based on the Cellranger ones.
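A rough sketch of what matching the queue to the task settings can look like in a Nextflow config. This is not the content of the linked mdanderson.config; the file name, queue names, thresholds, and resource values are all hypothetical placeholders.

```groovy
// conf/example_hpc.config (hypothetical; queue names and limits are placeholders)
process {
    executor = 'lsf'

    // Default walltime so the queue rule below always has a value to compare
    time = 4.h

    // Dynamic directive: choose the queue from the walltime each task requests
    queue = { task.time <= 3.h ? 'short' : 'long' }

    // Raise the resources for heavy processes on this cluster
    withLabel: process_high {
        cpus   = 24
        memory = 128.GB
        time   = 24.h
    }
}
```

With a rule like this, a task that requests more walltime is sent to a queue that allows it instead of being rejected by the scheduler.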

Yes, Nextflow handles SLURM/LSF, but there are some settings that you might want to configure explicitly, e.g., perJobMemLimit on LSF. Also, module load commands and any other "requirements" could be added to the profiles. The latter can be HPC-specific (e.g., module versions).

Finally, I suppose it would be nice to have institution-based profiles already in place, to ensure portability across HPC systems. Meanwhile, we do have the cloud provider.
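As a sketch, an institution profile carrying these settings might look like the following. The profile name, module name, and module version are assumptions for illustration; `executor.perJobMemLimit` and `beforeScript` are standard Nextflow options, but the right values depend on how the cluster is configured.

```groovy
// nextflow.config (sketch; profile name and module version are hypothetical)
profiles {
    example_lsf {
        process {
            executor     = 'lsf'
            // Runs before each task script; the module name/version is site-specific
            beforeScript = 'module load singularity/3.7.1'
        }
        executor {
            // LSF only: treat memory requests as per-job limits rather than per-slot
            perJobMemLimit = true
        }
        singularity {
            enabled = true
        }
    }
}
```

Such a profile would be selected at launch with `nextflow run main.nf -profile example_lsf`.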

-- I might be overthinking, haha --

jeffquinn-msk commented 1 year ago

> As for the third point, I do have this example on my dev branch, https://github.com/WangLab-ComputationalBiology/btc-scrna-pipeline/blob/dev/conf/mdanderson.config.

Ah ok, understood. I'll address this in a future PR.