nf-core / modules

Repository to host tool-specific module files for the Nextflow DSL2 community!
https://nf-co.re/modules
MIT License

new subworkflow: distributed computing GRIDSS subworkflow #4498

Open johnoooh opened 10 months ago

johnoooh commented 10 months ago

Is there an existing subworkflow for this?

Is there an open PR for this?

Is there an open issue for this?

Are you going to work on this?

johnoooh commented 10 months ago

There is currently a GRIDSS module in nf-core/modules, but it can be slow when running very complex tumor-normal pairs. In an HPC environment, distributed computing can speed up the calling. GRIDSS has commands that allow for distributed computing, but they should be broken up into separate modules. Unfortunately, these distributed jobs should be run in the same working directory, which Nextflow can't really do because each process task runs in its own work directory. The individual steps create many intermediate files, which must be passed from one process to the next and then renamed to the names the next step expects. I have done it this way and it works, but I am of course open to suggestions on better ways to do this.
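To make the "pass along and rename" step concrete, here is a minimal Python sketch of the idea, not the code used in the subworkflow. It copies intermediate files from one step's directory into the next step's directory under the names that step expects. The function name `stage_intermediates`, the directory names, and the file names (`chunk_0.out`, `tumor.working/...`) are hypothetical placeholders, not actual GRIDSS outputs.

```python
import shutil
import tempfile
from pathlib import Path


def stage_intermediates(upstream_files, workdir, rename_map):
    """Copy upstream files into workdir under the names the next step expects.

    rename_map maps an upstream file name to a path relative to workdir.
    Files without an entry keep their original name.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in map(Path, upstream_files):
        dest = workdir / rename_map.get(src.name, src.name)
        # The expected name may sit in a subdirectory that does not exist yet.
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        staged.append(dest)
    return staged


def demo():
    """Simulate two isolated work directories with placeholder files."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        step1 = tmp / "work_step1"  # hypothetical work dir of the first step
        step1.mkdir()
        (step1 / "chunk_0.out").write_text("placeholder")
        (step1 / "metrics.txt").write_text("placeholder")

        step2 = tmp / "work_step2"  # hypothetical work dir of the next step
        staged = stage_intermediates(
            sorted(step1.iterdir()),
            step2,
            {"chunk_0.out": "tumor.working/tumor.chunk_0.out"},
        )
        return sorted(p.relative_to(step2).as_posix() for p in staged)
```

Calling `demo()` returns the staged paths relative to the second directory. The sketch copies files for portability; inside a workflow engine, symlinks or the engine's own staging options would likely be the cheaper choice.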