BIMSBbioinfo / pigx_rnaseq

Bulk RNA-seq Data Processing, Quality Control, and Downstream Analysis Pipeline
GNU General Public License v3.0

Slurm as a queuing system for pigx-rnaseq? Or generally abstract away from gridengine? #110

Closed smoe closed 2 years ago

smoe commented 2 years ago

Hello,

About 15 years ago I contributed to https://www.nordugrid.org/. One problem those (and other) folks have solved is transforming abstract job descriptions into submissions for a variety of queuing systems. Would you be prepared to abstract your current Grid Engine support into something that is agnostic of the underlying interface? Alternatively, is there a way to add a Slurm interface to pigx_rnaseq directly, so it can work on the typical larger clusters that universities offer?

A quick web search brought my attention to https://pypi.org/project/qbatch/, but I have not yet grasped how it checks for the termination of a job. I expect you can just submit everything interactively, though.

Cheers, Steffen

rekado commented 2 years ago

Yes, we are not tied to Grid Engine and would like to make PiGx scheduler-agnostic. AFAIK snakemake itself supports submission via DRMAA, though we never got around to testing it.

We will need to keep support for Grid Engine, though, be that via DRMAA or some other mechanism, as this is what's used at the MDC.

smoe commented 2 years ago

I just did some reading. There are now Snakemake profiles (https://snakemake.readthedocs.io/en/stable/executing/cli.html#profiles) that describe how Snakemake jobs are to be executed. Such profiles exist for Grid Engine, Slurm (https://github.com/Snakemake-Profiles/slurm), and many others. As you may guess, this is beyond my immediate routine, and I do not yet know whom to ask about experiences.
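To make the profile idea concrete, here is a minimal sketch, not the layout of the linked Slurm profile. It assumes a Snakemake version that still accepts the "--cluster" option, and that a profile is a directory holding a config.yaml whose keys mirror the long command-line options. The helper name write_profile, the directory name and the values are hypothetical.

```python
from pathlib import Path
import tempfile


def write_profile(profile_dir, submit_cmd, jobs):
    """Write a minimal profile: a directory containing a config.yaml file."""
    profile_dir = Path(profile_dir)
    profile_dir.mkdir(parents=True, exist_ok=True)
    config = profile_dir / "config.yaml"
    # Assumption: keys are Snakemake's long options without the leading dashes.
    config.write_text(f'cluster: "{submit_cmd}"\njobs: {jobs}\n')
    return config


# Hypothetical usage: a throwaway profile that submits through sbatch.
with tempfile.TemporaryDirectory() as tmp:
    path = write_profile(Path(tmp) / "slurm", "sbatch", 100)
    profile_text = path.read_text()
    print(profile_text)
```

With such a directory in place, the pipeline's Snakemake call would only need "--profile" followed by its path, so the scheduler choice lives in the profile rather than in the pipeline code.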

I also found the "--cluster" argument, which takes more than a mere "yes": it tells Snakemake how to interact with a batch system, as shown at https://snakemake.readthedocs.io/en/stable/tutorial/additional_features.html#cluster-execution.
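As a rough sketch of how a wrapper could pick the submit command per scheduler, the snippet below assembles a Snakemake call using "--cluster" and "--jobs". The mapping, the function name snakemake_command and the chosen submit flags are illustrative assumptions, not what PiGx does. The "{threads}" placeholder is left for Snakemake to fill in per job, and this assumes a Snakemake version that still provides "--cluster".

```python
import shlex

# Hypothetical mapping from scheduler name to a submit command template.
# The submit flags are site-specific; these are only examples.
SUBMIT_TEMPLATES = {
    "slurm": "sbatch --cpus-per-task={threads}",
    "gridengine": "qsub -cwd -V",
}


def snakemake_command(scheduler, jobs=100):
    """Build the argument list for a cluster run on the given scheduler."""
    submit = SUBMIT_TEMPLATES[scheduler]  # KeyError for unknown schedulers
    return ["snakemake", "--cluster", submit, "--jobs", str(jobs)]


# Print shell-quoted command lines for both schedulers.
for name in SUBMIT_TEMPLATES:
    print(shlex.join(snakemake_command(name, jobs=10)))
```

Keeping the scheduler-specific part in one table means Grid Engine support can stay as it is while Slurm is added as another entry.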