insitro / redun

Yet another redundant workflow engine
https://insitro.github.io/redun/
Apache License 2.0
516 stars, 43 forks

SLURM/HPC executor #25

Open multimeric opened 2 years ago

multimeric commented 2 years ago

From what I'm reading in the docs, the three executors are AWS Batch, AWS Glue, and local. However, for HPC users it would be helpful to have a dedicated executor that submits tasks to the cluster's queueing system (e.g. SLURM). A slightly easier way to do this in Python might be to write a Dask executor: since Dask has implementations for many platforms (e.g. http://jobqueue.dask.org/en/latest/), you kind of get this for free.
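To make the Dask idea concrete, here is a rough sketch, assuming the `dask` and `dask-jobqueue` packages are installed and a SLURM cluster is reachable. The names `square`, `run_all`, and `make_slurm_client` are hypothetical and only for illustration, the queue name and resource values are placeholders, and none of this is redun's executor API. The point is that a Dask `Client` exposes the same `submit(...)`/`result()` shape as a standard-library executor, so code written against that shape could probably target either one.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Toy task standing in for a real workflow step."""
    return x * x


def run_all(executor, func, inputs):
    """Submit func(x) for each input and collect results in order.

    Works with any object offering submit(func, *args) -> future with
    .result(), e.g. a ThreadPoolExecutor or a dask.distributed.Client.
    """
    futures = [executor.submit(func, x) for x in inputs]
    return [f.result() for f in futures]


def make_slurm_client(queue="normal", jobs=2):
    """Hypothetical helper: start Dask workers as SLURM jobs.

    Assumes dask-jobqueue is installed; the queue name and the resource
    values below are placeholders, not recommendations.
    """
    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        queue=queue, cores=1, memory="2GB", walltime="00:10:00"
    )
    cluster.scale(jobs=jobs)  # ask SLURM for this many worker jobs
    return Client(cluster)


if __name__ == "__main__":
    # Local stand-in so the sketch runs without a cluster.
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(run_all(pool, square, [1, 2, 3]))
    # On a cluster, something like this should behave the same way:
    #   client = make_slurm_client()
    #   print(run_all(client, square, [1, 2, 3]))
```

A real redun executor would also need to handle things like caching and provenance, which this sketch ignores.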

mattrasmus commented 2 years ago

Thanks @multimeric for posting this issue. You are correct that we intend to add additional executors over time, and HPC clusters are indeed an important use case. Piggybacking off of Dask to get multiple executor backends at once is a great idea to investigate. Thanks for sharing!

Hoeze commented 1 year ago

Hi @mattrasmus, I would be interested in trying out redun as an alternative to Snakemake, but according to the documentation the only viable way to use redun at scale is by running it on AWS. This is a big no-go.

Are there any updates on a SLURM/HPC executor? Is it possible to configure custom executors, as Snakemake allows?

Andrew-S-Rosen commented 1 year ago

+1. I'd also recommend considering the use of PSI/J to streamline such an addition: https://github.com/ExaWorks/psij-python. It is a lightweight dependency with a unified interface to various job schedulers, including up-and-coming ones.