hpc-carpentry / old-hpc-workflows

Scaling studies on high-performance clusters using Snakemake workflows
https://www.hpc-carpentry.org/old-hpc-workflows/

Consider Parsl #35

Closed: tkphd closed this issue 1 year ago

tkphd commented 1 year ago

NERSC's Snakemake docs list Snakemake's "cluster mode" as a disadvantage, since it submits each "rule" as a separate job, thereby spamming the scheduler with dependent tasks. The main Snakemake process also resides on the login node until all jobs have finished, occupying some resources.
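
For concreteness, the pattern in question looks something like this (a minimal sketch; the rule, file names, and resource values are made up):

```
# Snakefile (sketch): scale an Amdahl run over several task counts
rule all:
    input: expand("runs/amdahl_{n}.out", n=[1, 2, 4, 8])

rule amdahl_run:
    output: "runs/amdahl_{n}.out"
    shell: "mpirun -n {wildcards.n} amdahl > {output}"
```

Invoked in cluster mode, e.g. `snakemake --jobs 8 --cluster "sbatch -N 1 -t 10"`, each instance of `amdahl_run` becomes its own `sbatch` submission, while the parent `snakemake` process sits on the login node until they all finish.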

NERSC specifically documents Parsl as the recommended alternative for multi-node jobs. I was aware of Parsl as a Python library for parallel programming, but had not realized that it can dispatch work directly to Slurm (and possibly other schedulers).

This synergy suggests Parsl as a viable alternative to Snakemake, since it (a) would integrate readily with the Python-based Amdahl code and (b) could form the basis of a Programming for HPC lesson with thematic callbacks to this prior lesson in the workshop.
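
As a rough illustration of what that could look like (a sketch only, not a tested configuration: the partition, walltime, block counts, and the `amdahl` invocation are placeholders):

```python
import parsl
from parsl import bash_app
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import SlurmProvider

# Describe how Parsl should request Slurm resources; all values are placeholders.
config = Config(
    executors=[
        HighThroughputExecutor(
            label="slurm_htex",
            provider=SlurmProvider(
                partition="debug",       # placeholder partition name
                nodes_per_block=1,
                init_blocks=1,
                max_blocks=4,
                walltime="00:10:00",
            ),
        )
    ]
)
parsl.load(config)

@bash_app
def amdahl(ntasks, stdout=None, stderr=None):
    # Each call becomes a task dispatched into the Slurm allocation.
    return f"mpirun -n {ntasks} amdahl"

# Launch a small scaling study and wait for all runs to complete.
futures = [amdahl(n, stdout=f"amdahl-{n}.out") for n in (1, 2, 4, 8)]
results = [f.result() for f in futures]
```

Dependencies between tasks are expressed by passing one app's future as an argument to another, so the DAG that Snakemake infers from file names would instead be written explicitly in Python.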

ocaisa commented 1 year ago

I like that Parsl supports Flux (see https://parsl-project.org/parslfest2021-files/corbett-flux.pdf); to me, running your own scheduler is the "proper" way to execute thousands of tasks within the confines of a large overall job. It looks quite similar to what I've done with Dask in the past (decorators and so on), although with Dask we had to write our own extension to handle MPI tasks. I will look into it more.

Having said that, wouldn't the solution get a little complex if we switched? We'd have to explain decorators and make Python a full prerequisite. Does Parsl support other schedulers?

tkphd commented 1 year ago

Having discussed this at the Coordination meeting, we concluded that the Python-centric nature of Parsl (function decorators) makes it too narrowly scoped, and it would require Python as a workshop prerequisite.

tl;dr: we're sticking with Snakemake.