Open hjoliver opened 11 months ago
Conventional batch systems such as Slurm or PBS, though typically deployed at scale, could also be used to manage local instances.
The complexity of setup is not necessarily as bad as you might think; here are a couple of containers I found on Docker Hub:
See also this issue: https://github.com/cylc/cylc-flow/issues/3800
If you use Cylc on a laptop or workstation (i.e., not even a simple cluster), then I think installing a full-blown batch system is a big ask. Even if it is "not necessarily as bad as you might think" (which doesn't sound great, to be honest 😁) figuring out how to use it in such a minimal way might not be easy.
I think #3800 is as good as we're able to get working out of the box, i.e. for this case, use Cylc's ability to gather host metrics to prevent the host from becoming overwhelmed.
After that, no matter what we do, the user is going to have to install a batch system and start a daemon.
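To make the "gate on host metrics" idea concrete, here is a minimal sketch of a load check that a submitter could call before launching a job. This is not how Cylc or #3800 implements it; the function names (`load_per_cpu`, `wait_for_capacity`) and the 0.8 threshold are assumptions made up for illustration, and `os.getloadavg` is only available on Unix-like systems.

```python
import os
import time


def load_per_cpu(get_load=os.getloadavg, cpu_count=os.cpu_count):
    """Return the 1-minute load average divided by the number of CPUs."""
    one_minute, _, _ = get_load()
    return one_minute / (cpu_count() or 1)


def wait_for_capacity(max_load_per_cpu=0.8, poll_seconds=5.0,
                      get_load=os.getloadavg, cpu_count=os.cpu_count,
                      sleep=time.sleep):
    """Block until the host looks idle enough to accept another job.

    The threshold and poll interval are illustrative defaults only.
    Returns the number of times we had to wait before capacity appeared.
    """
    polls = 0
    while load_per_cpu(get_load, cpu_count) > max_load_per_cpu:
        polls += 1
        sleep(poll_seconds)
    return polls
```

The load source, CPU count and sleep function are injectable so the gate can be exercised without actually loading the machine.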
(Note that even `at` isn't necessarily installed; if it is, the daemon isn't necessarily running, and even then it might still require additional configuration, e.g. macOS deliberately disables the `at` service for security reasons.)
Installing and starting a docker slurm cluster might be as simple as `docker up slurm`!
Another popular choice would be Kubernetes, which can be installed locally to develop cloud deployments.
For single-user installations with no batch system, it would be useful to have an admin-free way to limit activity across multiple workflows.
Cylc internal queues help, but they don't see other workflows.
In this sort of situation you don't really need sophisticated resource management; a simple job queuing system would probably suffice, perhaps with limiting based on simple server load metrics.
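As a rough illustration of what an admin-free limit shared across workflows could look like, the sketch below uses a fixed set of lock files in a user-owned directory as job slots. Any process that wants to run a job must first grab a free slot, so the cap applies across independent workflows without a daemon. Everything here (`job_slot`, `run_limited`, the slot directory, the slot count) is hypothetical and not part of Cylc; it relies on `fcntl.flock`, so it is Unix-only.

```python
import fcntl
import os
import subprocess
import time
from contextlib import contextmanager


@contextmanager
def job_slot(slot_dir, max_slots=2, poll_seconds=1.0):
    """Hold one of `max_slots` lock files for the duration of a job."""
    os.makedirs(slot_dir, exist_ok=True)
    while True:
        for n in range(max_slots):
            path = os.path.join(slot_dir, f"slot-{n}.lock")
            fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
            try:
                # Non-blocking: fails immediately if another holder has it.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                os.close(fd)
                continue  # slot busy, try the next one
            try:
                yield n
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)
                os.close(fd)
            return
        # Every slot is taken; wait and scan again.
        time.sleep(poll_seconds)


def run_limited(command, slot_dir, max_slots=2):
    """Run `command` (an argv list) once a slot is free; return its exit code."""
    with job_slot(slot_dir, max_slots):
        return subprocess.run(command, check=False).returncode
```

A job wrapper could then call something like `run_limited(["my-task.sh"], os.path.expanduser("~/.cache/job-slots"))`, where both the script and the directory are placeholders. Locks are released by the kernel if the holder dies, which avoids stale slots, but this gives no fairness or ordering guarantees.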
Cylc has long supported the basic `at` scheduler, but only for `at now` instant job submission, which is now no better than our built-in background job management.

Ideas:

atd batch

The `at` scheduler has a `batch` command that only releases jobs if server load is below some limit. It would be trivial for Cylc to support this. However:
- `atd` only releases batch jobs one at a time, once per minute 🤯
- `atd` starts up

gnu Task Spooler
An old project that has been resurrected recently-ish. Might be worth considering.
other??
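
For reference, the `atd batch` idea above amounts to piping the job command into `batch` on standard input instead of `at now`. The following is only a sketch of that hand-off, not Cylc's job runner code; `submit_to_batch` is a hypothetical helper, and it assumes `batch` is installed and `atd` is running.

```python
import shutil
import subprocess


def submit_to_batch(command, runner=subprocess.run, which=shutil.which):
    """Queue a shell command with `batch`; atd releases it when load permits.

    Returns whatever `batch` printed on stderr (typically including the
    job id line). `runner` and `which` are injectable for testing.
    """
    exe = which("batch")
    if exe is None:
        raise FileNotFoundError(
            "the `batch` command (part of `at`) is not installed")
    # `batch` reads the commands to execute from standard input.
    result = runner([exe], input=command + "\n", text=True,
                    capture_output=True, check=True)
    return result.stderr.strip()
```

The release rate and load limit are still controlled by `atd`, so this does nothing about the drawbacks listed under that idea.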