FNNDSC / pman

A process management system written in Python
MIT License

Preemptible schedulers #208

Open jennydaman opened 2 years ago

jennydaman commented 2 years ago

On E2 (the BCH internal SLURM cluster), a partition is available for preemptible jobs. pman should support preemptible schedulers by detecting interrupted jobs and retrying them.

The same logic could also apply to Kubernetes, which may need to reschedule a pod under certain circumstances (for example, node down or OOMKilled).
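A rough sketch of what the retry logic might look like, independent of the scheduler. Nothing here is existing pman code: `run_with_retries`, `JobResult`, `RETRYABLE_STATES`, and the `submit_and_wait` callback are hypothetical names used for illustration. `PREEMPTED` and `NODE_FAIL` are SLURM job states and `OOMKilled` is a Kubernetes container termination reason; treating all three as retryable is an assumption, as is the default limit of three attempts.

```python
from dataclasses import dataclass
from typing import Callable

# Final states taken to mean "interrupted by the scheduler" rather than
# "the job itself failed". Which states belong here is an assumption.
RETRYABLE_STATES = frozenset({"PREEMPTED", "NODE_FAIL", "OOMKilled"})


@dataclass
class JobResult:
    state: str     # final state reported by the scheduler on the last attempt
    attempts: int  # number of times the job was submitted


def run_with_retries(
    submit_and_wait: Callable[[], str],
    max_attempts: int = 3,
) -> JobResult:
    """Resubmit a job while its final state is retryable.

    submit_and_wait is a hypothetical callback that submits the job,
    blocks until it ends, and returns the scheduler's final state string.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    state = ""
    for attempt in range(1, max_attempts + 1):
        state = submit_and_wait()
        if state not in RETRYABLE_STATES:
            # Finished or failed on its own terms: do not retry.
            return JobResult(state, attempt)
    # Still interrupted after the last allowed attempt.
    return JobResult(state, max_attempts)


# Usage: a fake scheduler that preempts the job once, then lets it finish.
fake_states = iter(["PREEMPTED", "COMPLETED"])
result = run_with_retries(lambda: next(fake_states))
```

Here `result.state` is `"COMPLETED"` and `result.attempts` is `2`. A job that ends in any other state, such as `FAILED`, is returned after a single attempt, so genuine job errors are not retried.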