aws-samples / aws-eda-slurm-cluster

AWS Slurm Cluster for EDA Workloads
MIT No Attribution

[FEATURE] Add multi-region support for ParallelCluster #150

Open cartalla opened 1 year ago

cartalla commented 1 year ago

Is your feature request related to a problem? Please describe.

The legacy version supported compute nodes in multiple AZs and regions. I don't think ParallelCluster is likely to implement orchestration of compute nodes in multiple regions from a single cluster. I would still like jobs that can't run because of capacity limitations to be able to run in a different region where capacity is available.

Describe the solution you'd like

From talking to SchedMD, it may be possible to use federated clusters in different regions and prioritize them somehow. This needs further investigation before I can propose a concrete solution.

It's currently unclear how much demand there is for this, so if you need this capability, please comment.