IBM-Cloud / hpc-cluster-lsf

IBM Spectrum LSF - IBM Cloud
https://cloud.ibm.com/docs/ibm-spectrum-lsf?topic=ibm-spectrum-lsf-getting-started-tutorial
Apache License 2.0
10 stars 9 forks source link

Enable LSF Resource Connector to fail over to other zones in the MZR #22

Open danielo2023 opened 1 year ago

danielo2023 commented 1 year ago

Background: LSF on IBM Cloud Schematic provisions an LSF cluster into a VPC with a specific zone in an MZR; when LSF Resource Connector requests additional compute nodes they are only provisioned in the MZR Zone where LSF was originally provisioned, this results in a non-MZR like environment where everything is provisioned in a single zone, thus if the zone does not have specified resources in that zone the Resource Connector Fails to provision the nodes and simply continues to retry over and over again causing the workload to be delayed until current resources come available meeting the resource needs.

Requirement: Please enhance LSF Resource Connector to enable deployment option that would roll over to other zones after N retries to provision in the primary zone.