oracle-quickstart / oci-hpc

Terraform examples for deploying HPC clusters on OCI
Universal Permissive License v1.0

Slurm enters provisioning loop even when resources are unavailable #22

Open cbutakoff opened 1 year ago

cbutakoff commented 1 year ago

I'm not sure whether this can be resolved, but would it be possible to check that the nodes are available before provisioning the cluster network, rather than provisioning and then waiting for an error?

arnaudfroidmont commented 6 months ago

When provisioning the Cluster Network, the first step is a reservation of the nodes, and that reservation fails almost right away if the nodes are unavailable.
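
For illustration only, here is a minimal Python sketch of how a caller could use that fast reservation failure to avoid an endless provisioning loop. It is not code from this repository: `CapacityUnavailable`, `provision_with_backoff`, and the `reserve` callable are hypothetical names, and the sketch assumes the reservation step reports missing capacity by raising an exception.

```python
import time
from typing import Callable


class CapacityUnavailable(Exception):
    """Hypothetical error: the node reservation was rejected for lack of capacity."""


def provision_with_backoff(
    reserve: Callable[[], None],
    max_attempts: int = 3,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try the reservation a bounded number of times instead of looping forever.

    `reserve` is a hypothetical stand-in for the first provisioning step.
    Returns True once the reservation succeeds, or False when every
    attempt failed with CapacityUnavailable.
    """
    for attempt in range(max_attempts):
        try:
            reserve()
            return True
        except CapacityUnavailable:
            if attempt == max_attempts - 1:
                break  # out of attempts: give up rather than retry again
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return False
```

Because the reservation is expected to fail quickly, each failed attempt costs little, and the cap on attempts is what stops the loop when capacity never becomes available.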