The C2_CPUS quota is exceeded when submitting a job to a Slurm cluster deployed on Google Cloud Platform.
Current Behaviour
When submitting a job to the cluster, I got the following error:
"Quota C2_CPUS exceeded. Limit: 24.0 in region us-west1.". Details: "[{message: "Quota C2_CPUS exceeded. Limit: 24.0 in region us-west1.", domain: usageLimits, reason: quotaExceeded}]"
Possible Solution
Can we increase this quota, or point me to another region with a higher C2_CPUS limit?
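For picking an alternative region, here is a minimal sketch of how the C2_CPUS limit could be compared across regions, assuming the google-api-python-client package is installed and Application Default Credentials are configured for the foss-fpga-tools-ext-vtr project:

```python
from googleapiclient import discovery

# Sketch: print usage/limit of the C2_CPUS quota in every region of the
# project, so a region with a higher limit can be chosen. Assumes ADC
# credentials and the google-api-python-client package are available.
PROJECT = "foss-fpga-tools-ext-vtr"

compute = discovery.build("compute", "v1")
request = compute.regions().list(project=PROJECT)
while request is not None:
    response = request.execute()
    for region in response.get("items", []):
        for quota in region.get("quotas", []):
            if quota.get("metric") == "C2_CPUS":
                print(f'{region["name"]}: {quota["usage"]} / {quota["limit"]} C2_CPUS')
    request = compute.regions().list_next(previous_request=request,
                                          previous_response=response)
```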
Steps to Reproduce
The steps for creating a new Slurm cluster, installing the dependencies, and running a job are detailed in Doc.
You can use the existing cluster in the foss-fpga-tools-ext-vtr project on GCP; the cluster name is vtr-batch-cluster. Add your public key to the SSH keys section of both the login and control nodes to be able to SSH into them, and make sure the SSH key is created with the username slurm. Then follow the steps for running a job on the login node detailed in Doc (a minimal submission sketch is shown below).
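For reference, a minimal job-submission sketch from the login node; the actual VTR job script and its options are the ones described in the Doc, and vtr_job.sh below is only a hypothetical placeholder name:

```python
import subprocess

# Sketch: submit a batch job from the login node as the slurm user.
# "vtr_job.sh" is a hypothetical placeholder; the real job script and
# its options are the ones described in the Doc linked above.
result = subprocess.run(
    ["sbatch", "vtr_job.sh"],
    capture_output=True,
    text=True,
)
# sbatch prints e.g. "Submitted batch job <id>" on success; with the
# quota exhausted, the job later fails when compute nodes cannot start.
print(result.stdout.strip() or result.stderr.strip())
```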
Context
I am trying to build a Slurm cluster on Google Cloud Platform (GCP) to run VTR in batch mode. The cluster is deployed successfully, but the C2_CPUS limit is causing the compute nodes to go down, and the job then fails.
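A small sketch of how the failing nodes can be inspected, assuming it is run on the login or control node where the standard Slurm CLI is available:

```python
import subprocess

# Sketch: show each node's state and the reason it is down/drained,
# using standard sinfo format specifiers (%N nodelist, %T state, %E reason).
sinfo = subprocess.run(
    ["sinfo", "--Node", "--format=%N %T %E"],
    capture_output=True,
    text=True,
    check=True,
)
print(sinfo.stdout)
```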