fedora-copr / copr

RPM build system - upstream for https://copr.fedorainfracloud.org/
Configure Fedora Copr to take EC2 workers from multiple locations #3279

Open praiskup opened 1 month ago

praiskup commented 1 month ago

This is currently causing problems for the powerful SPOT builders, which often fail to allocate:

An error occurred (InsufficientInstanceCapacity) when calling the RunInstances operation (reached max retries: 2): We currently do not have sufficient c7i.8xlarge capacity in zones with support for 'gp3' volumes. Our system will be working on provisioning additional capacity.

See also https://github.com/praiskup/resalloc-aws/pull/9

@kwk @FrostyX fyi

praiskup commented 1 month ago

Tried with io1:

We currently do not have sufficient c7i.8xlarge capacity in zones with support for 'io1'

Tried with st1:

An error occurred (Unsupported) when calling the RunInstances operation: Your requested instance type (c7i.8xlarge) is not supported in your requested Availability Zone (us-east-1a). Please retry your request by not specifying an Availability Zone or choosing us-east-1b, us-east-1c, us-east-1d, us-east-1e, us-east-1f.

Changing the location doesn't help:

An error occurred (InsufficientInstanceCapacity) when calling the RunInstances operation (reached max retries: 2): We currently do not have sufficient c7i.8xlarge capacity in the Availability Zone you requested (us-east-1d). Our system will be working on provisioning additional capacity. You can currently get c7i.8xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1b, us-east-1c, us-east-1e, us-east-1f

Once more:

An error occurred (InsufficientInstanceCapacity) when calling the RunInstances operation (reached max retries: 2): We currently do not have sufficient c7i.8xlarge capacity in the Availability Zone you requested (us-east-1b). Our system will be working on provisioning additional capacity. You can currently get c7i.8xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1c, us-east-1d, us-east-1e, us-east-1f.

Yet these seem to be available in those locations:


$ resalloc-aws-minimal-spot-zone --instance-type c7i.8xlarge --region us-east-1
us-east-1b: 0.683400
us-east-1c: 0.771200
us-east-1d: 0.688900
us-east-1e: 0.687700
us-east-1f: 0.692000
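
A minimal sketch of the idea in the issue title: try several availability zones in turn instead of pinning one. It assumes the `zone: price` listing format shown above and walks the zones from cheapest to most expensive. `InsufficientCapacity`, `parse_zone_prices`, `allocate_in_cheapest_zone` and the `launch` callback are hypothetical names used for illustration only; they are not part of resalloc-aws or Copr, and a real `launch` would have to wrap the actual RunInstances call and translate its InsufficientInstanceCapacity error.

```python
from typing import Callable, Dict, Tuple


class InsufficientCapacity(Exception):
    """Hypothetical stand-in for the InsufficientInstanceCapacity error."""


def parse_zone_prices(text: str) -> Dict[str, float]:
    """Parse lines of the form 'us-east-1b: 0.683400' into {zone: price}."""
    prices: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and anything that is not 'zone: price'
        zone, price = line.rsplit(":", 1)
        prices[zone.strip()] = float(price)
    return prices


def allocate_in_cheapest_zone(
    prices: Dict[str, float],
    launch: Callable[[str], str],
) -> Tuple[str, str]:
    """Try zones from cheapest to most expensive.

    `launch(zone)` is an assumed callback that returns an instance ID or
    raises InsufficientCapacity. Returns (zone, instance_id) for the first
    zone that succeeds; raises InsufficientCapacity if every zone fails.
    """
    tried = []
    for zone in sorted(prices, key=lambda z: prices[z]):
        try:
            return zone, launch(zone)
        except InsufficientCapacity:
            tried.append(zone)  # remember the failure, move to the next zone
    raise InsufficientCapacity("no capacity in any of: " + ", ".join(tried))


if __name__ == "__main__":
    listing = """
    us-east-1b: 0.683400
    us-east-1c: 0.771200
    us-east-1d: 0.688900
    us-east-1e: 0.687700
    us-east-1f: 0.692000
    """

    def fake_launch(zone: str) -> str:
        # Simulate the two zones that reported no capacity in this thread.
        if zone in ("us-east-1b", "us-east-1d"):
            raise InsufficientCapacity(zone)
        return "i-fake-" + zone

    print(allocate_in_cheapest_zone(parse_zone_prices(listing), fake_launch))
```

Sorting by price is only one possible policy. As the errors above show, a low spot price in a zone does not guarantee that capacity is available there, so the fallback loop matters more than the ordering.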

praiskup commented 1 month ago

I switched this to https://instances.vantage.sh/aws/ec2/c7a.8xlarge temporarily (a hotfix that gets removed with the next playbook run).

praiskup commented 1 month ago

Per @kwk's report, we still have trouble with those builders, and I suspect the reason is that we allocate SPOT instances that get terminated after some time. This needs some debugging, but for now I'm going to configure Copr to use on-demand instances. The rest will be done next week.

praiskup commented 4 weeks ago

https://pagure.io/fedora-infra/ansible/c/5acd7c627f87a8aed82bb6329ea8925dcb55a579