cattle-ops / terraform-aws-gitlab-runner

Terraform module for AWS GitLab runners on ec2 (spot) instances
https://registry.terraform.io/modules/cattle-ops/gitlab-runner/aws
MIT License
586 stars, 331 forks

spot instances are provisioning synchronously #1180

Closed kr008 closed 2 weeks ago

kr008 commented 3 months ago

Describe the bug

When a few jobs are waiting in the queue and no idle runners are available, new spot requests are created only about once per minute (one after another rather than in parallel), which increases queue time.

To Reproduce

Run multiple pipelines at the same time. Check EC2 spot request logs:

[
{ "CreateTime": "2024-08-23T08:00:22+00:00", "InstanceId": "i-006e8ca11eb9eced3", "LaunchSpecification": { "SecurityGroups": [ { "GroupName": "runner-3-docker-machine20240823055655231100000008", "GroupId": "sg-0dd6538c58ddb1c1a" } ], "BlockDeviceMappings": [ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true, "Iops": 6000, "VolumeSize": 40, "VolumeType": "io2", "Encrypted": false } } ], "IamInstanceProfile": { ... }, "ImageId": "ami-09634b5569ee59efb", "InstanceType": "c5.xlarge", "KeyName": "runner-52qhjpfhu-runner-3-1724400021-386e2f44", "NetworkInterfaces": [ { "AssociatePublicIpAddress": false, "DeviceIndex": 0, "SubnetId": "subnet-xxx" } ], "Placement": { "AvailabilityZone": "eu-west-1c", "Tenancy": "default" }, "Monitoring": { "Enabled": true } }, "LaunchedAvailabilityZone": "eu-west-1c", "ProductDescription": "Linux/UNIX", "SpotInstanceRequestId": "sir-ahk6zbrg", "SpotPrice": "0.500000", "State": "active", "Status": { "Code": "fulfilled", "Message": "Your Spot request is fulfilled.", "UpdateTime": "2024-08-23T08:31:21+00:00" }, "Tags": [], "Type": "one-time", "InstanceInterruptionBehavior": "terminate" },
{ "CreateTime": "2024-08-23T08:01:25+00:00", "InstanceId": "i-06477b16a9de6f87e", "LaunchSpecification": { "SecurityGroups": [ { "GroupName": "runner-3-docker-machine20240823055655231100000008", "GroupId": "sg-0dd6538c58ddb1c1a" } ], "BlockDeviceMappings": [ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true, "Iops": 6000, "VolumeSize": 40, "VolumeType": "io2", "Encrypted": false } } ], "IamInstanceProfile": { ... }, "ImageId": "ami-09634b5569ee59efb", "InstanceType": "c5.xlarge", "KeyName": "xxx", "NetworkInterfaces": [ { "AssociatePublicIpAddress": false, "DeviceIndex": 0, "SubnetId": "subnet-xxx" } ], "Placement": { "AvailabilityZone": "eu-west-1c", "Tenancy": "default" }, "Monitoring": { "Enabled": true } }, "LaunchedAvailabilityZone": "eu-west-1c", "ProductDescription": "Linux/UNIX", "SpotInstanceRequestId": "sir-diapxicg", "SpotPrice": "0.500000", "State": "active", "Status": { "Code": "fulfilled", "Message": "Your Spot request is fulfilled.", "UpdateTime": "2024-08-23T08:31:21+00:00" }, "Tags": [], "Type": "one-time", "InstanceInterruptionBehavior": "terminate" },
{ "CreateTime": "2024-08-23T08:02:34+00:00", "InstanceId": "i-036f25e908bf51b0b", "LaunchSpecification": { "SecurityGroups": [ { "GroupName": "runner-3-docker-machine20240823055655231100000008", "GroupId": "sg-0dd6538c58ddb1c1a" } ], "BlockDeviceMappings": [ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true, "Iops": 6000, "VolumeSize": 40, "VolumeType": "io2", "Encrypted": false } } ], "IamInstanceProfile": { ... }, "ImageId": "ami-09634b5569ee59efb", "InstanceType": "c5.xlarge", "KeyName": "runner-52qhjpfhu-runner-3-1724400153-ec32b6ad", "NetworkInterfaces": [ { "AssociatePublicIpAddress": false, "DeviceIndex": 0, "SubnetId": "subnet-xxx" } ], "Placement": { "AvailabilityZone": "eu-west-1c", "Tenancy": "default" }, "Monitoring": { "Enabled": true } }, "LaunchedAvailabilityZone": "eu-west-1c", "ProductDescription": "Linux/UNIX", "SpotInstanceRequestId": "sir-1p2pymeh", "SpotPrice": "0.500000", "State": "active", "Status": { "Code": "fulfilled", "Message": "Your Spot request is fulfilled.", "UpdateTime": "2024-08-23T08:31:21+00:00" }, "Tags": [], "Type": "one-time", "InstanceInterruptionBehavior": "terminate" },
{ "CreateTime": "2024-08-23T08:03:40+00:00", "InstanceId": "i-0f8c1d77b00a167ab", "LaunchSpecification": { "SecurityGroups": [ { "GroupName": "runner-3-docker-machine20240823055655231100000008", "GroupId": "sg-0dd6538c58ddb1c1a" } ], "BlockDeviceMappings": [ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true, "Iops": 6000, "VolumeSize": 40, "VolumeType": "io2", "Encrypted": false } } ], "IamInstanceProfile": { ... }, "ImageId": "ami-09634b5569ee59efb", "InstanceType": "c5.xlarge", "KeyName": "runner-52qhjpfhu-runner-3-1724400219-9816b87a", "NetworkInterfaces": [ { "AssociatePublicIpAddress": false, "DeviceIndex": 0, "SubnetId": "subnet-xxx" } ], "Placement": { "AvailabilityZone": "eu-west-1c", "Tenancy": "default" }, "Monitoring": { "Enabled": true } }, "LaunchedAvailabilityZone": "eu-west-1c", "ProductDescription": "Linux/UNIX", "SpotInstanceRequestId": "sir-w7tpwy5h", "SpotPrice": "0.500000", "State": "active", "Status": { "Code": "fulfilled", "Message": "Your Spot request is fulfilled.", "UpdateTime": "2024-08-23T08:31:21+00:00" }, "Tags": [], "Type": "one-time", "InstanceInterruptionBehavior": "terminate" }
]
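
To make the spacing in the log above easier to see, here is a minimal Python sketch that computes the gap between consecutive CreateTime values. The timestamps are copied from the four spot requests listed above; the helper name gaps_in_seconds is made up for this illustration and is not part of the module or of any AWS tooling.

```python
from datetime import datetime

# CreateTime values copied from the four spot requests in the log above.
create_times = [
    "2024-08-23T08:00:22+00:00",
    "2024-08-23T08:01:25+00:00",
    "2024-08-23T08:02:34+00:00",
    "2024-08-23T08:03:40+00:00",
]


def gaps_in_seconds(timestamps):
    """Return the seconds between consecutive timestamps (hypothetical helper)."""
    parsed = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(parsed, parsed[1:])]


gaps = gaps_in_seconds(create_times)
print(gaps)       # [63, 69, 66]
print(sum(gaps))  # 198
```

For this sample, each request is created a little over a minute after the previous one, so the fourth request was only created 198 seconds after the first.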

Expected behavior

Spot requests should be created on demand, not at most once per minute.

kayman-mk commented 3 months ago

Not sure if we can control this. It might be internal behavior of docker-machine or the autoscaler.

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 15 days.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been stalled for 15 days with no activity.