Describe the bug
Certain EC2 instance types are not available in every Availability Zone (AZ). For example, as of writing, Windows g3.4xlarge is available in ap-south-1a and ap-south-1b, but not in ap-south-1c.
For CreateWindowsDesktop, the AZ is not specified for either the DryRun or the actual EC2 instance creation. As a result, an AZ is selected by the boto and troposphere clients, which allows different AZs to be selected for the DryRun and for the EC2 instance creation.
If the dry run selects an AZ where the instance type is available, it passes. However, the actual EC2 instance creation can still use a different AZ, one where the instance type is not available.
This causes the stack to roll back and leaves it in an unclean state. Retrying CreateWindowsDesktop with the same arguments fails because the stack already exists.
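One way to avoid the mismatch would be to resolve a supported AZ once and pass the same subnet to both the dry run and the real launch. The sketch below is not SOCA's code: the helper names (`supported_zones`, `pick_subnet`, `fetch_offerings`) and the subnet IDs are hypothetical. It assumes the EC2 DescribeInstanceTypeOfferings API is a good enough signal; note that this API reports offerings per instance type and does not distinguish Windows from Linux, so it may not capture platform-specific restrictions.

```python
from typing import Iterable, Optional


def supported_zones(offerings: Iterable[dict], instance_type: str) -> set:
    """Return the AZ names that offer `instance_type`.

    `offerings` is the `InstanceTypeOfferings` list returned by EC2
    DescribeInstanceTypeOfferings with LocationType="availability-zone".
    """
    return {o["Location"] for o in offerings if o["InstanceType"] == instance_type}


def pick_subnet(subnet_to_az: dict, zones: set) -> Optional[str]:
    """Return the first subnet (sorted by ID, for determinism) in a supported AZ."""
    for subnet_id in sorted(subnet_to_az):
        if subnet_to_az[subnet_id] in zones:
            return subnet_id
    return None  # no candidate subnet sits in a supported AZ


def fetch_offerings(region: str, instance_type: str) -> list:
    """Query EC2 for AZ-level offerings (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work offline

    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_instance_type_offerings")
    pages = paginator.paginate(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    )
    return [o for page in pages for o in page["InstanceTypeOfferings"]]


# Offline example with hand-written sample data (hypothetical subnet IDs).
sample_offerings = [
    {"InstanceType": "g3.4xlarge", "LocationType": "availability-zone", "Location": "ap-south-1a"},
    {"InstanceType": "g3.4xlarge", "LocationType": "availability-zone", "Location": "ap-south-1b"},
]
zones = supported_zones(sample_offerings, "g3.4xlarge")
chosen = pick_subnet({"subnet-aaa": "ap-south-1c", "subnet-bbb": "ap-south-1a"}, zones)
```

With the sample data, `chosen` is the subnet in ap-south-1a, and the subnet in ap-south-1c is skipped. Using the same subnet for the DryRun and for the stack would keep both calls in one AZ.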
To Reproduce
Try to create a Windows g3.4xlarge desktop in ap-south-1. As stated above, this instance type is available only in ap-south-1a and ap-south-1b, but not in ap-south-1c (as of writing).
Possible scenario 1: Neither the dry run nor the EC2 create uses the ap-south-1c AZ
The creation succeeds and the API returns 200.
Possible scenario 2: The dry run picks the ap-south-1c AZ
The API returns 400 with the error message:
Unable to start Desktop due to: Dry Run error: Dry run failed. Unable to launch capacity due to: An error occurred (Unsupported) when calling the RunInstances operation: Your requested instance type (g3.4xlarge Windows) is not supported in your requested Availability Zone (ap-southeast-1c). Please retry your request by not specifying an Availability Zone or choosing ap-southeast-1a, ap-southeast-1b.
Possible scenario 3: The dry run doesn't pick ap-south-1c and passes, but the EC2 create picks the ap-south-1c AZ
The API returns 200, but the actual EC2 instance creation fails.
The stack event log shows a rollback caused by:
Your requested instance type (g3.4xlarge Windows) is not supported in your requested Availability Zone (ap-southeast-1c). Please retry your request by not specifying an Availability Zone or choosing ap-southeast-1a, ap-southeast-1b. (Service: AmazonEC2; Status Code: 400; Error Code: Unsupported; Request ID: REMOVED; Proxy: null)
This leaves the SOCA environment in a suboptimal state: the stack still exists, and its name is derived from the cluster ID, username, and session name. Future calls with the same parameters fail because they try to re-create an existing stack.
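As a workaround for the leftover stack, a retry could first check whether a stack with the same name is stuck in a failed state and delete it. This is a minimal sketch, not SOCA's implementation: `blocks_retry` and `delete_failed_stack` are hypothetical helpers, and the set of statuses treated as blocking is an assumption based on the CloudFormation statuses from which a stack can only be deleted.

```python
# CloudFormation statuses after a failed creation. A stack in one of these
# states keeps its name reserved and has to be deleted before re-creation.
BLOCKING_STATUSES = {"ROLLBACK_COMPLETE", "ROLLBACK_FAILED", "CREATE_FAILED"}


def blocks_retry(stack_status: str) -> bool:
    """Return True if a stack in `stack_status` prevents re-creating it by name."""
    return stack_status in BLOCKING_STATUSES


def delete_failed_stack(stack_name: str, region: str) -> bool:
    """Delete `stack_name` if it is stuck after a failed creation.

    Returns True if a delete was issued. Requires boto3 and AWS credentials.
    """
    import boto3  # imported lazily so blocks_retry() works offline
    from botocore.exceptions import ClientError

    cfn = boto3.client("cloudformation", region_name=region)
    try:
        stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    except ClientError:
        # describe_stacks raises when the stack does not exist. Other API
        # errors land here too; real code should inspect the error code.
        return False
    if not blocks_retry(stack["StackStatus"]):
        return False
    cfn.delete_stack(StackName=stack_name)
    # Block until CloudFormation reports the deletion as finished.
    cfn.get_waiter("stack_delete_complete").wait(StackName=stack_name)
    return True
```

Calling `delete_failed_stack` before CreateWindowsDesktop retries would let the same cluster ID, username, and session name be reused, at the cost of an extra DescribeStacks call per request.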
Please complete the following information about the solution:
We are addressing this issue by migrating our compute provisioning logic to EC2 Fleet. The change will ship in a future release later this year.