mrlikl closed this issue 8 months ago.
@mrlikl I was able to deploy it with cdk 2.46.0, kubernetes 1.21 and alb controller 2.4.1. Are you still having the issue?
Getting the same error when default_capacity=0, the code mentioned in the description will reproduce the error now.
@mrlikl I am running the following code to reproduce this error. Will let you know when the deploy completes.
import { KubectlV23Layer } from '@aws-cdk/lambda-layer-kubectl-v23';
import {
  App, Stack,
  aws_eks as eks,
  aws_ec2 as ec2,
} from 'aws-cdk-lib';

const devEnv = {
  account: process.env.CDK_DEFAULT_ACCOUNT,
  region: process.env.CDK_DEFAULT_REGION,
};

const app = new App();
const stack = new Stack(app, 'triage-dev5', { env: devEnv });

new eks.Cluster(stack, 'Cluster', {
  vpc: ec2.Vpc.fromLookup(stack, 'Vpc', { isDefault: true }),
  albController: {
    version: eks.AlbControllerVersion.V2_4_1,
  },
  version: eks.KubernetesVersion.V1_23,
  kubectlLayer: new KubectlV23Layer(stack, 'LayerVersion'),
  clusterLogging: [
    eks.ClusterLoggingTypes.API,
    eks.ClusterLoggingTypes.AUTHENTICATOR,
    eks.ClusterLoggingTypes.SCHEDULER,
  ],
  endpointAccess: eks.EndpointAccess.PUBLIC,
  placeClusterHandlerInVpc: true,
  clusterName: 'baking-k8s',
  outputClusterName: true,
  outputMastersRoleArn: true,
  defaultCapacity: 0,
  kubectlEnvironment: { MINIMUM_IP_TARGET: '100', WARM_IP_TARGET: '100' },
});
I am getting an error with the CDK code provided above:
Lambda Log:
[ERROR] Exception: b'Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress\n'
Traceback (most recent call last):
File "/var/task/index.py", line 17, in handler
return helm_handler(event, context)
File "/var/task/helm/__init__.py", line 88, in helm_handler
helm('upgrade', release, chart, repository, values_file, namespace, version, wait, timeout, create_namespace)
File "/var/task/helm/__init__.py", line 186, in helm
raise Exception(output)
I am making this a P2 now and I will investigate a little bit more on this next week. If you have any possible solution please let me know. Any pull request would be highly appreciated as well.
I think this issue should be prioritized; a lot of other folks are running into trouble when developing in sandbox environments.
I have seen a lot of issues in this repo that set default capacity to 0 without realizing it's a bug. It really impacts development productivity, since the CloudFormation stack can take hours to roll back and clean up the resources.
I have the same issue:
The error from CloudFormation is:
Received response status [FAILED] from custom resource. Message returned: Error: b'Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress\n' Logs: /aws/lambda/TestingStage-Release-awscdkawseksK-Handler886CB40B-KG9T55a3ZdwW at invokeUserFunction (/var/task/framework.js:2:6) at processTicksAndRejections (internal/process/task_queues.js:95:5) at async onEvent (/var/task/framework.js:1:365) at async Runtime.handler (/var/task/cfn-response.js:1:1543) (RequestId: 16bb84de-c183-4e1c-9e4e-cc7ec0efc5b8)
Hey @pahud. Thank you so much for looking into this.
Were you able to make any progress? I've been struggling on this for a while. Here is my latest stack Info:
"aws-cdk-lib": "2.63.0",
KubernetesVersion.V1_26
AlbControllerVersion.V2_5_1
Hi @pahud, I still face the same issue.
I deployed the cdk in cn-north-1 region.
Hi @pahud, I think I found the root cause in my scenario. It may be caused by the image not being pullable in the cn-north-1 region.
Please check:
Failed to pull image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1": rpc error: code = Unknown desc = failed to pull and unpack image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1": failed to resolve reference "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1": pulling from host 602401143452.dkr.ecr.us-west-2.amazonaws.com failed with status code [manifests v2.4.1]: 401 Unauthorized
k logs aws-load-balancer-controller-75c785bc8c-72zpg -n kube-system
Error from server (BadRequest): container "aws-load-balancer-controller" in pod "aws-load-balancer-controller-75c785bc8c-72zpg" is waiting to start: trying and failing to pull image
kubectl describe pod aws-load-balancer-controller-75c785bc8c-72zpg -n kube-system
Name: aws-load-balancer-controller-75c785bc8c-72zpg
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Service Account: aws-load-balancer-controller
Node: ip-10-0-3-136.cn-north-1.compute.internal/10.0.3.136
Start Time: Mon, 17 Jul 2023 16:30:59 +0800
Labels: app.kubernetes.io/instance=aws-load-balancer-controller
app.kubernetes.io/name=aws-load-balancer-controller
pod-template-hash=75c785bc8c
Annotations: kubernetes.io/psp: eks.privileged
prometheus.io/port: 8080
prometheus.io/scrape: true
Status: Pending
IP: 10.0.3.160
IPs:
IP: 10.0.3.160
Controlled By: ReplicaSet/aws-load-balancer-controller-75c785bc8c
Containers:
aws-load-balancer-controller:
Container ID:
Image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1
Image ID:
Ports: 9443/TCP, 8080/TCP
Host Ports: 0/TCP, 0/TCP
Command:
/controller
Args:
--cluster-name=Workshop-Cluster
--ingress-class=alb
--aws-region=cn-north-1
--aws-vpc-id=vpc-0e4a9201452c76b0e
State: Waiting
Reason: ErrImagePull
Ready: False
Restart Count: 0
Liveness: http-get http://:61779/healthz delay=30s timeout=10s period=10s #success=1 #failure=2
Environment:
AWS_STS_REGIONAL_ENDPOINTS: regional
AWS_DEFAULT_REGION: cn-north-1
AWS_REGION: cn-north-1
AWS_ROLE_ARN: arn:aws-cn:iam::743271379588:role/clo-workshop-07-CLWorkshopEC2AndEKSeksClusterStack-1XO6CGEC91JGY
AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
Mounts:
/tmp/k8s-webhook-server/serving-certs from cert (ro)
/var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jct6t (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
aws-iam-token:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 86400
cert:
Type: Secret (a volume populated by a Secret)
SecretName: aws-load-balancer-tls
Optional: false
kube-api-access-jct6t:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16m default-scheduler Successfully assigned kube-system/aws-load-balancer-controller-75c785bc8c-72zpg to ip-10-0-3-136.cn-north-1.compute.internal
Normal Pulling 14m (x4 over 16m) kubelet Pulling image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1"
Warning Failed 14m (x4 over 16m) kubelet Failed to pull image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1": rpc error: code = Unknown desc = failed to pull and unpack image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1": failed to resolve reference "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1": pulling from host 602401143452.dkr.ecr.us-west-2.amazonaws.com failed with status code [manifests v2.4.1]: 401 Unauthorized
Warning Failed 14m (x4 over 16m) kubelet Error: ErrImagePull
Warning Failed 14m (x6 over 16m) kubelet Error: ImagePullBackOff
Normal BackOff 87s (x62 over 16m) kubelet Back-off pulling image "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1"
Seems to be related to https://github.com/aws/aws-cdk/issues/22520
013241004608.dkr.ecr.us-gov-west-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 151742754352.dkr.ecr.us-gov-east-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 558608220178.dkr.ecr.me-south-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 590381155156.dkr.ecr.eu-south-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ap-northeast-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ap-northeast-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ap-northeast-3.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ap-south-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ap-southeast-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ap-southeast-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.ca-central-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.eu-central-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.eu-north-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.eu-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.eu-west-3.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.sa-east-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.us-east-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.us-east-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.us-west-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 800184023465.dkr.ecr.ap-east-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 
877085696533.dkr.ecr.af-south-1.amazonaws.com/amazon/aws-load-balancer-controller:v2.4.1 918309763551.dkr.ecr.cn-north-1.amazonaws.com.cn/amazon/aws-load-balancer-controller:v2.4.1 961992271922.dkr.ecr.cn-northwest-1.amazonaws.com.cn/amazon/aws-load-balancer-controller:v2.4.1
I found a solution in https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/1694: you can manually replace the ECR image URL in the CloudFormation template.
https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases?page=2
The issue is that when the cluster is deployed with default_capacity set to 0, there will not be any nodes attached to it. While installing the aws-load-balancer-controller via helm, the release status goes into pending-install and the pods stay pending, as there are no nodes available to schedule them on. The handler lambda eventually times out after 15 minutes and the event handler lambda retries the installation. The retry executes helm upgrade and errors with Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress.
While this is expected as there are no nodes, I was testing a check in the kubectl-handler that verifies whether the node count is 0 when this error is thrown, and was able to handle the error. However, I am not sure if this is the right approach to solve this issue.
# Inside the kubectl-handler's helm error path: swallow the error only when
# the cluster has no nodes at all (output, kubeconfig and outdir come from
# the surrounding handler code).
if b'another operation (install/upgrade/rollback) is in progress' in output:
    cmd_to_run = ['kubectl', 'get', 'nodes', '--kubeconfig', kubeconfig]
    get_nodes_output = subprocess.check_output(cmd_to_run, stderr=subprocess.STDOUT, cwd=outdir)
    if b'No resources found' in get_nodes_output:
        return
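For what it's worth, the two byte-string checks above could be factored into small standalone helpers (the helper names are mine, not the actual handler's):

```python
def helm_in_progress(helm_output: bytes) -> bool:
    """True when helm reports a stuck pending install/upgrade/rollback."""
    return b'another operation (install/upgrade/rollback) is in progress' in helm_output


def cluster_has_no_nodes(kubectl_output: bytes) -> bool:
    """True when `kubectl get nodes` reported no schedulable nodes."""
    return b'No resources found' in kubectl_output
```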
@pahud out of interest, is this still on the backlog or has it been deprioritized? Calling addNodegroupCapacity on the cluster doesn't work with defaultCapacity: 0, so it's not possible to use launch templates to control capacity via CDK, as far as I've tested.
I have been stuck creating a FargateCluster with this issue since 06/22: https://github.com/aws/aws-cdk/issues/22005#issuecomment-1603053510. Did 'defaultCapacity' work for you? It is not an option for Fargate.
Just tried with the latest version of CDK today and I'm still having this issue. Is it possible to escalate this issue, please?
Could someone help me? I have the same issue. Here is my repo: https://github.com/PavanMudigondaTR/install-karpenter-with-cdk
It's been a while and I am now testing the following code in the latest CDK
export class EksStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // use my default VPC
    const vpc = getDefaultVpc(this);

    new eks.Cluster(this, 'Cluster', {
      vpc,
      albController: {
        version: eks.AlbControllerVersion.V2_6_2,
      },
      version: eks.KubernetesVersion.V1_27,
      kubectlLayer: new KubectlLayer(this, 'LayerVersion'),
      clusterLogging: [
        eks.ClusterLoggingTypes.API,
        eks.ClusterLoggingTypes.AUTHENTICATOR,
        eks.ClusterLoggingTypes.SCHEDULER,
      ],
      endpointAccess: eks.EndpointAccess.PUBLIC,
      placeClusterHandlerInVpc: true,
      clusterName: 'baking-k8s',
      outputClusterName: true,
      outputMastersRoleArn: true,
      defaultCapacity: 0,
      kubectlEnvironment: { MINIMUM_IP_TARGET: '100', WARM_IP_TARGET: '100' },
    });
  }
}
For the issues from @mrlikl, @Karatakos, @smislam and @PavanMudigondaTR: I am not sure if they are related to this one, which seems specific to AlbController. If your case does not involve AlbController, please open a new issue and link to this one.
@YikaiHu EKS in China is a little bit more complicated, please open a separate issue for your case in China and link to this one. Thanks.
Unfortunately I couldn't deploy it with the following code on my first attempt.
I am making it a p1 for now and will simplify the code to hopefully figure out the root cause.
export class EksStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // use my default VPC
    const vpc = getDefaultVpc(this);

    new eks.Cluster(this, 'Cluster', {
      vpc,
      albController: {
        version: eks.AlbControllerVersion.V2_6_2,
      },
      version: eks.KubernetesVersion.V1_27,
      kubectlLayer: new KubectlLayer(this, 'LayerVersion'),
      clusterLogging: [
        eks.ClusterLoggingTypes.API,
        eks.ClusterLoggingTypes.AUTHENTICATOR,
        eks.ClusterLoggingTypes.SCHEDULER,
      ],
      endpointAccess: eks.EndpointAccess.PUBLIC,
      placeClusterHandlerInVpc: true,
      clusterName: 'baking-k8s',
      outputClusterName: true,
      outputMastersRoleArn: true,
      defaultCapacity: 0,
      kubectlEnvironment: { MINIMUM_IP_TARGET: '100', WARM_IP_TARGET: '100' },
    });
  }
}
Hello @pahud, as mentioned in my previous comment, the issue is when default capacity is set to 0. Please check this comment - https://github.com/aws/aws-cdk/issues/22005#issuecomment-1742171115
Thanks @mrlikl
OK, it looks like the deployment of albController depends on the availability of the nodegroup. This means albController with defaultCapacity: 0 would fail, while albController with a defaultCapacity or a nodegroup with at least 1 available node would succeed. In this case, we should avoid using albController with no capacity or nodegroup in the initial deployment. I doubt we can check node availability in CDK, but at least we should note this in the albController doc string.
Also, there is a chance the handler lambda times out before the nodes are ready, so the addDependency below might be required:
cluster.albController?.node.addDependency(cluster.defaultNodegroup!);
In terms of the EKS Fargate cluster, I am not sure if ALB controller is compatible with EKS Fargate cluster and we definitely need more tests and feedback on it. Please open a separate issue for EKS Fargate cluster with alb controller if it does have the issue because it might need different workaround.
OK I can confirm this deploys and works for me.
export class EksStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // use my default VPC
    const vpc = getDefaultVpc(this);

    const cluster = new eks.Cluster(this, 'Cluster', {
      vpc,
      albController: {
        version: eks.AlbControllerVersion.V2_6_2,
      },
      mastersRole: new iam.Role(this, 'MasterRole', {
        assumedBy: new iam.AccountRootPrincipal(),
      }),
      version: eks.KubernetesVersion.V1_27,
      kubectlLayer: new KubectlLayer(this, 'LayerVersion'),
      defaultCapacity: 2,
    });

    cluster.albController?.node.addDependency(cluster.defaultNodegroup!);
  }
}
And this works as well for FargateCluster
const cluster = new eks.FargateCluster(this, 'Cluster', {
  vpc,
  albController: {
    version: eks.AlbControllerVersion.V2_6_2,
  },
  mastersRole: new iam.Role(this, 'MasterRole', {
    assumedBy: new iam.AccountRootPrincipal(),
  }),
  version: eks.KubernetesVersion.V1_27,
  kubectlLayer: new KubectlLayer(this, 'LayerVersion'),
});
I am making this a p2 as this error can be avoided.
This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
The issue still persists. Please, bot, don't close the ticket.
Hey @pahud, thank you so much for looking into this. I concur that the issue still persists. Here is the error:
Node: v20.10.0 Npm: 10.2.5 "aws-cdk-lib": "^2.115.0" KubernetesVersion.V1_28 AlbControllerVersion.V2_6_2
EksClusterStack | 26/28 | 9:06:12 AM | CREATE_FAILED | Custom::AWSCDK-EKS-HelmChart | EksClusterStackEksCluster922FB9AE-AlbController/Resource/Resource/Default (EksClusterStackEksCluster922FB9AEAlbController1636C356) Received response status [FAILED] from custom resource. Message returned: Error: b'Release "aws-load-balancer-controller" does not exist. Installing it now.\nError: looks like "https://aws.github.io/eks-charts" is not a valid chart repository or cannot be reached: Get "https://aws.github.io/eks-charts/index.yaml": dial tcp 185.199.110.153:443: connect: connection timed out\n'
When I add your suggestion cluster.albController?.node.addDependency(cluster.defaultNodegroup!); I get the following error:
$eks-cluster\node_modules\constructs\src\dependency.ts:91 const ret = (instance as any)[DEPENDABLE_SYMBOL]; ^ TypeError: Cannot read properties of undefined (reading 'Symbol(@aws-cdk/core.DependableTrait)')
@pahud, @mrlikl et al.,
I was able to resolve the issue. What I found is that to create the controller, the code fetches helm charts from kubernetes-sigs. To access those files, you must have egress enabled. In my case, I was creating my cluster in a private subnet. You need to create your cluster in a subnet with egress: SubnetType.PRIVATE_WITH_EGRESS.
Please update your cluster and VPC configurations to see if this gets resolved for you. My stack completed successfully.
Thank you @smislam for the insights.
@smislam SubnetType.PRIVATE_WITH_EGRESS causes RuntimeError: There are no 'Private' subnet groups in this VPC. Available types: Isolated,Deprecated_Isolated,Public
@pahud I'm still getting the same error with my Python code even with default_capacity set. Do you know what I am missing?
vpc = ec2.Vpc.from_lookup(self, "VPCLookup", vpc_id=props.vpc_id)

# provisioning a cluster
cluster = eks.Cluster(
    self,
    "eks-cluster",
    version=eks.KubernetesVersion.V1_28,
    kubectl_layer=lambda_layer_kubectl_v28.KubectlV28Layer(self, "kubectl-layer"),
    cluster_name=f"{props.customer}-eks-cluster",
    default_capacity_instance=ec2.InstanceType("t3.medium"),
    default_capacity=2,
    alb_controller=eks.AlbControllerOptions(version=eks.AlbControllerVersion.V2_6_2),
    vpc=vpc,
    vpc_subnets=[ec2.SubnetSelection(subnet_type=ec2.SubnetType.PRIVATE_ISOLATED)],
    masters_role=iam.Role(self, "masters-role", assumed_by=iam.AccountRootPrincipal()),
)
@andreprawira For some reason it will fail if the vpc_subnets selection is ec2.SubnetType.PRIVATE_ISOLATED, as described in https://github.com/aws/aws-cdk/issues/22005#issuecomment-1866886455.
RuntimeError: There are no 'Private' subnet groups in this VPC. Available types: Isolated,Deprecated_Isolated,Public
This means CDK doesn't seem to find any "private with egress" subnets in your VPC. Can you make sure you have private subnets with egress (typically via a NAT gateway)?
@andreprawira, it looks like you are using a VPC (already created in another stack) that doesn't have a private subnet with egress, and that is why you are getting that error:
vpc = ec2.Vpc.from_lookup(self, "VPCLookup", vpc_id=props.vpc_id)
You will not be able to use CDK to create your stack with such a configuration, for the reason I mentioned earlier. So either update your VPC to add a new private subnet with egress, or create an entirely new VPC with SubnetType.PRIVATE_WITH_EGRESS. This will require a NAT (either gateway or instance), as @pahud mentioned.
@pahud @smislam So we have a product in our service catalog that deploys a VPC and IGW to all of our accounts, and within that product we don't use a NAT GW; rather we use a TGW in our network account (meaning all traffic goes in and out through the network account, even for VPCs in other accounts). That is why I did a VPC lookup, since it has already been created.
That being said, is there another way for me to use the alb_controller with the VPC, TGW, and IGW already set up as is? Btw, I hope I am not misunderstanding you when you say I can't use ec2.SubnetType.PRIVATE_ISOLATED, because if I look at my cluster, the subnets it uses are all private (their route tables send traffic to the TGW in the network account, not to the IGW).
Furthermore, using vpc_subnets=[ec2.SubnetSelection(subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS)] causes RuntimeError: There are no 'Private' subnet groups in this VPC. Available types: Isolated,Deprecated_Isolated,Public
And to answer your question @pahud: I could be wrong, but I don't think I have private subnets with egress, since there is no NAT GW; I have a TGW instead. Shouldn't that work as well? How do I use ec2.SubnetType.PRIVATE_WITH_EGRESS with a TGW instead of a NAT GW?
@andreprawira, your setup should work. There is a bug in older versions of CDK that had an issue with Transit Gateway; I ran into this a while back. Any chance you are using an older version of CDK? Can you please try the latest version?
@smislam I just updated my CDK from version 2.115.0 to 2.117.0, and below is my code:
vpc = ec2.Vpc.from_lookup(self, "VPCLookup", vpc_id=props.vpc_id)

# provisioning a cluster
cluster = eks.Cluster(
    self,
    "eks-cluster",
    version=eks.KubernetesVersion.V1_28,
    kubectl_layer=lambda_layer_kubectl_v28.KubectlV28Layer(self, "kubectl-layer"),
    # place_cluster_handler_in_vpc=True,
    cluster_name=f"{props.customer}-eks-cluster",
    default_capacity_instance=ec2.InstanceType("t3.medium"),
    default_capacity=2,
    alb_controller=eks.AlbControllerOptions(version=eks.AlbControllerVersion.V2_6_2),
    vpc=vpc,
    vpc_subnets=[ec2.SubnetSelection(subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS)],
    # masters_role=iam.Role(self, "masters-role", assumed_by=iam.AccountRootPrincipal()),
)
But I am still getting the same RuntimeError: There are no 'Private' subnet groups in this VPC. Available types: Isolated,Deprecated_Isolated,Public
That is strange; I am not sure what is happening, @andreprawira. We will need @pahud and the AWS CDK team to look deeper into this. Happy coding, and a happy New Year!
@andreprawira I think you can still use private isolated subnets for vpc_subnets, as below:
vpc_subnets=[ec2.SubnetSelection(subnet_type=ec2.SubnetType.PRIVATE_ISOLATED)],
But if you look at the synthesized template, there could be a chance
Technically, it is possible to deploy an EKS cluster with isolated subnets, but there are a lot of requirements to consider. We don't have a working sample for now and will need more feedback from the community before we know how to do that and add it to the documentation.
We have a p1 tracking issue for EKS cluster support with isolated subnets at https://github.com/aws/aws-cdk/issues/12171 - we will need to close that first, but it should not be relevant to albController.
Describe the bug
While creating an EKS cluster with eks.AlbControllerOptions, it runs into an error while creating the custom resource Custom::AWSCDK-EKS-HelmChart:
"Received response status [FAILED] from custom resource. Message returned: Error: b'Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress' "
Expected Behavior
Creation of the custom resource Custom::AWSCDK-EKS-HelmChart to be successful
Current Behavior
Custom::AWSCDK-EKS-HelmChart is running into error "Received response status [FAILED] from custom resource. Message returned: Error: b'Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress' "
Reproduction Steps
cluster = eks.Cluster(
    scope=self,
    id=construct_id,
    tags={"env": "production"},
    alb_controller=eks.AlbControllerOptions(
        version=eks.AlbControllerVersion.V2_4_1
    ),
    version=eks.KubernetesVersion.V1_21,
    cluster_logging=[
        eks.ClusterLoggingTypes.API,
        eks.ClusterLoggingTypes.AUTHENTICATOR,
        eks.ClusterLoggingTypes.SCHEDULER,
    ],
    endpoint_access=eks.EndpointAccess.PUBLIC,
    place_cluster_handler_in_vpc=True,
    cluster_name="basking-k8s",
    output_masters_role_arn=True,
    output_cluster_name=True,
    default_capacity=0,
    kubectl_environment={"MINIMUM_IP_TARGET": "100", "WARM_IP_TARGET": "100"},
)
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.40.0
Framework Version
No response
Node.js Version
16.17.0
OS
macos 12.5.1
Language
Python
Language Version
3.10.6
Other information
No response