Open trondhindenes opened 1 year ago
From another issue, it looks like the library in some cases prints a warning:
You created a cluster with Kubernetes Version 1.23 without specifying the kubectlLayer property
But I've never seen that warning. Was it removed in a newer version maybe? IMHO it needs to be easy to build rock-solid clusters with cdk.
According to the document:
The version of kubectl used must be compatible with the Kubernetes version of the cluster. kubectl is supported within one minor version (older or newer) of Kubernetes (see Kubernetes version skew policy). Only version 1.20 of kubectl is available in aws-cdk-lib. If you need a different version, you will need to use one of the @aws-cdk/lambda-layer-kubectl-vXY packages.
But I agree with you we probably should implement a check to avoid potential error like that.
I am making this a p2 feature request and any PR would be appreciated!
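For example, something like this (a sketch for a 1.23 cluster; use the lambda-layer-kubectl package that matches your cluster version):

import * as cdk from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';
import { KubectlV23Layer } from '@aws-cdk/lambda-layer-kubectl-v23';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'EksStack');

// Keep the kubectl layer within one minor version of the cluster.
new eks.Cluster(stack, 'Cluster', {
  version: eks.KubernetesVersion.V1_23,
  kubectlLayer: new KubectlV23Layer(stack, 'KubectlLayer'),
});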
@pahud Regarding the reply above:
When I try to use the @aws-cdk/lambda-layer-kubectl-v25 package with @aws-quickstart/eks-blueprints in GenericClusterProvider, setting the kubectlLayer property gives this error: Type 'typeof KubectlV25Layer' is missing the following properties from type 'ILayerVersion': layerVersionArn, addPermission, stack, env, and 2 more. Below is the code:
...
import { KubectlV25Layer } from "@aws-cdk/lambda-layer-kubectl-v25";
....
.....
.....
const clusterProvider = new EksBlueprint.GenericClusterProvider({
  version: this.props.version,
  kubectlLayer: KubectlV25Layer,
  vpcSubnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
  managedNodeGroups: [
    {
      id: `${id}-nodegroup`,
      minSize: 1,
      maxSize: 2,
      instanceTypes: config.InstanceTypes.map(
        (instance_type) => new ec2.InstanceType(instance_type)
      ),
    },
  ],
});
.....
.....
CC: @menakakarichiyappakumar
@ShankarDhandapani looks like you need to instantiate it like:
const kubectl = new KubectlV25Layer(this, 'KubectlLayer');
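In context, that would look something like this (a sketch assuming the rest of your blueprint setup stays the same):

import { KubectlV25Layer } from '@aws-cdk/lambda-layer-kubectl-v25';

// kubectlLayer expects an ILayerVersion instance, not the class itself.
const clusterProvider = new EksBlueprint.GenericClusterProvider({
  version: this.props.version,
  kubectlLayer: new KubectlV25Layer(this, 'KubectlLayer'),
  // ... remaining provider props unchanged
});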
I am currently struggling with the same issue. This solution does not seem to apply to v2 of the AWS CDK.
We probably can add the validation here. I guess the challenge is that lambda.ILayerVersion does not expose the kubectl version as an attribute, so it's not easy to compare against the cluster version.
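A rough sketch of what such a check could look like — validateKubectlSkew and both of its parameters are hypothetical, since the construct would need the caller to declare the layer's kubectl minor version explicitly:

function validateKubectlSkew(clusterMinor: number, kubectlMinor: number): void {
  // Kubernetes skew policy: kubectl is supported within one minor
  // version (older or newer) of the cluster.
  if (Math.abs(clusterMinor - kubectlMinor) > 1) {
    throw new Error(
      `kubectl 1.${kubectlMinor} is outside the supported skew for a ` +
      `1.${clusterMinor} cluster; use the matching ` +
      `@aws-cdk/lambda-layer-kubectl-vXY package.`
    );
  }
}

// e.g. a 1.23 cluster with the bundled kubectl 1.20 would fail fast:
validateKubectlSkew(23, 20);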
Thanks for this thread.
Describe the bug
Ever since we upgraded from Kubernetes 1.21 to newer versions, we've been getting lots of weird errors related to what I believe are kubectl layer incompatibilities, like:
3:40:15 PM | UPDATE_FAILED | Custom::AWSCDK-EKS-KubernetesResource | clusterAwsAuthmanifestB57F2A94 Received response status [FAILED] from custom resource. Message returned: Error: b'configmap/aws-auth configured\nerror: error retrieving RESTMappings to prune: invalid resource extensions/v1beta1, Kind=Ingress, Namespaced=true: no matches for kind "Ingress" in version "extensions/v1beta1"\n'
It would be much better if the CDK actually validated the layer version against the intended Kubernetes version when synthesising, so that these issues didn't occur.
Expected Behavior
cdk should error out, informing me that the selected cluster version doesn't match the configured layer
Current Behavior
No validation occurs, which leads to lots of errors when trying to change the cluster later
Reproduction Steps
- create a cluster on version 1.23 (a minimal sketch follows below)
- make a change, such as adding a node group
- witness the layer error described above
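Roughly, in CDK terms (a sketch in TypeScript for illustration; our actual stack is Python):

import * as cdk from 'aws-cdk-lib';
import * as eks from 'aws-cdk-lib/aws-eks';

const app = new cdk.App();
const stack = new cdk.Stack(app, 'ReproStack');

// kubectlLayer is omitted, so the handler falls back to the kubectl
// bundled with aws-cdk-lib, which is older than the 1.23 cluster.
new eks.Cluster(stack, 'Cluster', {
  version: eks.KubernetesVersion.V1_23,
});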
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.67.0
Framework Version
2.66.1
Node.js Version
v18.14.2
OS
Ubuntu
Language
Python
Language Version
3.9
Other information
No response
Thanks for starting this thread. I was running into the same issue, but I was able to fix it following the suggestions posted here.
I am using CDK v2 and my kubectl version is the latest; I don't know why the CDK is not validating the kubectl version. Is anyone working on fixing this? Any idea when this issue will be fixed so that the matching kubectlLayer is picked based on the Kubernetes version provided?
I imported the KubectlLambdaLayer package from here.

import { KubectlV26Layer } from '@aws-cdk/lambda-layer-kubectl-v26';

kubectlLayer: new KubectlV26Layer(this, 'KubectlLayer'),
I've seen this error several times while attempting to update resources created with cluster.add_manifest(). It appears CloudFormation is attempting to use an API version that differs from what is actually deployed, e.g. batch/v1beta1 rather than batch/v1.
Full error response
Received response status [FAILED] from custom resource. Message returned: Error: b'serviceaccount/user created\nerror: error retrieving RESTMappings to prune: invalid resource batch/v1beta1, Kind=CronJob, Namespaced=true: no matches for kind "CronJob" in version "batch/v1beta1"\n' Logs: /aws/lambda/Application-awscdka-Handler886CB40B-q8TSqd5FvHp8 at invokeUserFunction (/var/task/framework.js:2:6) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async onEvent (/var/task/framework.js:1:369) at async Runtime.handler (/var/task/cfn-response.js:1:1573) (RequestId: 1ffd3898-6f7f-49a7-b97d-83518c0dc5fe)
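For reference, the manifests in question are created roughly like this (a sketch; the resource names are placeholders). My understanding is that the handler applies them with pruning enabled, and the prune step is where the stale batch/v1beta1 RESTMapping lookup appears to happen:

import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the existing cluster construct

// Manifests added via addManifest() are applied by the kubectl handler;
// pruning is on by default (the Cluster `prune` prop), which is where
// the batch/v1beta1 mapping appears to be looked up.
cluster.addManifest('UserServiceAccount', {
  apiVersion: 'v1',
  kind: 'ServiceAccount',
  metadata: { name: 'user' }, // placeholder name
});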
When it occurs, it leaves the stack in an UPDATE_ROLLBACK_FAILED state and there is no way to stabilize the stack again. I've had to destroy and recreate my entire cluster every time.
Running Kubernetes 1.29.
I've deployed a 1.29 EKS cluster via CDK, specifying the kubectlLayer as KubectlV29Layer() when creating the cluster, and I'm having the same issue as @graydenshand: the only way to get changes applied is to destroy and deploy again. This blocks just about any management of the cluster.
From the lambda kubectl layer logs:
[ERROR] Exception: b'service/serviceXYZ configured\nerror: error retrieving RESTMappings to prune: invalid resource batch/v1beta1, Kind=CronJob, Namespaced=true: no matches for kind "CronJob" in version "batch/v1beta1"\n' Traceback (most recent call last): File "/var/task/index.py", line 14, in handler return apply_handler(event, context) File "/var/task/apply/__init__.py", line 69, in apply_handler kubectl('apply', manifest_file, *kubectl_opts) File "/var/task/apply/__init__.py", line 91, in kubectl raise Exception(output)
We are experiencing the same problem. To make matters worse for us, it appears that KubectlV29 was never released for the Go CDK library from cdklabs/awscdk-kubectl-go, leaving us with few options to resolve this gracefully.
https://github.com/cdklabs/awscdk-kubectl-go/commits/kubectl.29
@graydenshand @benjamin-at-greensky Are you able to reproduce this issue for us? For example, after initially creating a 1.29 cluster with the kubectl v29 layer, what could cause this error after that?
@kriscoleman Can you create a new issue and provide your Go CDK code snippet in the issue description?
@pahud I have been able to reproduce this by deploying a fresh EKS cluster with kubectlLayer set to v29 and then redeploying a helm chart with updated values.
import { KubectlV29Layer } from '@aws-cdk/lambda-layer-kubectl-v29';

const clusterProps: GsEksClusterProps = {
  ...
  kubectlLayer: new KubectlV29Layer(this, 'KubectlLayer'),
  ...
};

this.cluster = new eks.Cluster(this, 'EksCluster', {
  ...
  kubectlLayer: clusterProps.kubectlLayer,
  ...
});
After this, I make an update to the CDK app that deploys a helm chart (for example, I was redeploying one with some annotations on an ingress). I then receive this error when running cdk deploy:
10:06:27 AM | UPDATE_FAILED | Custom::AWSCDK-EKS-KubernetesResource | Clustermanifestrep...63A40109 Received response status [FAILED] from custom resource. Message returned: Error: b'configmap/start-override configured\nerror: error retrieving RESTMappings to prune: invalid resource batch/v1beta1, Kind=CronJob, Namespaced=true: no matches for kind "CronJob" in version "batch/v1beta1"\n'
I have no CronJobs deployed to the cluster:
$ kubectl get cronjob -A
No resources found
$ kubectl api-resources | grep cronjob
cronjobs cj batch/v1 true CronJob
It is worth mentioning that the helm chart I'm deploying has no references to batch/v1beta1 anywhere.
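For completeness, the chart update that triggers it is nothing more exotic than this (a sketch; the chart name, repository, and values are placeholders):

import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the 1.29 cluster from above

// Redeploying a chart with changed values is enough to trigger the error.
cluster.addHelmChart('AppChart', {
  chart: 'example-app', // placeholder
  repository: 'https://example.com/charts', // placeholder
  values: {
    ingress: {
      annotations: { 'example.com/updated': 'true' }, // the changed value
    },
  },
});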
I had the same issue, and defining

from aws_cdk.lambda_layer_kubectl_v28 import KubectlV28Layer

cluster = eks.Cluster(
    self,
    'EksCluster',
    version=eks.KubernetesVersion.V1_28,
    kubectl_layer=KubectlV28Layer(self, "KubectlLayer"),
)

solved my issue.
I am using @aws-cdk/lambda-layer-kubectl-v30 with KubernetesVersion.V1_30 and I get the same issue as @graydenshand and @benjamin-at-greensky mentioned above when updating resources. The only workaround is to delete and re-create the application and related resources, which is not viable in a production environment.
4:02:20 PM | UPDATE_FAILED | Custom::AWSCDK-EKS-KubernetesResource | ImportedClusterman...aDployment5DA7DFEB
Received response status [FAILED] from custom resource. Message returned: Error: b'deployment.apps/********** configured\nerror: error retrieving RESTMappings to prune: invalid resource batch/v1beta1, Kind=CronJob, Namespaced=true: no matches for kind "CronJob" in version "batch/v1beta1"\n'
Can someone please look into this issue? It's been a while, and it effectively blocks us from using EKS at the moment.
I tried the suggestion above (creating a new cluster on version 1.28 with KubectlV28Layer), but I still got the same error.
@tchcxp The issue occurs because of the kubectlLayer, specifically the kubectl version in the handler lambda. It seems your cluster is imported, which leads to the following error:
UPDATE_FAILED | Custom::AWSCDK-EKS-KubernetesResource | ImportedClusterman...aDployment5DA7DFEB
Received response status [FAILED] from custom resource. Message returned: Error: b'deployment.apps/********** configured\nerror: error retrieving RESTMappings to prune: invalid resource batch/v1beta1, Kind=CronJob, Namespaced=true: no matches for kind "CronJob" in version "batch/v1beta1"\n'
By default, if you don't specify the layer version, it falls back to the kubectl 1.20 bundled with aws-cdk-lib.
To resolve this, you need to set the kubectl layer again:
eks.Cluster.fromClusterAttributes(this, 'ImportedCluster', {
  clusterName: clusterName,
  kubectlRoleArn: kubectlRoleArn,
  blah: blah,
  kubectlLayer: new KubectlV28Layer(this, `kubectl-v28-layer`), // <---
});
This should address the issue.