aws-samples / cdk-eks-karpenter

CDK construct for installing and configuring Karpenter on EKS clusters
Apache License 2.0

Helm upgrade for karpenter version 0.37.0 is failing with context deadline exceeded. #173

Open IndhumithaR opened 1 month ago

IndhumithaR commented 1 month ago

Hi, when I try to upgrade Karpenter from v0.29.1 to 0.37.0, I get a "context deadline exceeded" error.

Received response status [FAILED] from custom resource. Message returned: Error: b'Error: UPGRADE FAILED: context deadline exceeded\n'

I also tried upgrading to 0.33.1 and hit the same issue. We suspect the Helm chart upgrade is taking longer than expected, but we are not sure. Is there any way to solve this? Can we increase the timeout for the Helm upgrade?

andskli commented 1 month ago

It would be helpful to understand which resource is failing here, at least to confirm that it's the Helm chart. Do you have additional logs from the Lambda function that backs the custom resource?

IndhumithaR commented 1 month ago

Hi,

These are the Lambda function logs:

LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue in the function code. However, in rare cases, a Lambda runtime update can cause unexpected function behavior. For functions using managed runtimes, runtime updates can be triggered by a function change, or can be applied automatically. To determine if the runtime has been updated, check the runtime version in the INIT_START log entry. If this error correlates with a change in the runtime version, you may be able to mitigate this error by temporarily rolling back to the previous runtime version. For more information, see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html

[ERROR] Exception: b'Error: UPGRADE FAILED: context deadline exceeded\n'
Traceback (most recent call last):
  File "/var/task/index.py", line 17, in handler
    return helm_handler(event, context)
  File "/var/task/helm/__init__.py", line 93, in helm_handler
    helm('upgrade', release, chart, repository, values_file, namespace, version, wait, timeout, create_namespace)
  File "/var/task/helm/__init__.py", line 199, in helm
    raise Exception(output)

Thanks
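
On the timeout question: the traceback shows the handler passing `wait` and `timeout` through to a `helm` call, and the Helm CLI itself accepts `--wait` and `--timeout` (default 5m0s) on `helm upgrade`. The sketch below only illustrates how such values could end up on the command line. `build_helm_upgrade_command` is a hypothetical helper, not the code in `/var/task/helm/__init__.py`, and the chart location and the 900s value are assumptions chosen for illustration. Whether this construct exposes a timeout setting is not confirmed in this thread.

```python
from typing import List, Optional


def build_helm_upgrade_command(
    release: str,
    chart: str,
    namespace: str,
    version: Optional[str] = None,
    wait: bool = False,
    timeout: Optional[str] = None,
) -> List[str]:
    """Assemble a `helm upgrade` command line (hypothetical helper)."""
    cmd = ["helm", "upgrade", release, chart, "--install", "--namespace", namespace]
    if version:
        cmd += ["--version", version]
    if wait:
        # --wait blocks until the release's resources report ready.
        cmd.append("--wait")
    if timeout:
        # --timeout bounds each individual Kubernetes operation;
        # when omitted, Helm falls back to its 5m0s default.
        cmd += ["--timeout", timeout]
    return cmd


# Assumed values for illustration only.
cmd = build_helm_upgrade_command(
    release="karpenter",
    chart="oci://public.ecr.aws/karpenter/karpenter",
    namespace="karpenter",
    version="0.37.0",
    wait=True,
    timeout="900s",
)
print(" ".join(cmd))
```

A longer timeout only helps if the upgrade is slow rather than stuck. If a pod never becomes ready or a hook job never completes, the same "context deadline exceeded" error would simply appear later, so checking the state of the Karpenter pods and events during the upgrade is still worthwhile.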