Closed pmankad96 closed 1 year ago
My default VPC had a private subnet with no route to the Internet, and the Cloud9 instance kept getting launched in that subnet. As a workaround, I added a parameter to the template that lets me select the subnet for the Cloud9 instance. I think this subnet parameter would be a good addition.
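To check in advance whether a candidate subnet is affected, the sketch below inspects route tables for a default route. It assumes the input is the `RouteTables` list returned by boto3's `ec2.describe_route_tables` for the VPC; the function name `subnet_has_default_route` is hypothetical and is not part of the workshop template.

```python
from typing import Any, Dict, List


def subnet_has_default_route(route_tables: List[Dict[str, Any]], subnet_id: str) -> bool:
    """Return True if the subnet's effective route table has an active
    0.0.0.0/0 route through an internet gateway or a NAT gateway."""
    explicit = None  # table explicitly associated with the subnet
    main = None      # VPC main table, used when there is no explicit association
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("SubnetId") == subnet_id:
                explicit = table
            elif assoc.get("Main"):
                main = table

    table = explicit if explicit is not None else main
    if table is None:
        return False

    for route in table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("State", "active") != "active":
            continue  # skip blackhole routes
        if route.get("GatewayId", "").startswith("igw-") or route.get("NatGatewayId"):
            return True
    return False
```

With real data, the list could come from something like `ec2.describe_route_tables(Filters=[{"Name": "vpc-id", "Values": [vpc_id]}])["RouteTables"]`, where `vpc_id` is your own VPC.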
Curious if there is an update on this issue?
I ran into a similar issue with the Cloud9 instance failing. Here is what I am seeing in the CloudWatch logs:
Possible workaround: I ran the command to create the Cloud9 environment with --disable-rollback. After the failure, I was able to re-run the Lambda function manually, and it appeared to complete successfully according to CloudWatch. I am still checking whether it worked entirely.
timestamp,message
1695234457604,"INIT_START Runtime Version: python:3.9.v30 Runtime Version ARN: arn:aws:lambda:us-west-2::runtime:86d4ce088432216337acec891c716c30002d0ed911f5a9574e36052e7527d6ab
"
1695234457864,"START RequestId: a61cd434-47db-4fe3-9030-233546c48bbc Version: $LATEST
"
1695234457865,"dict_values(['Create', 'arn:aws:lambda:us-west-2:024309034029:function:eks-workshop-ide-EksWorkshopC9BootstrapInstanceLam-aofnILpRQOED', 'https://cloudformation-custom-resource-response-uswest2.s3-us-west-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aus-west-2%3A024309034029%3Astack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25%7CEksWorkshopC9BootstrapInstanceLambda%7C35422731-405b-45fd-81fd-ba82ef2caeb5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230920T182737Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credential=AKIA54RCMT6SBGALGB7S%2F20230920%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=d4e3cbf2c62cca5244e28468f38f576a1eee0dadc00b6b132dcc8185f0aad862', 'arn:aws:cloudformation:us-west-2:024309034029:stack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25', '35422731-405b-45fd-81fd-ba82ef2caeb5', 'EksWorkshopC9BootstrapInstanceLambda', 'Custom::EksWorkshopC9BootstrapInstanceLambda', {'ServiceToken': 'arn:aws:lambda:us-west-2:024309034029:function:eks-workshop-ide-EksWorkshopC9BootstrapInstanceLam-aofnILpRQOED', 'Cloud9Name': 'eks-workshop-ide-opensearch', 'LabIdeInstanceProfileName': 'eks-workshop-ide-EksWorkshopC9InstanceProfile-FgANiviqchtH', 'LabIdeInstanceProfileArn': 'arn:aws:iam::024309034029:instance-profile/eks-workshop-ide-EksWorkshopC9InstanceProfile-FgANiviqchtH', 'EnvironmentId': '1a7009aa3b3d4535aa6e95458e2bae25', 'SsmDocument': 'eks-workshop-ide-EksWorkshopC9SSMDocument-WHLm7hseBpum', 'REGION': 'us-west-2'}])
"
1695234480489,"Traceback (most recent call last):
"
1695234480489,"File ""/var/task/index.py"", line 89, in lambda_handler
"
1695234480489,"waiter.wait(
"
1695234480489,"File ""/var/runtime/botocore/waiter.py"", line 55, in wait
"
1695234480489,"Waiter.wait(self, **kwargs)
"
1695234480489,"File ""/var/runtime/botocore/waiter.py"", line 375, in wait
"
1695234480489,"raise WaiterError(
"
1695234480489,"botocore.exceptions.WaiterError: Waiter CommandExecuted failed: Waiter encountered a terminal failure state: For expression ""Status"" we matched expected path: ""Failed""
"
1695234480489,"https://cloudformation-custom-resource-response-uswest2.s3-us-west-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aus-west-2%3A024309034029%3Astack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25%7CEksWorkshopC9BootstrapInstanceLambda%7C35422731-405b-45fd-81fd-ba82ef2caeb5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230920T182737Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credential=AKIA54RCMT6SBGALGB7S%2F20230920%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=d4e3cbf2c62cca5244e28468f38f576a1eee0dadc00b6b132dcc8185f0aad862
"
1695234480489,"Response body:
"
1695234480489,"{""Status"": ""FAILED"", ""Reason"": ""See the details in CloudWatch Log Stream: 2023/09/20/[$LATEST]2a12197b296c4fa3aeecd16fe6ebff4f"", ""PhysicalResourceId"": ""CustomResourcePhysicalID"", ""StackId"": ""arn:aws:cloudformation:us-west-2:024309034029:stack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25"", ""RequestId"": ""35422731-405b-45fd-81fd-ba82ef2caeb5"", ""LogicalResourceId"": ""EksWorkshopC9BootstrapInstanceLambda"", ""NoEcho"": false, ""Data"": {}}
"
1695234480708,"Status code: 200
"
1695234480709,"[ERROR] TypeError: '>=' not supported between instances of 'WaiterError' and 'int'
Traceback (most recent call last):
File ""/var/task/index.py"", line 104, in lambda_handler
responseData = {'Error': traceback.format_exc(e)}
File ""/var/lang/lib/python3.9/traceback.py"", line 167, in format_exc
return """".join(format_exception(*sys.exc_info(), limit=limit, chain=chain))
File ""/var/lang/lib/python3.9/traceback.py"", line 120, in format_exception
return list(TracebackException(
File ""/var/lang/lib/python3.9/traceback.py"", line 517, in __init__
self.stack = StackSummary.extract(
File ""/var/lang/lib/python3.9/traceback.py"", line 340, in extract
if limit >= 0:"
1695234480711,"END RequestId: a61cd434-47db-4fe3-9030-233546c48bbc
"
1695234480711,"REPORT RequestId: a61cd434-47db-4fe3-9030-233546c48bbc Duration: 22847.61 ms Billed Duration: 22848 ms Memory Size: 256 MB Max Memory Used: 90 MB Init Duration: 259.10 ms
"
1695234486042,"START RequestId: c2b6028a-0ecd-4f66-9dd7-3b8a32fab0d4 Version: $LATEST
"
1695234486045,"dict_values(['Delete', 'arn:aws:lambda:us-west-2:024309034029:function:eks-workshop-ide-EksWorkshopC9BootstrapInstanceLam-aofnILpRQOED', 'https://cloudformation-custom-resource-response-uswest2.s3-us-west-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aus-west-2%3A024309034029%3Astack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25%7CEksWorkshopC9BootstrapInstanceLambda%7C00c84ef6-9da9-439f-8f0b-1076602343e0?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230920T182805Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7199&X-Amz-Credential=AKIA54RCMT6SBGALGB7S%2F20230920%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=cded4aa7689af4edee768a4d3254c02fbac9e7b213fbbd1e07f7e356094582a6', 'arn:aws:cloudformation:us-west-2:024309034029:stack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25', '00c84ef6-9da9-439f-8f0b-1076602343e0', 'EksWorkshopC9BootstrapInstanceLambda', 'CustomResourcePhysicalID', 'Custom::EksWorkshopC9BootstrapInstanceLambda', {'ServiceToken': 'arn:aws:lambda:us-west-2:024309034029:function:eks-workshop-ide-EksWorkshopC9BootstrapInstanceLam-aofnILpRQOED', 'Cloud9Name': 'eks-workshop-ide-opensearch', 'LabIdeInstanceProfileName': 'eks-workshop-ide-EksWorkshopC9InstanceProfile-FgANiviqchtH', 'LabIdeInstanceProfileArn': 'arn:aws:iam::024309034029:instance-profile/eks-workshop-ide-EksWorkshopC9InstanceProfile-FgANiviqchtH', 'EnvironmentId': '1a7009aa3b3d4535aa6e95458e2bae25', 'SsmDocument': 'eks-workshop-ide-EksWorkshopC9SSMDocument-WHLm7hseBpum', 'REGION': 'us-west-2'}])
"
1695234486045,"https://cloudformation-custom-resource-response-uswest2.s3-us-west-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aus-west-2%3A024309034029%3Astack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25%7CEksWorkshopC9BootstrapInstanceLambda%7C00c84ef6-9da9-439f-8f0b-1076602343e0?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230920T182805Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7199&X-Amz-Credential=AKIA54RCMT6SBGALGB7S%2F20230920%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=cded4aa7689af4edee768a4d3254c02fbac9e7b213fbbd1e07f7e356094582a6
"
1695234486045,"Response body:
"
1695234486045,"{""Status"": ""SUCCESS"", ""Reason"": ""See the details in CloudWatch Log Stream: 2023/09/20/[$LATEST]2a12197b296c4fa3aeecd16fe6ebff4f"", ""PhysicalResourceId"": ""CustomResourcePhysicalID"", ""StackId"": ""arn:aws:cloudformation:us-west-2:024309034029:stack/eks-workshop-ide/ff9cb6f0-57e2-11ee-9865-029e3d5a0f25"", ""RequestId"": ""00c84ef6-9da9-439f-8f0b-1076602343e0"", ""LogicalResourceId"": ""EksWorkshopC9BootstrapInstanceLambda"", ""NoEcho"": false, ""Data"": {""Success"": ""Custom Resource removed""}}
"
1695234486097,"Status code: 200
"
1695234486098,"END RequestId: c2b6028a-0ecd-4f66-9dd7-3b8a32fab0d4
"
1695234486098,"REPORT RequestId: c2b6028a-0ecd-4f66-9dd7-3b8a32fab0d4 Duration: 56.39 ms Billed Duration: 57 ms Memory Size: 256 MB Max Memory Used: 91 MB
"
I faced the same issue as above, and here is what I noticed:
Looking at the SSM Document execution history, I can see that the first execution of the SSM Document failed to extend the volume:
NOCHANGE: partition 1 is size 20967391. it cannot be grown
failed to run commands: exit status 1
However, the second execution succeeded:
CHANGED: partition=1 start=4096 old: size=20967391 end=20971487 new: size=62910431 end=62914527
meta-data=/dev/nvme0n1p1 isize=512 agcount=6, agsize=524159 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=0, rmapbt=0
= reflink=0 bigtime=0 inobtcount=0
data = bsize=4096 blocks=2620923, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
data blocks changed from 2620923 to 7863803
Package 1:findutils-4.5.11-6.amzn2.x86_64 already installed and latest version
Package 2:tar-1.26-35.amzn2.0.2.x86_64 already installed and latest version
Package gzip-1.5-10.amzn2.0.1.x86_64 already installed and latest version
Package git-2.40.1-1.amzn2.0.1.x86_64 already installed and latest version
Package diffutils-3.3-5.amzn2.x86_64 already installed and latest version
Package wget-1.14-18.amzn2.1.x86_64 already installed and latest version
Package unzip-6.0-57.amzn2.0.1.x86_64 already installed and latest version
Package 1:openssl-1.0.2k-24.amzn2.0.9.x86_64 already installed and latest version
Package gettext-0.19.8.1-3.amzn2.x86_64 already installed and latest version
Package 1:bash-completion-2.1-6.amzn2.noarch already installed and latest version
Package python3-3.7.16-1.amzn2.0.4.x86_64 already installed and latest version
Package python3-pip-20.2.2-1.amzn2.0.4.noarch already installed and latest version
Package amazon-linux-extras-2.0.1-1.amzn2.noarch already installed and latest version
Collecting awscurl
Downloading awscurl-0.29-py3-none-any.whl (8.9 kB)
Collecting configargparse
Downloading ConfigArgParse-1.7-py3-none-any.whl (25 kB)
Requirement already satisfied: urllib3[secure] in /usr/local/lib/python3.7/site-packages (from awscurl) (1.26.16)
Collecting requests
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
Collecting configparser
Downloading configparser-5.3.0-py3-none-any.whl (19 kB)
Collecting cryptography>=1.3.4; extra == "secure"
Downloading cryptography-41.0.4-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.4 MB)
Collecting pyOpenSSL>=0.14; extra == "secure"
Downloading pyOpenSSL-23.2.0-py3-none-any.whl (59 kB)
Collecting idna>=2.0.0; extra == "secure"
Downloading idna-3.4-py3-none-any.whl (61 kB)
Collecting certifi; extra == "secure"
Downloading certifi-2023.7.22-py3-none-any.whl (158 kB)
Collecting urllib3-secure-extra; extra == "secure"
Downloading urllib3_secure_extra-0.1.0-py2.py3-none-any.whl (1.4 kB)
Collecting charset-normalizer<4,>=2
Downloading charset_normalizer-3.2.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (175 kB)
Collecting cffi>=1.12
Downloading cffi-1.15.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (427 kB)
Collecting pycparser
Downloading pycparser-2.21-py2.py3-none-any.whl (118 kB)
Installing collected packages: configargparse, certifi, charset-normalizer, idna, requests, configparser, awscurl, pycparser, cffi, cryptography, pyOpenSSL, urllib3-secure-extra
Successfully installed awscurl-0.29 certifi-2023.7.22 cffi-1.15.1 charset-normalizer-3.2.0 configargparse-1.7 configparser-5.3.0 cryptography-41.0.4 idna-3.4 pyOpenSSL-23.2.0 pycparser-2.21 requests-2.31.0 urllib3-secure-extra-0.1.0
kubectl: OK
helm.tar.gz: OK
eksctl_Linux_amd64.tar.gz: OK
kustomize.tar.gz: OK
You can now run: /usr/local/bin/aws --version
kubeseal.tar.gz: OK
yq: OK
flux.tar.gz: OK
terraform.zip: OK
Archive: terraform.zip
inflating: terraform
argocd-linux-amd64: OK
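As a quick sanity check on the two growpart messages above, the sizes can be converted to GiB. This is a minimal sketch that assumes growpart reports sizes in 512-byte sectors (consistent with `sectsz=512` in the output); the helper name `sectors_to_gib` is hypothetical.

```python
SECTOR_BYTES = 512  # assumption: growpart sizes are counted in 512-byte sectors


def sectors_to_gib(sectors: int, sector_bytes: int = SECTOR_BYTES) -> float:
    """Convert a sector count to GiB."""
    return sectors * sector_bytes / 2**30


before = sectors_to_gib(20967391)  # partition size in both runs before growing
after = sectors_to_gib(62910431)   # partition size after the second run
print(f"partition before: {before:.1f} GiB, after: {after:.1f} GiB")
```

The partition is roughly 10 GiB before and 30 GiB after. growpart prints NOCHANGE when the partition already fills the disk it can see, so one plausible reading, not confirmed in this thread, is that the first run started before the enlarged volume was visible to the instance.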
Looking at the CloudWatch logs for the Lambda function, I can see the following error:
Traceback (most recent call last):
--
File "/var/task/index.py", line 89, in lambda_handler
waiter.wait(
File "/var/runtime/botocore/waiter.py", line 55, in wait
Waiter.wait(self, **kwargs)
File "/var/runtime/botocore/waiter.py", line 375, in wait
raise WaiterError(
botocore.exceptions.WaiterError: Waiter CommandExecuted failed: Waiter encountered a terminal failure state: For expression "Status" we matched expected path: "Failed"
https://cloudformation-custom-resource-response-apsoutheast2.s3-ap-southeast-2.amazonaws.com/arn%3Aaws%3Acloudformation%3Aap-southeast-2%3A915161342166%3Astack/eks-workshop-ide/f2fb4860-5da7-11ee-a9aa-0ad479ff0b09%7CEksWorkshopC9BootstrapInstanceLambda%7C5fa5fba5-846b-4eff-be7e-9e1d35966ca2?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230928T024000Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credential=AKIA6MM33IIZ4UIQASPA%2F20230928%2Fap-southeast-2%2Fs3%2Faws4_request&X-Amz-Signature=6e7426f05504751c46803491f7428ba1d387b573e935e4c2be330488ab9334a8
Looking at the Lambda code, I can see that if the waiter does not receive a success response, execution falls into the exception handler and the CloudFormation stack is marked as failed:
    waiter = ssm.get_waiter('command_executed')
    waiter.wait(
        CommandId=command_id,
        InstanceId=instance_id,
        WaiterConfig={
            'Delay': 10,
            'MaxAttempts': 30
        }
    )
    responseData = {'Success': 'Started bootstrapping for instance: '+instance_id}
    cfnresponse.send(event, context, status, responseData, 'CustomResourcePhysicalID')
except Exception as e:
    status = cfnresponse.FAILED
    print(traceback.format_exc())
    responseData = {'Error': traceback.format_exc(e)}
finally:
    cfnresponse.send(event, context, status, responseData, 'CustomResourcePhysicalID')
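A side note on the `except` branch: the signature of `traceback.format_exc` is `format_exc(limit=None, chain=True)`, so `traceback.format_exc(e)` passes the exception as `limit`. That matches the secondary `TypeError: '>=' not supported between instances of 'WaiterError' and 'int'` in the CloudWatch log earlier in this thread. A minimal sketch of one way to build the error payload without that problem follows; the helper name `build_error_data` is hypothetical and is not taken from the workshop code.

```python
import traceback


def build_error_data(exc: BaseException) -> dict:
    """Return a response payload containing the formatted traceback of exc."""
    # format_exception takes the exception explicitly, unlike format_exc,
    # whose first positional parameter is `limit`.
    formatted = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return {'Error': ''.join(formatted)}
```

In the handler, the line `responseData = {'Error': traceback.format_exc(e)}` could then become `responseData = build_error_data(e)`. This only removes the secondary TypeError; the stack would still be reported as failed when the waiter fails.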
I am trying to investigate why the first SSM execution failed to extend the volume while the second execution succeeded. The other option is to change the Lambda code so that the stack can still succeed when a later SSM execution succeeds after an earlier one has failed.
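The second option could look roughly like the sketch below: re-send the command when the waiter reports a failure, and only give up after a fixed number of runs. The names `run_with_retry`, `send_command` and `wait_for_command` are hypothetical; in the real Lambda they would wrap `ssm.send_command` and the `command_executed` waiter, and `retry_on` could be narrowed to `botocore.exceptions.WaiterError`.

```python
from typing import Callable, Optional, Tuple, Type


def run_with_retry(
    send_command: Callable[[], str],
    wait_for_command: Callable[[str], None],
    max_runs: int = 2,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> str:
    """Send the command up to max_runs times and return the ID of the
    first run that completes; re-raise the last error if all runs fail."""
    if max_runs < 1:
        raise ValueError("max_runs must be at least 1")

    last_error: Optional[BaseException] = None
    for attempt in range(1, max_runs + 1):
        command_id = send_command()  # start a fresh execution
        try:
            wait_for_command(command_id)  # raises if the run ends in failure
            return command_id
        except retry_on as err:
            last_error = err
            print(f"SSM run {attempt}/{max_runs} failed: {err}")
    assert last_error is not None
    raise last_error
```

Retrying hides the symptom rather than explaining it, so it is probably best treated as a stopgap until the cause of the first failure is understood.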
Installation method
Own AWS account
What happened?
I am going through the workshop setup in my own account (https://www.eksworkshop.com/docs/introduction/setup/your-account), and the eks-workshop-ide-cfn.yaml template keeps failing with the following error:
Cannot create the AWS Cloud9 environment. There was a problem connecting to the environment.
What did you expect to happen?
Template to execute successfully
How can we reproduce it?
Launch the template from here in your own account:
https://www.eksworkshop.com/docs/introduction/setup/your-account
Anything else we need to know?
No response
EKS version
This has nothing to do with the EKS cluster.