aws-cloudformation / cloudformation-cli-python-plugin

The CloudFormation Provider Development Toolkit Python Plugin allows you to autogenerate Python code based on an input schema.
Apache License 2.0

Unable to submit resources because docker doesn't return logs #268

Open Giaco9 opened 9 months ago

Giaco9 commented 9 months ago

Hi, we have an issue running cfn-cli in our CI/CD system (based on Jenkins) when building and deploying a new version of our resources. It happens while updating an existing resource, but I don't think that makes any difference, because the problem is related to the logs of the first container, the one that installs the dependencies needed to build the resource.

Below, I've attached all the information I have, in separate files, to help debug the issue.

Instance setup: instance-setup.txt

The command we run:

/usr/local/bin/cfn submit --region us-west-2 -v -v -v --set-default --use-docker

The error:

Build running. Output:
Unhandled exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/rpdk/core/cli.py", line 105, in main
    args.command(args)
  File "/usr/local/lib/python3.7/site-packages/rpdk/core/submit.py", line 26, in submit
    args.profile,
  File "/usr/local/lib/python3.7/site-packages/rpdk/core/project.py", line 652, in submit
    self._add_resources_content_to_zip(zip_file)
  File "/usr/local/lib/python3.7/site-packages/rpdk/core/project.py", line 686, in _add_resources_content_to_zip
    self._plugin.package(self, zip_file)
  File "/usr/local/lib/python3.7/site-packages/rpdk/python/codegen.py", line 291, in package
    self._build(project.root)
  File "/usr/local/lib/python3.7/site-packages/rpdk/python/codegen.py", line 308, in _build
    self._docker_build(base_path)
  File "/usr/local/lib/python3.7/site-packages/rpdk/python/codegen.py", line 392, in _docker_build
    for line in logs:
TypeError: 'NoneType' object is not iterable
=== Unhandled exception ===
Please report this issue to the team.
Issue tracker: github.com/aws-cloudformation/cloudformation-cli/issues
Please include the log file 'rpdk.log'

The full output: rpdk.log

I tried to understand what was happening, and I saw that the submit command started and the first container to install the dependencies ran successfully. I collected the following logs from that container:

Fetch the logs of a container
Collecting cloudformation-cli-python-lib==2.1.5
  Downloading cloudformation_cli_python_lib-2.1.5-py3-none-any.whl (18 kB)
Collecting boto3>=1.10.20
  Downloading boto3-1.33.13-py3-none-any.whl (139 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.3/139.3 kB 8.4 MB/s eta 0:00:00
Collecting jmespath<2.0.0,>=0.7.1
  Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Collecting botocore<1.34.0,>=1.33.13
  Downloading botocore-1.33.13-py3-none-any.whl (11.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 191.1 MB/s eta 0:00:00
Collecting s3transfer<0.9.0,>=0.8.2
  Downloading s3transfer-0.8.2-py3-none-any.whl (82 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 82.0/82.0 kB 174.0 MB/s eta 0:00:00
Collecting python-dateutil<3.0.0,>=2.1
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 kB 175.7 MB/s eta 0:00:00
Collecting urllib3<1.27,>=1.25.4
  Downloading urllib3-1.26.18-py2.py3-none-any.whl (143 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 143.8/143.8 kB 238.2 MB/s eta 0:00:00
Collecting six>=1.5
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: urllib3, six, jmespath, python-dateutil, botocore, s3transfer, boto3, cloudformation-cli-python-lib
Successfully installed boto3-1.33.13 botocore-1.33.13 cloudformation-cli-python-lib-2.1.5 jmespath-1.0.1 python-dateutil-2.8.2 s3transfer-0.8.2 six-1.16.0 urllib3-1.26.18

The container exits with code 0, but the pipeline doesn't show the container logs: the build fails at the line "for line in logs:" in _docker_build because the variable logs is None. Because of that, the whole procedure fails with the output above.
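
To illustrate the failure mode, here is a minimal sketch of a guard that tolerates a missing log stream. It assumes the stream can legitimately be None when the container has already finished; the helper name iter_log_lines is hypothetical and is not part of rpdk or the Docker SDK, and this is not the plugin's actual code.

```python
from typing import Iterable, Iterator, Optional


def iter_log_lines(logs: Optional[Iterable[bytes]]) -> Iterator[str]:
    """Yield decoded log lines, treating a missing stream (None) as empty."""
    if logs is None:
        # No stream was returned, so there is nothing to print.
        return
    for line in logs:
        yield line.decode("utf-8", errors="replace").rstrip("\n")


# None no longer raises TypeError; it simply yields nothing.
print(list(iter_log_lines(None)))
print(list(iter_log_lines([b"Collecting six>=1.5\n"])))
```

With a guard like this the build would continue without printing output instead of aborting, which seems reasonable here since the container itself exited with code 0.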

Right before the Docker daemon removed the container, I was able to inspect it twice: first while it was running, then after it had stopped.

Container running: container-running.txt

Container removing: container-removing.txt
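
When comparing inspect dumps like the two above, one field that may be worth checking is the container's logging driver, since whether logs can be read back from the daemon can depend on it. This is only an assumption; nothing in this thread confirms that the driver is the cause. The sketch below reads HostConfig.LogConfig.Type from "docker inspect" JSON. The function name logging_driver and the sample data are hypothetical and are not taken from the attached files.

```python
import json
from typing import Optional


def logging_driver(inspect_json: str) -> Optional[str]:
    """Return HostConfig.LogConfig.Type from `docker inspect` output, if present."""
    data = json.loads(inspect_json)
    # `docker inspect` prints a JSON array with one object per container.
    if isinstance(data, list):
        if not data:
            return None
        data = data[0]
    return data.get("HostConfig", {}).get("LogConfig", {}).get("Type")


# Hypothetical, heavily trimmed sample; not taken from the attached files.
sample = '[{"HostConfig": {"LogConfig": {"Type": "json-file", "Config": {}}}}]'
print(logging_driver(sample))
```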

The resource definition: resource.json

Thank you for the help

Scribbd commented 3 weeks ago

I can confirm this is also the case for GitLab CI.

Using docker:dind as a service in GitLab CI also masks the logs and leaves this script waiting indefinitely for output that has already been produced.