CircleCI-Public / aws-ecr-orb

CircleCI orb for interacting with Amazon's Elastic Container Registry (ECR)
https://circleci.com/orbs/registry/orb/circleci/aws-ecr
MIT License

memory (or CPU) problem on aws-ecr job #281

Closed blagae closed 1 year ago

blagae commented 1 year ago

Orb version

8.2.1

What happened

When I run certain types of builds with the AWS-ECR orb, the Docker build step has recently started failing with errors that point to memory and/or threading problems. This seems to happen irrespective of the executor's resource class (which I tried setting to large).

Below is a build failure for a new project that has a very simple Dockerfile:

FROM quay.io/keycloak/keycloak:21.1.1 AS common
ENV KC_DB=postgres
ENV KC_HOSTNAME=localhost

FROM common AS builder
ENV KC_HEALTH_ENABLED=true
ENV KC_METRICS_ENABLED=true
WORKDIR /opt/keycloak
RUN keytool -genkeypair -storepass password -storetype PKCS12 -keyalg RSA -keysize 2048 -dname "CN=${KC_HOSTNAME}" -alias server -ext "SAN:c=DNS:localhost,IP:127.0.0.1" -keystore conf/server.keystore

Below is the error, which mentions that the process can't create a thread, but also that it ran out of memory.

#6 [builder 2/3] RUN keytool -genkeypair -storepass password -storetype PKCS12 -keyalg RSA -keysize 2048 -dname "CN=localhost" -alias server -ext "SAN:c=DNS:localhost,IP:127.0.0.1" -keystore conf/server.keystore
#6 sha256:3ec203985ac42e2a912b42bd75ece3acea2f4a5b4db666aedba7d429db11ed56
#6 0.252 [0.002s][warning][os,thread] Failed to start thread "GC Thread#0" - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
#6 0.252 #
#6 0.252 # There is insufficient memory for the Java Runtime Environment to continue.
#6 0.252 # Cannot create worker GC thread. Out of system resources.
#6 0.252 # An error report file with more information is saved as:
#6 0.252 # /opt/keycloak/hs_err_pid1.log

On another project (which used to build fine), pip fails to start a thread for no apparent reason:

#8 [4/5] RUN pip install -r requirements.txt
#8 <OMITTING LARGE STACK TRACE>
#8 1.583   File "/usr/local/lib/python3.10/threading.py", line 935, in start
#8 1.583     _start_new_thread(self._bootstrap, ())
#8 1.583 RuntimeError: can't start new thread
#8 1.675 
#8 1.675 [notice] A new release of pip is available: 23.0.1 -> 23.1.2
#8 1.675 [notice] To update, run: pip install --upgrade pip
#8 ERROR: executor failed running [/bin/sh -c pip install -r requirements.txt]: exit code: 2

My CircleCI config file is pretty simple, so I don't think there's anything wrong with it:

workflows:
  full:
    jobs:
      - aws-ecr/build-and-push-image:
          name: << pipeline.parameters.image-name >>-build-and-push
          context: AWS
          create-repo: true
          executor:
            name: aws-ecr/default
            # I tried setting a custom resource class, a different image, etc, to no avail
          repo: << pipeline.parameters.image-name >>
          extra-build-args: --memory=4G --cpuset-cpus=2 # these are also some attempts that didn't have any effect

It seems to me, but I am definitely not sure, that this started occurring on builds after the recent changes that CircleCI made to the execution environments. The Python project definitely used to build correctly in early June.

FYI I am also submitting this to the official CircleCI support channel, because I don't know if this is a them problem or a you problem.

Expected behavior

Builds should succeed.

blagae commented 1 year ago

A very helpful support person at CircleCI asked me to try defining the default executor with a newer image:

      - aws-ecr/build-and-push-image:
          executor:
            name: aws-ecr/default
            image: "ubuntu-2004:2023.04.2"

This turned out to work for both projects (Java and Python). I then looked at this repo and saw that PR #270 exists and has been merged into the master branch, which fixes another problem in roughly the same way. I also tried manually setting the image from that PR, which is ubuntu-2004:2022.04.1, and the builds succeed with it as well. That means that my problem has already been fixed on master.

As a result, I would like to ask the maintainers of this repo to make a new release of this orb. In the meantime, I am no longer blocked, thanks to the workaround above.

brivu commented 1 year ago

Hey @blagae,

Thanks for opening this issue and providing all of your details. We're working on closing out all these issues and will cut a new release for this orb very soon.

Thanks again! Brian

TaxBusby commented 1 year ago

This issue appears to still be affecting users. We are hitting this issue as well. I would recommend leaving the issue "Open" so it's easier for others to find this workaround.

We are also on the latest orb version, 8.2.1, and were using aws-ecr/default as the executor. The pip install step inside our Dockerfile build was giving us:

RuntimeError: can't start new thread

We applied the workaround described by @blagae here and it has fixed the issue for us.
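
For anyone finding this thread later, here is a minimal sketch of a complete config with the workaround applied. It only combines the snippets already posted above. The image-name pipeline parameter declaration and its default value (my-service) are assumptions added so the file stands on its own, and the AWS context is assumed to hold the credentials the orb expects. Treat it as a starting point, not a verified config.

```yaml
version: 2.1

orbs:
  aws-ecr: circleci/aws-ecr@8.2.1

parameters:
  # Hypothetical declaration; the original config references this
  # parameter but does not show how it is defined.
  image-name:
    type: string
    default: my-service

workflows:
  full:
    jobs:
      - aws-ecr/build-and-push-image:
          name: << pipeline.parameters.image-name >>-build-and-push
          context: AWS
          create-repo: true
          executor:
            name: aws-ecr/default
            # Workaround from this thread: override the orb's default
            # machine image with a newer one.
            image: "ubuntu-2004:2023.04.2"
          repo: << pipeline.parameters.image-name >>
```

Once a new orb release ships with the updated default image, the image override should no longer be needed and the executor can go back to plain aws-ecr/default.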