Closed wesbragagt closed 3 months ago
I saw this issue in the buildkit repo https://github.com/moby/buildkit/issues/3947
Additionally from the buildkit server pod from a different build that I attempted: msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to push 730335560480.dkr.ecr.us-west-2.amazonaws.com/reconciliation-engine:ac488ec3015f6e6a73dc8e4a3dc547c760d4078e-amd64: unexpected status from HEAD request to https://730335560480.dkr.ecr.us-west-2.amazonaws.com/v2/reconciliation-engine/blobs/sha256:83989655d54c90de3d819ce6788e7864338ea3bbaf3d825dbbbd8f0059745a5d: 401 Unauthorized
@fullykubed could the service account associated with the IAM role be bound to a role that no longer exists after a destroy and re-apply of the same stack? That's the only cause I can think of, since I didn't have this issue when I first applied the CICD stack. I'm gonna try commenting out the login logic to see if I get a different error, which would at least confirm that buildctl is actually using the /.docker/config.json file, and I'll also verify the STS caller identity.
After further investigation, I've confirmed that the STS caller identity is the expected IAM role that was created. I've also confirmed that the role and the service account get deleted when the workflow is destroyed. I also tried a brand new ECR repo created with the panfactum stack module aws_ecr_repos and still got a 401. I'm now extending wf_dockerfile_build in order to debug scripts/build.sh.
Found this issue https://github.com/moby/buildkit/issues/2136 which led me to tweak the script as shown below, but no success 😞. I'm still getting:
build-implentio-api-2gsqp-build-images-3289723871: #22 exporting config sha256:199e2eb85164379735b791d2b85f51b3455d619c089252c80c04090d073e16d8 done
build-implentio-api-2gsqp-build-images-3289723871: #22 pushing layers 0.0s done
build-implentio-api-2gsqp-build-images-3289723871: #22 ERROR: failed to push 730335560480.dkr.ecr.us-west-2.amazonaws.com/api:2a4361de4e5a813f44ad1c882078fedbde196714-arm64: unexpected status from HEAD request to https://730335560480.dkr.ecr.us-west-2.amazonaws.com/v2/api/blobs/sha256:f4691ee707061ff9d48b30141bf04f1b16129de954e7c2c12b382f696d437224: 401 Unauthorized
build-implentio-api-2gsqp-build-images-3289723871: ------
build-implentio-api-2gsqp-build-images-3289723871: > exporting to image:
build-implentio-api-2gsqp-build-images-3289723871: ------
build-implentio-api-2gsqp-build-images-3289723871: error: failed to solve: failed to push 730335560480.dkr.ecr.us-west-2.amazonaws.com/api:2a4361de4e5a813f44ad1c882078fedbde196714-arm64: unexpected status from HEAD request to https://730335560480.dkr.ecr.us-west-2.amazonaws.com/v2/api/blobs/sha256:f4691ee707061ff9d48b30141bf04f1b16129de954e7c2c12b382f696d437224: 401 Unauthorized
# /scripts/build.sh
#!/usr/bin/env bash
set -eo pipefail
###########################################################
## Step 1: CD to the codebase
###########################################################
cd /code/repo || exit
###########################################################
## Step 2: Set the image tag as the commit sha
###########################################################
TAG="$(git rev-parse "$GIT_REF")-$ARCH"
###########################################################
## Step 3: Get BuildKit address
###########################################################
BUILDKIT_HOST=$(pf-buildkit-get-address --arch="$ARCH")
export BUILDKIT_HOST
###########################################################
## Step 4: Get the ECR credentials
###########################################################
# DEBUG AWS permissions
aws sts get-caller-identity
printenv
echo "Generating docker config"
ECR_PASSWORD=$(aws ecr get-login-password --region "$IMAGE_REGION")
AUTH_TOKEN=$(echo "AWS:$ECR_PASSWORD" | base64 --wrap=0)
DOCKER_HUB_USER_BASE64=$(echo "$DOCKER_HUB_USER" | base64 --wrap=0)
DOCKER_HUB_PASSWORD_BASE64=$(echo "$DOCKER_HUB_PASSWORD" | base64 --wrap=0)
cat >"/.docker/config.json" <<EOF
{
  "auths": {
    "$IMAGE_REGISTRY": {
      "auth": "$AUTH_TOKEN"
    },
    "$DOCKER_HUB_REGISTRY": {
      "auth": "$DOCKER_HUB_USER_BASE64:$DOCKER_HUB_PASSWORD_BASE64"
    }
  }
}
EOF
###########################################################
## Step 5: Record the build
###########################################################
pf-buildkit-record-build --arch="$ARCH"
###########################################################
## Step 6: Build the image
###########################################################
# shellcheck disable=SC2086
buildctl \
  build \
  --frontend=dockerfile.v0 \
  --output "type=image,name=$IMAGE_REGISTRY/$IMAGE_REPO:$TAG,push=$PUSH_IMAGE" \
  --local context="$BUILD_CONTEXT" \
  --local dockerfile="$(dirname "$DOCKERFILE_PATH")" \
  --opt filename="./$(basename "$DOCKERFILE_PATH")" \
  $SECRET_ARGS \
  $BUILD_ARGS \
  --export-cache "type=s3,region=$BUILDKIT_BUCKET_REGION,bucket=$BUILDKIT_BUCKET_NAME,name=$IMAGE_REGISTRY/$IMAGE_REPO" \
  --import-cache "type=s3,region=$BUILDKIT_BUCKET_REGION,bucket=$BUILDKIT_BUCKET_NAME,name=$IMAGE_REGISTRY/$IMAGE_REPO" || tail -f /dev/null # allow shell into container for further debugging
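A side note on Step 4, offered as a sketch rather than a diagnosis (the 401 in this thread turned out to have a different cause): Docker's config.json expects each "auth" value to be the base64 encoding of username:password as one string, and echo appends a newline that ends up inside the encoded value. The helper name make_auth and the placeholder credentials below are hypothetical and are not taken from the Panfactum module.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the value for an "auth" field in config.json.
make_auth() {
  local user="$1" password="$2"
  # printf '%s' emits no trailing newline; tr removes any line wrapping from base64.
  printf '%s:%s' "$user" "$password" | base64 | tr -d '\n'
}

# Placeholder credentials, for illustration only.
token="$(make_auth "AWS" "example-password")"

# Round-trip to confirm the value decodes back to user:password.
printf '%s' "$token" | base64 --decode
echo
```

Whether a registry tolerates the stray newline or a split user/password encoding may vary, so treat this as something to rule out while debugging, not as a known fix.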
My issue was solved by passing DOCKER_CONFIG=/.docker instead of DOCKER_CONFIG=/.docker/config.json. DOCKER_CONFIG has to point at the directory that contains config.json, not at the file itself. Small oversight on my end. I hope it helps someone else.
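For anyone hitting the same 401, here is a minimal sketch of a pre-flight check that could run before buildctl. It assumes the client resolves credentials from the DOCKER_CONFIG directory, falling back to ~/.docker when the variable is unset; the helper name check_docker_config is hypothetical and not part of wf_dockerfile_build.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verifies that DOCKER_CONFIG names a directory
# that contains a config.json file.
check_docker_config() {
  local dir="${1:-${DOCKER_CONFIG:-$HOME/.docker}}"
  if [ ! -d "$dir" ]; then
    echo "DOCKER_CONFIG must be a directory, got: $dir" >&2
    return 1
  fi
  if [ ! -f "$dir/config.json" ]; then
    echo "no config.json found in: $dir" >&2
    return 1
  fi
  echo "using credentials from $dir/config.json"
}

# Demo against a throwaway directory.
demo_dir="$(mktemp -d)"
echo '{"auths":{}}' > "$demo_dir/config.json"

check_docker_config "$demo_dir"                      # passes: a directory with config.json
check_docker_config "$demo_dir/config.json" || true  # fails: a file, not a directory
```

Calling the helper with no argument checks whatever DOCKER_CONFIG is currently set to, which makes the file-versus-directory mistake visible before the push step.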
@wesbragagt A little confused on what ended up being the problem and solution here.
Primarily just making sure there isn't an issue with the module. Are you saying here that the module did not work out of the box and that you had to override something? Or are you saying that you overrode a default value and that caused the issue?
@fullykubed the wf_dockerfile_build did not work out of the box for me so I had to copy and extend it to add the DOCKER_CONFIG env variable pointing to the /.docker folder.
@wesbragagt Gotchya. I believe the version in the latest release does work without the need for any modification.
Can you check?
@fullykubed I haven't gotten to the push step yet, but I opened a bug report for another problem that I had to work around by extending the module: https://github.com/Panfactum/stack/issues/106.
What is your question?
When running my workloads in Argo to build and push a container image using the stack module wf_dockerfile_build, I keep getting a 401 Unauthorized error when the workflow tries to push to its container registry. Where else could I look to debug this?
Steps I've attempted without success:
Logs:
What primary components of the stack does this relate to?
terraform, reference