GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Regression regarding USER in 1.9.2 #2533

Open markusheiden opened 1 year ago

markusheiden commented 1 year ago

Actual behavior

When migrating from 1.9.1 to 1.9.2 (gcr.io/kaniko-project/executor:v1.9.2-debug), I get issues with USER:

error building image: error building stage: failed to execute command: identifying uid and gid for user cloudsdk: user cloudsdk is not a uid and does not exist on the system

The issue happens with 1.10.0 too.

Expected behavior

No error.

To Reproduce

Build this Dockerfile with Kaniko:

FROM google/cloud-sdk:432.0.0
USER cloudsdk
WORKDIR /home/cloudsdk/

Triage Notes for the Maintainers

[ ] Please check if this a new feature you are proposing
[x] Please check if the build works in docker but not in kaniko
[x] Please check if this error is seen when you use --cache flag
[ ] Please check if your dockerfile is a multistage dockerfile

Without the --cache option, the error still happens.

Full log:

Using docker image sha256:2a262cb26807e11c0cf9ab7671d59767c30bc382b5835f29cf5c03a5e2827f30 for gcr.io/kaniko-project/executor:v1.9.2-debug with digest gcr.io/kaniko-project/executor@sha256:964426c9205d644e2964869d1d311a05dc9f301594300d3732ea26b5733e94fc ...
$ /kaniko/executor --log-format=color --log-timestamp=true --registry-mirror=mirror.gcr.io --context ${CI_PROJECT_DIR} --dockerfile ${CI_PROJECT_DIR}/Dockerfile --no-push
INFO[2023-05-28T19:11:21Z] Using dockerignore file: /builds/adsoul/engineering/docker-images/kubernetes-toolbox/.dockerignore
INFO[2023-05-28T19:11:21Z] Retrieving image manifest google/cloud-sdk:432.0.0 
INFO[2023-05-28T19:11:21Z] Retrieving image google/cloud-sdk:432.0.0 from registry mirror mirror.gcr.io 
WARN[2023-05-28T19:11:21Z] Failed to retrieve image google/cloud-sdk:432.0.0 from registry mirror mirror.gcr.io: GET https://mirror.gcr.io/v2/google/cloud-sdk/manifests/432.0.0: MANIFEST_UNKNOWN: Failed to fetch "432.0.0" from request "/v2/google/cloud-sdk/manifests/432.0.0".. Will try with the next mirror, or fallback to the default registry. 
INFO[2023-05-28T19:11:21Z] Retrieving image google/cloud-sdk:432.0.0 from registry index.docker.io 
INFO[2023-05-28T19:11:21Z] Built cross stage deps: map[]                
INFO[2023-05-28T19:11:21Z] Retrieving image manifest google/cloud-sdk:432.0.0 
INFO[2023-05-28T19:11:21Z] Returning cached image manifest
INFO[2023-05-28T19:11:21Z] Executing 0 build triggers                   
INFO[2023-05-28T19:11:21Z] Building stage 'google/cloud-sdk:432.0.0' [idx: '0', base-idx: '-1'] 
INFO[2023-05-28T19:11:21Z] Skipping unpacking as no commands require it. 
INFO[2023-05-28T19:11:21Z] USER cloudsdk                                
INFO[2023-05-28T19:11:21Z] Cmd: USER                                    
INFO[2023-05-28T19:11:21Z] WORKDIR /home/cloudsdk/                      
INFO[2023-05-28T19:11:21Z] Cmd: workdir                                 
INFO[2023-05-28T19:11:21Z] Changed working directory to /home/cloudsdk/ 
error building image: error building stage: failed to execute command: identifying uid and gid for user cloudsdk: user cloudsdk is not a uid and does not exist on the system
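
The wording of the error ("is not a uid and does not exist on the system") reads like a two-step check: accept a numeric value as a uid, otherwise look the name up in a passwd file. The sketch below only illustrates that kind of check. It is not kaniko's code; the function name resolve_user and the optional passwd-file argument are hypothetical, and it says nothing about why the lookup fails in the affected versions.

```shell
# Hypothetical helper, not kaniko's implementation:
# resolve a USER value to a uid using a passwd-format file.
resolve_user() {
  user="$1"
  passwd_file="${2:-/etc/passwd}"

  case "$user" in
    ''|*[!0-9]*) ;;                 # empty or non-numeric: try a name lookup
    *) echo "$user"; return 0 ;;    # purely numeric: treat it as a uid
  esac

  # passwd format: name:password:uid:gid:gecos:home:shell
  uid=$(awk -F: -v u="$user" '$1 == u { print $3; exit }' "$passwd_file")
  if [ -z "$uid" ]; then
    echo "user $user is not a uid and does not exist on the system" >&2
    return 1
  fi
  echo "$uid"
}
```

Running it against the base image's own /etc/passwd (for example inside a container started from google/cloud-sdk:432.0.0) is one way to confirm whether the user named in USER really exists in the image.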

ltatakis-optaxe commented 1 year ago

We have a similar issue when using "gcr.io/kaniko-project/executor:debug" to build this Dockerfile:

FROM summerwind/actions-runner:ubuntu-22.04

USER runner
WORKDIR /usr/app

RUN curl -sL https://deb.nodesource.com/setup_18.x | sudo bash - 
RUN sudo apt-get install -y nodejs && sudo npm install --global yarn

The build fails with this error:

gcr.io/kaniko-project/executor:debug
INFO[0001] Retrieving image manifest summerwind/actions-runner:ubuntu-22.04
INFO[0001] Retrieving image summerwind/actions-runner:ubuntu-22.04 from registry index.docker.io
INFO[0002] Retrieving image manifest summerwind/actions-runner:ubuntu-22.04
INFO[0002] Returning cached image manifest
INFO[0002] Built cross stage deps: map[]
INFO[0002] Retrieving image manifest summerwind/actions-runner:ubuntu-22.04
INFO[0002] Returning cached image manifest
INFO[0002] Retrieving image manifest summerwind/actions-runner:ubuntu-22.04
INFO[0002] Returning cached image manifest
INFO[0002] Executing 0 build triggers
INFO[0002] Building stage 'summerwind/actions-runner:ubuntu-22.04' [idx: '0', base-idx: '-1']
INFO[0002] Cmd: USER
INFO[0002] Checking for cached layer europe-west2-docker.pkg.dev/repo/company-docker/company-actions-runner/cache:705e862785460a518a3b63f325e3126740b12b7.....
INFO[0004] Using caching version of cmd: RUN curl -sL https://deb.nodesource.com/setup_18.x | sudo bash -
INFO[0004] Checking for cached layer europe-west2-docker.pkg.devrepo/company-docker/company-actions-runner/cache:230c5b845fbb72af1b1222ba711ea2ab5c......
INFO[0006] Using caching version of cmd: RUN sudo apt-get install -y nodejs && sudo npm install --global yarn
INFO[0006] Skipping unpacking as no commands require it.
INFO[0006] USER runner
INFO[0006] Cmd: USER
INFO[0006] No files changed in this command, skipping snapshotting.
INFO[0006] WORKDIR /usr/app
INFO[0006] Cmd: workdir
INFO[0006] Changed working directory to /usr/app
error building image: error building stage: failed to execute command: identifying uid and gid for user runner: user runner is not a uid and does not exist on the system
ERROR
ERROR: build step 0 "gcr.io/kaniko-project/executor:debug" failed: step exited with non-zero status: 1

BronzeDeer commented 1 year ago

This might be related to #2384 and #2440, which bumped the go-containerregistry dependency by multiple minor versions between 1.9.1 and 1.9.2.

markusheiden commented 1 year ago

I tested with 1.13.0 today and the problem seems to be fixed.

aaron-prindle commented 1 year ago

Thanks for the update, @markusheiden! I'm going to close this as it appears to be fixed now.

Jitsusama commented 1 year ago

@aaron-prindle, I've just hit this same issue in 1.13.0. It does not seem to happen when no cache is available, but it does happen when a cache is available. I've hit this a few times. Downgrading to 1.9.1 fixes the issue. I think this should be re-opened.

bizrad commented 11 months ago

I'm seeing this issue as well with 1.16.0. Here are my steps to reproduce. Without WORKDIR /app the build works. The same build works with docker, and the same steps with v1.9.1-debug result in a successful build.

$ docker run -it --rm --entrypoint sh gcr.io/kaniko-project/executor:v1.16.0-debug
/workspace # touch foo
/workspace # echo -e "FROM ubuntu:latest\nUSER test\nWORKDIR /app\nCOPY foo ." >Dockerfile
/workspace # /kaniko/executor --context $PWD --dockerfile Dockerfile --no-push
INFO[0000] Retrieving image manifest ubuntu:latest
INFO[0000] Retrieving image ubuntu:latest from registry index.docker.io
INFO[0000] Built cross stage deps: map[]
INFO[0000] Retrieving image manifest ubuntu:latest
INFO[0000] Returning cached image manifest
INFO[0000] Executing 0 build triggers
INFO[0000] Building stage 'ubuntu:latest' [idx: '0', base-idx: '-1']
INFO[0000] Unpacking rootfs as cmd COPY foo . requires it.
INFO[0002] USER test
INFO[0002] Cmd: USER
INFO[0002] WORKDIR /app
INFO[0002] Cmd: workdir
INFO[0002] Changed working directory to /app
error building image: error building stage: failed to execute command: identifying uid and gid for user test: user test is not a uid and does not exist on the system
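
Every reproduction in this thread has a WORKDIR that follows a USER in the same stage, and the build above is reported to succeed once WORKDIR is removed. A small sketch for spotting that pattern in a Dockerfile follows. The function name find_user_then_workdir is hypothetical, and the check is a plain text scan that does not expand ARG or ENV values.

```shell
# Hypothetical helper: list WORKDIR lines that come after a USER
# instruction within the same build stage.
find_user_then_workdir() {
  awk '
    toupper($1) == "FROM"    { user = "" }      # a new stage resets the user
    toupper($1) == "USER"    { user = $2 }
    toupper($1) == "WORKDIR" && user != "" {
      printf "%s:%d: WORKDIR after USER %s\n", FILENAME, FNR, user
    }
  ' "$1"
}
```

Usage: find_user_then_workdir Dockerfile. An empty result means no WORKDIR follows a USER in any stage of that file.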

Nicolas-Peiffer commented 8 months ago

I experience the same bug on Kaniko 1.19.2 (executor image: gcr.io/kaniko-project/executor:v1.19.2).

The Kaniko error logs from the GitHub Actions dashboard look like this:

INFO[0087] WORKDIR ${APP_DIR}                           
INFO[0087] Cmd: workdir                                 
INFO[0087] Changed working directory to /home/poetryuser/falco-mitre-checker 
DEBU[0087] Fetching uid and gid for USER 'poetryuser'   
error building image: error building stage: failed to execute command: identifying uid and gid for user poetryuser: user poetryuser is not a uid and does not exist on the system
##[error]Error: The process '/usr/bin/docker' failed with exit code 1

Link to my CI pipeline on GitHub: https://github.com/ThalesGroup/rules/actions/runs/7557411837/job/20576570525

My Containerfile looks like this:

[...]
# create a non-root user since this app does not need root privileges
RUN addgroup \
    --gid ${APP_GROUP_GID} \
    ${APP_GROUP} \
 && adduser \
    --uid ${APP_USER_UID} \
    --gid ${APP_GROUP_GID} \
    --shell /bin/bash \
    --disabled-login \
    --disabled-password \
    --gecos "User for poetry app" \
    ${APP_USER}

[...]

# PROJECT_DIR and APP_DIR are defined above
ARG APP_DIR
ARG APP_USER

# use non-root user
USER ${APP_USER}

WORKDIR ${APP_DIR}

ENTRYPOINT [ "python", "-m", "falco_mitre_checker"]
[...]

It is worth mentioning that when I build the same Containerfile with podman build, I have no problem: the container build is a success and the container image works fine.

dfkunstler commented 3 months ago

Is there any workaround or ECD on this issue? As it stands, I'm stuck using v1.9.1.

CptnFizzbin commented 3 months ago

Also ran into this bug, but we're unable to downgrade to 1.9.1.

Can confirm it's related to building the container while a cache is available. Disabling the cache allows the build to run.
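
A minimal sketch of that workaround, reusing the executor flags from the reproduction commands earlier in this thread. The wrapper name build_without_cache is hypothetical, and --cache=false is spelled out only to make the intent explicit. The original report still failed without --cache, so this may not help in every setup.

```shell
# Hypothetical wrapper: run the kaniko executor with layer caching turned off.
# $1 is the build context directory, which is assumed to contain the Dockerfile.
build_without_cache() {
  /kaniko/executor \
    --context "$1" \
    --dockerfile "$1/Dockerfile" \
    --no-push \
    --cache=false
}
```

Inside the executor image this would be called as build_without_cache "$PWD".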

kalihman commented 3 weeks ago

Running into the same issue. Any ETA for the fix? Any workaround?