GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

error building image: error building stage: lstat /tmp/apt-key-gpghome.VoPBz66R2g/gnupg_spawn_agent_sentinel.lock: no such file or directory #769

Open olivierboudet opened 5 years ago

olivierboudet commented 5 years ago

Actual behavior: I am trying to build a Dockerfile that builds correctly with the Docker daemon, but with kaniko I get this error:

Setting up google-chrome-stable (77.0.3865.75-1) ...
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/gnome-www-browser (gnome-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/google-chrome (google-chrome) in auto mode
Processing triggers for libc-bin (2.24-11+deb9u4) ...
Processing triggers for libgdk-pixbuf2.0-0:amd64 (2.36.5-2+deb9u2) ...
INFO[0278] Taking snapshot of full filesystem...        
INFO[0279] Adding whiteout for /var/lib/apt/lists       
error building image: error building stage: lstat /tmp/apt-key-gpghome.7n7iliD9iR/gnupg_spawn_agent_sentinel.lock: no such file or directory

Expected behavior: Since it works with the Docker daemon, I expect the build to succeed with kaniko as well, without changing the Dockerfile.

To Reproduce: Use this Dockerfile in a builder-node directory:

FROM google/cloud-sdk:262.0.0-slim

RUN apt-get update && apt-get install --yes curl && \
    curl -sL https://deb.nodesource.com/setup_10.x | bash - && \
    apt-get install -y nodejs

RUN echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/chrome.list && \
    curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg |  apt-key add - && \
    echo "deb https://dl.yarnpkg.com/debian/ stable main" |  tee /etc/apt/sources.list.d/yarn.list && \
    apt-get update && \
    apt-get install --no-install-recommends -y software-properties-common && \
    # install packages via apt
    apt-get install --no-install-recommends -y --allow-unauthenticated unzip google-chrome-stable yarn build-essential && \
    apt-get clean && \
    rm -rf /etc/apt/sources.list.d/chrome.list /var/lib/apt/lists

Run it:

run_in_docker.sh Dockerfile $(pwd)/builder-node gcr.io/myproject/builder-node:kaniko false

Additional Information

juicemia commented 4 years ago

I'm seeing the same issue with the debug tag. I'm following the instructions for building images in GitLab CI.

Running the following:

for dir in $(ls -p contexts | grep -i '/')
do
    tag=$(echo $dir | cut -d'/' -f1)

    echo "building $tag..."
    /kaniko/executor --context $CI_PROJECT_DIR/contexts/$tag/ \
        --dockerfile $CI_PROJECT_DIR/contexts/$tag/Dockerfile \
        --destination $repo/google-chrome:$tag \
        --no-push
done

Folder structure looks like the following:

.
├── CONTRIBUTING.md
├── README.md
├── ci
│   ├── build.sh
│   └── build_and_push.sh
└── contexts
    └── 77.0.3865.90
        └── Dockerfile

fabn commented 4 years ago

Getting the same kind of error here:

update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/gnome-www-browser (gnome-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/google-chrome (google-chrome) in auto mode
Processing triggers for libc-bin (2.24-11+deb9u4) ...
INFO[0254] Taking snapshot of full filesystem...        
error building image: error building stage: lstat /tmp/apt-key-gpghome.4lNiMJ5oLl/pubring.kbx: no such file or directory

Any solution?

tejal29 commented 4 years ago

Thanks @fabn, I will take a look at this.

Neonox31 commented 4 years ago

I have the same problem, also with the debug tag:

Setting up google-chrome-stable (76.0.3809.132-1) ...
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/gnome-www-browser (gnome-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/google-chrome (google-chrome) in auto mode
Setting up liblwp-protocol-https-perl (6.06-2) ...
Setting up libwww-perl (6.15-1) ...
Setting up libxml-parser-perl (2.44-2+b1) ...
Setting up libxml-twig-perl (1:3.50-1) ...
Setting up libnet-dbus-perl (1.1.0-4+b1) ...
Processing triggers for libc-bin (2.24-11+deb9u4) ...

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
0 upgraded, 0 newly installed, 0 to remove and 15 not upgraded.

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

INFO[0053] Taking snapshot of full filesystem...        
error building image: error building stage: lstat /tmp/apt-key-gpghome.X3u0zrcS5t/pubring.orig.gpg: no such file or directory

I suspect the error is related to the Chrome installation inside the image, but I'm not sure.

fabn commented 4 years ago

@Neonox31 That was my first thought, but the same Dockerfile builds with native Docker with no issues.

Neonox31 commented 4 years ago

@fabn Yes, but I think the conflict is between kaniko and Chrome.

EppO commented 4 years ago

I can confirm it's about the Chrome install. When I just remove Chrome (google-chrome-unstable in my case) from the list of packages installed by apt-get, kaniko is able to build the image.

nielsvanvelzen commented 4 years ago

I'm experiencing a similar issue with a Gradle container:

INFO[0091] Taking snapshot of full filesystem...        
error building image: error building stage: Failed to get file info for /root/.kotlin/daemon/kotlin-daemon.2019-10-29T08-34-55.719Z.7338a1b69672dd00d0aa900c1e9f04a7.17450.run: lstat /root/.kotlin/daemon/kotlin-daemon.2019-10-29T08-34-55.719Z.7338a1b69672dd00d0aa900c1e9f04a7.17450.run: no such file or directory

When I enabled debug logging, the issue disappeared. I suspect it might be a race condition, but I don't have the experience to investigate further.
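The suspected race can be sketched in plain shell. This is not kaniko's code; it only assumes that a snapshot first collects a list of paths and inspects each one later, so that a file removed in between (for example a lock or .run file cleaned up by a lingering daemon) produces the same "no such file or directory" failure. The function names `check_listed` and `keep_existing` are hypothetical.

```shell
# Illustration only: a two-phase "list, then inspect" walk, not kaniko's code.

# Strict variant: read paths on stdin and fail on the first one that has
# disappeared since the list was taken (mirrors the lstat error above).
check_listed() {
    while IFS= read -r path; do
        if [ ! -e "$path" ] && [ ! -L "$path" ]; then
            echo "lstat $path: no such file or directory" >&2
            return 1
        fi
    done
}

# Tolerant variant: print the paths that still exist and skip the rest.
keep_existing() {
    while IFS= read -r path; do
        if [ -e "$path" ] || [ -L "$path" ]; then
            printf '%s\n' "$path"
        else
            echo "skipping vanished file: $path" >&2
        fi
    done
}
```

For example, feeding a previously saved file list to `check_listed` fails as soon as one listed file is gone, while `keep_existing` passes the surviving paths through and only logs the missing ones.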

tejal29 commented 4 years ago

Sorry @EppO and @fabn, this fell off my radar.

tejal29 commented 4 years ago

I will take a look at this tomorrow.

caseycs commented 4 years ago

Exactly the same thing while building a huge (3 GB) container with Gradle and the Android SDK:

INFO[0454] Taking snapshot of full filesystem...        
error building image: error building stage: lstat /tmp/hsperfdata_root/6383: no such file or directory

HerrmannHinz commented 4 years ago

Any progress on this one? I can't use kaniko at the moment for building images from CI/CD. I'm using the latest kaniko:debug image from gcr.io.

jandillmann commented 4 years ago

Same error here, I'm trying to build a danlynn/ember-cli image on GitLab CI…

drshrey commented 4 years ago

@jandillmann @fabn I was getting this same issue when installing google-chrome-stable using kaniko within GitLab CI. I was able to work around it for now by inserting this instruction before any apt-get install operations:

RUN apt-get clean \
 && cd /var/lib/apt \
 && mv lists lists.old \
 && mkdir -p lists/partial \
 && apt-get update \
 && apt-get upgrade -y   

So now my entire Dockerfile looks like this:

FROM node:12.8.0

ENV JAVA_HOME /usr/lib/jvm/java-8-oracle

RUN apt-get clean \
 && cd /var/lib/apt \
 && mv lists lists.old \
 && mkdir -p lists/partial \
 && apt-get update \
 && apt-get upgrade -y     

RUN cd /tmp && \
    wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb && \
    dpkg -i google-chrome-stable_current_amd64.deb; apt-get update; apt-get install -y -f

RUN mkdir -p /code
WORKDIR /code
COPY . /code

RUN npm install -g -s --no-progress yarn && \
    yarn global add nightwatch && \
    yarn add babel-plugin-add-module-exports babel-preset-es2015 -D && \
    yarn

EXPOSE 4444

I'm not exactly sure why we need to clean out the lists directory beforehand, but the source of the solution comes from the owner of this issue: https://github.com/GoogleContainerTools/kaniko/issues/793

tejal29 commented 4 years ago

Hey @olivierboudet, @jandillmann, @HerrmannHinz, @drshrey, @EppO, @Neonox31, @fabn, @nielsdenissen. Sorry this took so long. I have a fix for this and verified your Dockerfile on my branch: https://github.com/GoogleContainerTools/kaniko/pull/1000

Thanks again! Please let me know if you want an image to verify this works on your side.

olivierboudet commented 4 years ago

@tejal29 I would like an image with this fix and, if possible, the one in #793, so I can check whether my Dockerfile builds with it.

Thanks

EppO commented 4 years ago

@tejal29: I can test it on my side too if this helps, thanks!

tejal29 commented 4 years ago

Thanks a lot @EppO. I pushed the following images with the fix:

gcr.io/kaniko-project/executor:fix_769
gcr.io/kaniko-project/executor:debug_769

tejal29 commented 4 years ago

@olivierboudet I did verify your Dockerfile on my end and it works. Do you want to give it a try with these images?

gcr.io/kaniko-project/executor:fix_769
gcr.io/kaniko-project/executor:debug_769

drshrey commented 4 years ago

@tejal29 I wasn't able to pull down your debug_769 image for some reason, but fix_769 works for me. Will test it out and let you know if it fixes the issue. Thanks!

EDIT: Never mind, GitLab doesn't support non-debug_* images because GitLab CI requires images to have a shell :(. See https://github.com/GoogleContainerTools/kaniko/issues/430

olivierboudet commented 4 years ago

> @olivierboudet I did verify your Dockerfile on my end and it works. Do you want to give it a try with these images?
>
> gcr.io/kaniko-project/executor:fix_769
> gcr.io/kaniko-project/executor:debug_769

Yes, it works on my side too, thanks.

EppO commented 4 years ago

> Thanks a lot @EppO. I pushed the following images with the fix:
>
> gcr.io/kaniko-project/executor:fix_769
> gcr.io/kaniko-project/executor:debug_769

I'm using the kaniko debug image in a GitLab pipeline that needs a shell to inject the custom executor command, and I can't pull gcr.io/kaniko-project/executor:debug_769 for some reason. It doesn't show up in the GCR registry.

tejal29 commented 4 years ago

@EppO and @drshrey I pushed an image without the debug_ prefix in its tag. Thanks!

docker tag gcr.io/kaniko-project/executor:debug_769 gcr.io/kaniko-project/debug:769
docker push gcr.io/kaniko-project/debug:769

EppO commented 4 years ago

Job succeeded! It works! Thanks a lot for your fix

drshrey commented 4 years ago

Thank you @tejal29, all is well now!

tejal29 commented 4 years ago

Great! This fix will be available in the next release in about an hour!

tejal29 commented 4 years ago

Sorry, I am going to push the release to Monday since I could not get it done in the morning.

Thanks, Tejal

tejal29 commented 4 years ago

Hey folks, Release v0.17.0 is now up! Please use the latest image and let us know if you still see this issue!

Thank you for your patience!

bimargulies-google commented 4 years ago

I just hit this ...

INFO[0008] Unpacking rootfs as cmd RUN gcc --version requires it. 
error building image: error building stage: failed to get filesystem from image: mkdir /usr/lib/jvm/default-jvm: file exists

I'm using GCB,

steps:
- name: 'gcr.io/kaniko-project/executor:latest'
  args:
  - --destination=gcr.io/$PROJECT_ID/alpine-emulator
  - --cache=true
  - --cache-ttl=24h

cvgw commented 4 years ago

> I just hit this ...
>
> INFO[0008] Unpacking rootfs as cmd RUN gcc --version requires it.
> error building image: error building stage: failed to get filesystem from image: mkdir /usr/lib/jvm/default-jvm: file exists
>
> I'm using GCB,
>
> steps:
> - name: 'gcr.io/kaniko-project/executor:latest'
>   args:
>   - --destination=gcr.io/$PROJECT_ID/alpine-emulator
>   - --cache=true
>   - --cache-ttl=24h

That may be the same as #830.

elnygren commented 4 years ago

@nielsvanvelzen did you ever manage to solve the Kotlin daemon issue? I keep getting that same error nondeterministically in random commits that don't change the Cloud Build, kaniko, Gradle or Docker configurations at all...

A possible fix may be to disable the Kotlin daemon with ENV GRADLE_OPTS -Dkotlin.compiler.execution.strategy="in-process" (in the Dockerfile), but I can't say yet, as the issue might still arise again later... it is nondeterministic :s

nielsvanvelzen commented 4 years ago

@elnygren As I needed the image to work quickly, I added a sleep command (for 10 seconds) and everything worked fine (hence my suspicion that it was a race condition). I haven't tried it yet with the fixed image.

axot commented 4 years ago

Today I got some strange errors. Any ideas?

Step #1: INFO[0268] Taking snapshot of full filesystem...        
Step #1: INFO[0268] Resolving paths                              
Step #1: INFO[0268] ENV PHP_INI_DIR /usr/local/etc/php           
Step #1: INFO[0268] COPY --from=deps /usr/local/include/ /usr/local/include/ 
Step #1: INFO[0268] Resolving paths                              
Step #1: INFO[0268] Taking snapshot of files...                  
Step #1: INFO[0269] COPY --from=deps /usr/local/share/dd-trace-php /usr/local/share/dd-trace-php 
Step #1: INFO[0269] Resolving paths                              
Step #1: INFO[0269] Taking snapshot of files...                  
Step #1: INFO[0269] COPY --from=deps /usr/local/lib/php/ /usr/local/lib/php/ 
Step #1: INFO[0269] Resolving paths                              
Step #1: INFO[0269] Taking snapshot of files...                  
Step #1: INFO[0271] COPY --from=deps /usr/local/bin /usr/local/bin 
Step #1: error building image: error building stage: failed to execute command: lstat /usr/local/bin/supercronic-linux-amd64: no such file or directory
Finished Step #1
ERROR
ERROR: build step 1 "gcr.io/kaniko-project/executor:latest" failed: step exited with non-zero status: 1

davidschrooten commented 4 years ago

Same problem here

ichaozai commented 4 years ago

This was fixed in v0.17.0, but it reproduces again in v0.18.0.

aca commented 4 years ago

Faced a similar issue with v0.24.0:

COPY . .

RUN gradle :applications:backoffice:bootJar --no-daemon
BUILD SUCCESSFUL in 3m 43s
44 actionable tasks: 44 executed
INFO[0282] Taking snapshot of full filesystem...
error building image: error building stage: failed to take snapshot: Failed to get file info for /root/.kotlin/daemon/kotlin-daemon.2020-07-14T05-44-03.523Z.5f6154794ffdd6c37d85778697d6da61.17180.run: lstat /root/.kotlin/daemon/kotlin-daemon.2020-07-14T05-44-03.523Z.5f6154794ffdd6c37d85778697d6da61.17180.run: no such file or directory

k commented 4 years ago

I'm getting this as well with Gradle and Kotlin.

k commented 4 years ago

This workaround worked for kotlin + gradle: https://github.com/GoogleContainerTools/kaniko/issues/769#issuecomment-596195219

tejal29 commented 4 years ago

Thanks @k for confirming the workaround.

beanaroo commented 3 years ago

Getting this error on executor:debug on GitLab CI:

INFO[0020] Taking snapshot of full filesystem...        
INFO[0027] RUN yum clean all                            
INFO[0027] cmd: /bin/sh                                 
INFO[0027] args: [-c yum clean all]                     
INFO[0027] Running: [/bin/sh -c yum clean all]          
Loaded plugins: ovl, priorities
Cleaning repos: amzn2-core
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
INFO[0027] Taking snapshot of full filesystem...        
error building image: error building stage: failed to get files used from context: failed to get fileinfo for /workspace/requirements.txt: lstat /workspace/requirements.txt: no such file or directory

Dockerfile to reproduce:

FROM amazonlinux:latest

# Installing Python3
RUN yum install -y python3

# Clean yum cache
RUN yum clean all

# copy requirements.txt to tmp folder
ADD requirements.txt /tmp

# copy script to system
ADD script.py /usr/bin/

# install script's dependencies
RUN pip3 install -r /tmp/requirements.txt -q

nezed commented 3 years ago

To fix both errors reported by @caseycs and @nielsvanvelzen:

error building image: error building stage: lstat /tmp/hsperfdata_root/6383: no such file or directory
error building image: error building stage: Failed to get file info for /root/.kotlin/daemon/kotlin-daemon.2019-10-29T08-34-55.719Z.7338a1b69672dd00d0aa900c1e9f04a7.17450.run: lstat /root/.kotlin/daemon/kotlin-daemon.2019-10-29T08-34-55.719Z.7338a1b69672dd00d0aa900c1e9f04a7.17450.run: no such file or directory

which are common for many Java/Kotlin projects, just add both of the following environment variables to your Dockerfile:

ENV GRADLE_OPTS -Dkotlin.compiler.execution.strategy="in-process"
ENV JAVA_OPTS -XX:-UsePerfData

nezed commented 3 years ago

It looks like the whole issue comes down to kaniko design problems. I can't understand why the issue is closed while it is still active and the described problem still reproduces.

A build tool that requires the whole filesystem outside the build context to be consistent looks awful to me. This tool will keep running into race conditions, because reality is different. This looks like a real design problem for kaniko.

A better alternative probably already exists; I hope you will find something that works for you: https://blog.alexellis.io/building-containers-without-docker/

nezed commented 3 years ago

Also, the [Experimental] New Run command implementation may help, but I don't know for sure: https://github.com/GoogleContainerTools/kaniko/releases/tag/v1.0.0

xi2817-aajgaonkar commented 2 years ago

args:
          - "--dockerfile=Dockerfile"
          - "--context=dir://./"

Worked for me

naXa777 commented 2 years ago

I'm experiencing this problem in Digital Ocean. Builds fail sporadically. Does it mean that Digital Ocean is using Kaniko to build my app from a Dockerfile?

HeneryHawk commented 1 year ago

I unfortunately also got this error for a Kotlin Gradle build in GitLab CI.

BUILD SUCCESSFUL in 1m 33s
43 actionable tasks: 39 executed, 4 up-to-date
INFO[0111] Taking snapshot of full filesystem...        
error building image: error building stage: failed to take snapshot: Failed to get file info for /root/.kotlin/daemon/kotlin-daemon.2022-11-09T10-27-49.821Z.db33284ca361ae0fea07ee7f3786b133.17811.run: lstat /root/.kotlin/daemon/kotlin-daemon.2022-11-09T10-27-49.821Z.db33284ca361ae0fea07ee7f3786b133.17811.run: no such file or directory

FROM gradle:7.4-jdk17-alpine AS GRADLE_TOOL_CHAIN

WORKDIR /home/gradle/src

COPY --chown=gradle:gradle . /home/gradle/src
RUN gradle --no-daemon --refresh-dependencies clean build

Any updates on this?

penn5 commented 1 year ago

Still encountering this fairly often with Kotlin. If kaniko fails to snapshot a file because it doesn't exist, it should just ignore the error: the file no longer exists, so it no longer needs snapshotting.

That said, my intuition is that the issue is caused by kaniko not waiting for all processes in the RUN to be fully stopped. It should probably ensure these are properly signalled to stop and then waited upon. A quick look at the source code suggests that the gradle daemon, kotlin daemon, etc. change their process group, meaning that https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/commands/run.go#L124C23-L135 doesn't kill them.

Update: yep, the gradle daemon changes its pgid. Its /proc/pid/stat reads:

26006 (java) S 4275 26006 26006 0 -1 4194304 126429 0 13 0 2378 72 0 0 20 0 53 0 2861752 5986787328 120556 18446744073709551615 94751120973824 94751120974905 140721562476352 0 0 0 4 0 16800975 0 0 0 17 2 0 0 0 0 0 94751120985408 94751120986128 94751136600064 140721562480465 140721562481353 140721562481353 140721562484665 0

This shows that the PGID is 26006, the same as the PID. So to fix this issue, kaniko should kill all children of the launched process, not just those that remain in the process group. I don't know how to accomplish that, though.
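To make the check above easier to repeat, here is a small shell sketch that extracts the process group ID from a /proc/<pid>/stat line. It relies on the layout documented in proc(5), where the fifth field is the process group; the function name `pgid_from_stat` is made up for this example, and nothing here is part of kaniko.

```shell
# Print the process group ID (5th field) of a /proc/<pid>/stat line.
# The command name in field 2 is parenthesised and may contain spaces,
# so everything up to the last ") " is stripped before splitting.
pgid_from_stat() {
    rest=${1##*) }   # now: state ppid pgrp session ...
    set -- $rest     # intentional word splitting on whitespace
    printf '%s\n' "$3"
}

# Example with the start of the line quoted above; prints 26006.
pgid_from_stat "26006 (java) S 4275 26006 26006 0 -1 4194304"
```

On a live system, `pgid_from_stat "$(cat /proc/$pid/stat)"` (with `$pid` set to the daemon's PID) can be compared with the process group of the shell that ran the build command. If the two differ, a signal sent to the shell's process group will not reach the daemon.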

patsevanton commented 11 months ago

@HeneryHawk Did you fix your error? I have the same error.

jrkessl commented 9 months ago

Issue closed, but the bug is still there.

INFO[2023-12-06T03:43:03Z] Taking snapshot of full filesystem...        
error building image: error building stage: failed to take snapshot: Failed to get file info for /tmp/ssh-XXXXXXHNLnGp: lstat /tmp/ssh-XXXXXXHNLnGp: no such file or directory

It is very intermittent. Sometimes just one retry is enough, and the next execution is successful. Today I had a case where we retried multiple times, and several hours later it eventually started working again. The mitigation factors that we applied were removed, but it kept succeeding, which shows we did not fix it, it just fixed itself after some hours.
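Since a plain retry is sometimes enough, a bounded retry wrapper is one possible stopgap. The sketch below is generic POSIX shell; `retry` is a hypothetical helper, and it only masks the intermittent failure rather than fixing it. Whether rerunning a build inside the same container is safe is not established in this thread, so the wrapped command should be whatever starts a fresh build in your setup.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
retry() {
    max=$1
    shift
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying" >&2
        attempt=$((attempt + 1))
    done
}
```

For instance, `retry 3 some-build-command` would run the placeholder `some-build-command` up to three times and return non-zero only if every attempt fails.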

sharifahmad2061 commented 8 months ago

Getting the same error with Gradle 8.3 and kaniko 1.18.0-debug. As others mentioned above, it is very intermittent. Changing kaniko to 1.19.0-debug didn't fix it either.

BUILD SUCCESSFUL in 1m 53s
9 actionable tasks: 7 executed, 2 from cache
INFO[0130] Taking snapshot of full filesystem...        
error building image: error building stage: failed to take snapshot: Failed to get file info for /root/.kotlin/daemon/kotlin-daemon.2024-01-04T04-16-46.495Z.7047e3547be7dde9af9d53c50c049451.17789.run: lstat /root/.kotlin/daemon/kotlin-daemon.2024-01-04T04-16-46.495Z.7047e3547be7dde9af9d53c50c049451.17789.run: no such file or directory

Before this change we had Gradle 7.6 and kaniko 1.9.1-debug, and it was working fine. Since the update to Gradle 8.3 and kaniko 1.18, it no longer works reliably.