GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Gitlab CI failed due to Taking snapshot of the full system for more than 1 hour #1516

Open Kiddinglife opened 3 years ago

Kiddinglife commented 3 years ago

Actual behavior While building an image using gcr.io/kaniko-project/executor:debug in a GitLab CI runner hosted on Kubernetes (deployed via Helm chart), the build freezes on Taking snapshot of full filesystem... until the runner times out (1 hr). The behaviour is intermittent: for the same project, the image build stage sometimes succeeds.

The issue arises with multistage as well as single-stage Dockerfiles.

Expected behavior The image build should not freeze at Taking snapshot of full filesystem... and should succeed every time.

To Reproduce As the behaviour is intermittent, I'm not sure how it can be reproduced.

stage: build_xxx
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  variables:
    GIT_SUBMODULE_STRATEGY: none
  script:
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    - image_dest="${CI_REGISTRY}/${CI_xxx_IMAGE_NAME}:${CI_COMMIT_SHA}"
    - /kaniko/executor --context $CI_PROJECT_DIR/lobby--dockerfile $CI_PROJECT_DIR/xxx/Dockerfile --destination $image_dest --cache=true --snapshotMode=redo--verbosity=debug
  except:
    - tags
tejal29 commented 3 years ago

Sorry about the flakiness. Can you run kaniko with trace logs? I will also add a PR to limit the snapshotting to a reasonable time in the next release.

Cameronsplaze commented 3 years ago

In the /kaniko/executor line, you're missing a space before --dockerfile (and likewise before --verbosity). I'd be surprised if that caused this particular error, but it would stop kaniko from picking up the right context and loading the Dockerfile. Just in case though :)
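For reference, a sketch of the original job's invocation with the missing spaces restored (the lobby/xxx placeholders are kept from the original job and are not real paths):

```shell
/kaniko/executor \
  --context "$CI_PROJECT_DIR/lobby" \
  --dockerfile "$CI_PROJECT_DIR/xxx/Dockerfile" \
  --destination "$image_dest" \
  --cache=true \
  --snapshotMode=redo \
  --verbosity=debug
```

Quoting the variables also avoids word-splitting if any of them ever contain spaces.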

bmalynovytch commented 2 years ago

Same error using Kaniko with GitLab CI and the following command:

/kaniko/executor \
  --cache=true \
  --cache-ttl=168h \
  --context $CI_PROJECT_DIR \
  --build-arg GIT_COMMIT_SHORT_SHA=$CI_COMMIT_SHORT_SHA \
  --build-arg REPO_AUTH_JSON=$repo_auth_json \
  --dockerfile $CI_PROJECT_DIR/docker-k8s/Dockerfile \
  --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA

Tried with images executor:v1.6.0-debug and executor:v1.8.0-debug without success. It used to work like a charm with executor:debug, but not anymore.

bmalynovytch commented 2 years ago

It seems that removing --cache and --cache-ttl somehow did the trick 🤨
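For anyone who wants to keep layer caching rather than drop it entirely, kaniko also accepts a --cache-repo flag that pushes cache layers to a separate repository. A hedged sketch, assuming a cache location at $CI_REGISTRY_IMAGE/cache (an assumption, adjust for your registry); whether this avoids the hang is not confirmed in this thread:

```shell
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/docker-k8s/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" \
  --cache=true \
  --cache-ttl=168h \
  --cache-repo "$CI_REGISTRY_IMAGE/cache"  # assumed cache repo path
```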

roelzkie15 commented 1 year ago

It seems that removing --cache and --cache-ttl somehow did the trick 🤨

It works for me after removing the --cache flag and using v1.9.0-debug.

cforce commented 1 year ago

Same issue here with 1.9.0-debug and 1.9.1-debug - but I don't even have any "--cache*" params, and it still does not work:

[wget download progress output trimmed]
2022-11-02 11:50:24 (66.9 MB/s) - ‘google-chrome-stable_current_amd64.deb’ saved [92931088/92931088]
Selecting previously unselected package google-chrome-stable.
(Reading database ... 95372 files and directories currently installed.)
Preparing to unpack google-chrome-stable_current_amd64.deb ...
Unpacking google-chrome-stable (107.0.5304.87-1) ...
Setting up google-chrome-stable (107.0.5304.87-1) ...
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/gnome-www-browser (gnome-www-browser) in auto mode
update-alternatives: using /usr/bin/google-chrome-stable to provide /usr/bin/google-chrome (google-chrome) in auto mode
Processing triggers for mime-support (3.64ubuntu1) ...
INFO[1212] Taking snapshot of full filesystem...
Cleaning up project directory and file based variables 00:00
ERROR: Job failed: pod "runner-1m9jdyko-project-36366634-concurrent-0gknmx" status is "Failed"
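For readers hitting the same hang, kaniko exposes a few flags aimed at making snapshotting cheaper; a hedged sketch combining them (whether any of them helps depends on the workload, and --use-new-run is documented as experimental):

```shell
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" \
  --snapshotMode=redo \          # compare mtimes instead of hashing full file contents
  --use-new-run \                # experimental: track filesystem changes at run time instead of full snapshots
  --compressed-caching=false     # skip in-memory compression of cache layers to reduce memory pressure
```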