Open abhi1git opened 4 years ago
Can you please elaborate on which filesystem the snapshot is being taken of while building the image, so that we can see if filesystem size is causing this issue? We are using kaniko to build images in GitLab CI/CD and the runner is deployed on Kubernetes using the Helm chart.
Previously this issue used to arise randomly, but now all of our kaniko image build jobs freeze on "Taking snapshot of full filesystem...".
@tejal29
@abhi1git can you try the newer snapshot mode --snapshotMode=redo?
@abhi1git please switch to using --snapshotMode=redo. See the comments here: https://github.com/GoogleContainerTools/kaniko/issues/1305#issuecomment-672752902
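For anyone unsure where the flag goes: a minimal GitLab CI sketch (image tag, variables and destination are illustrative placeholders, not taken from this issue) would look roughly like this:

build:
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    # redo mode hashes file metadata (mtime, size, ownership) instead of
    # full file contents, so the snapshot step is lighter than the default
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --snapshotMode=redo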
@abhi1git please switch to using --snapshotMode=redo. See comments here #1305 (comment)
I suffered from the same issue and --snapshotMode=redo did not resolve it. @abhi1git did you get it to work?
@Kiddinglife Can you provide your dockerfile or some stats on number of files in your repo?
I am experiencing this problem while building an image of less than a GB. Interestingly, it fails silently: the GitLab CI job is marked as successful but no image is actually pushed.
We are using kaniko for several other projects but this error only happens on two projects. Both are monorepos and use lerna for extending yarn commands to sub packages.
I must say it was working at some point, and it does work normally when using Docker to build the image.
Here is a snippet of the build logs:
INFO[0363] RUN yarn install --network-timeout 100000
INFO[0363] cmd: /bin/sh
INFO[0363] args: [-c yarn install --network-timeout 100000]
INFO[0363] util.Lookup returned: &{Uid:1000 Gid:1000 Username:node Name: HomeDir:/home/node}
INFO[0363] performing slow lookup of group ids for node
INFO[0363] Running: [/bin/sh -c yarn install --network-timeout 100000]
yarn install v1.22.5
info No lockfile found.
[1/4] Resolving packages...
INFO[0368] Pushed image to 1 destinations
... A bunch of yarn logs ...
[4/4] Building fresh packages...
success Saved lockfile.
$ lerna bootstrap
lerna notice cli v3.22.1
lerna info bootstrap root only
yarn install v1.22.5
[1/4] Resolving packages...
success Already up-to-date.
$ lerna bootstrap
lerna notice cli v3.22.1
lerna WARN bootstrap Skipping recursive execution
Done in 20.00s.
Done in 616.92s.
INFO[0982] Taking snapshot of full filesystem...
It is interesting to note that RUN yarn install --network-timeout 100000 is not the last step in the Dockerfile. Neither --snapshotMode=redo nor --use-new-run solved the problem.
Same issue; nothing changed except the version of kaniko.
I'm hitting the same problem. I tried --snapshotMode=redo, but it does not always help. What would help resolve the issue here? Would a reproducible Dockerfile plus the number of files help with debugging? I'm trying --use-new-run now.
Adding a data point: I initially observed the build process freezing when I did not set any memory/CPU requests or limits. Once I added memory/CPU requests and limits, the process started getting OOM-killed instead. I increased the memory limit to 6GB, but it still gets OOM-killed. Looking at the memory usage, it skyrockets at the end, when the log reaches "Taking snapshot of full filesystem...". EDIT: I tried building the same image with local Docker, and the maximum memory usage is less than 1GB.
logs
+ dockerfile=v2/container/driver/Dockerfile
+ context_uri=
+ context_artifact_path=/tmp/inputs/context_artifact/data
+ context_sub_path=
+ destination=gcr.io/kfp-ci/4674c4982ab8fcf476e610f372fc0e4a38686805/v2-sample-test/test/kfp-driver
+ digest_output_path=/tmp/outputs/digest/data
+ cache=true
+ cache_ttl=24h
+ context=
+ '[[' '!=' ]]
+ context=dir:///tmp/inputs/context_artifact/data
+ dirname /tmp/outputs/digest/data
+ mkdir -p /tmp/outputs/digest
+ /kaniko/executor --dockerfile v2/container/driver/Dockerfile --context dir:///tmp/inputs/context_artifact/data --destination gcr.io/kfp-ci/4674c4982ab8fcf476e610f372fc0e4a38686805/v2-sample-test/test/kfp-driver --snapshotMode redo --image-name-with-digest-file /tmp/outputs/digest/data '--cache=true' '--cache-ttl=24h'
E0730 12:20:40.314406 21 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO[0000] Resolved base name golang:1.15-alpine to builder
INFO[0000] Using dockerignore file: /tmp/inputs/context_artifact/data/.dockerignore
INFO[0000] Retrieving image manifest golang:1.15-alpine
INFO[0000] Retrieving image golang:1.15-alpine from registry index.docker.io
E0730 12:20:40.518068 21 metadata.go:166] while reading 'google-dockercfg-url' metadata: http status code: 404 while fetching url
http://metadata.google.internal./computeMetadata/v1/instance/attributes/google-dockercfg-url
INFO[0001] Retrieving image manifest golang:1.15-alpine
INFO[0001] Returning cached image manifest
INFO[0001] No base image, nothing to extract
INFO[0001] Built cross stage deps: map[0:[/build/v2/build/driver]]
INFO[0001] Retrieving image manifest golang:1.15-alpine
INFO[0001] Returning cached image manifest
INFO[0001] Retrieving image manifest golang:1.15-alpine
INFO[0001] Returning cached image manifest
INFO[0001] Executing 0 build triggers
INFO[0001] Checking for cached layer gcr.io/kfp-ci/4674c4982ab8fcf476e610f372fc0e4a38686805/v2-sample-test/test/kfp-driver/cache:9164be18ba887abd9388518d533d79a6e2fda9f81f33e57e0c71319d7a6da78e...
INFO[0001] No cached layer found for cmd RUN apk add --no-cache make bash
INFO[0001] Unpacking rootfs as cmd RUN apk add --no-cache make bash requires it.
INFO[0009] RUN apk add --no-cache make bash
INFO[0009] Taking snapshot of full filesystem...
INFO[0016] cmd: /bin/sh
INFO[0016] args: [-c apk add --no-cache make bash]
INFO[0016] Running: [/bin/sh -c apk add --no-cache make bash]
fetch
https://dl-cdn.alpinelinux.org/alpine/v3.14/main/x86_64/APKINDEX.tar.gz
fetch
https://dl-cdn.alpinelinux.org/alpine/v3.14/community/x86_64/APKINDEX.tar.gz
(1/5) Installing ncurses-terminfo-base (6.2_p20210612-r0)
(2/5) Installing ncurses-libs (6.2_p20210612-r0)
(3/5) Installing readline (8.1.0-r0)
(4/5) Installing bash (5.1.4-r0)
Executing bash-5.1.4-r0.post-install
(5/5) Installing make (4.3-r0)
Executing busybox-1.33.1-r2.trigger
OK: 9 MiB in 20 packages
INFO[0016] Taking snapshot of full filesystem...
INFO[0017] Pushing layer gcr.io/kfp-ci/4674c4982ab8fcf476e610f372fc0e4a38686805/v2-sample-test/test/kfp-driver/cache:9164be18ba887abd9388518d533d79a6e2fda9f81f33e57e0c71319d7a6da78e to cache now
INFO[0017] WORKDIR /build
INFO[0017] cmd: workdir
INFO[0017] Changed working directory to /build
INFO[0017] Creating directory /build
INFO[0017] Taking snapshot of files...
INFO[0017] Pushing image to gcr.io/kfp-ci/4674c4982ab8fcf476e610f372fc0e4a38686805/v2-sample-test/test/kfp-driver/cache:9164be18ba887abd9388518d533d79a6e2fda9f81f33e57e0c71319d7a6da78e
INFO[0017] COPY api/go.mod api/go.sum api/
INFO[0017] Taking snapshot of files...
INFO[0017] COPY v2/go.mod v2/go.sum v2/
INFO[0017] Taking snapshot of files...
INFO[0017] RUN cd v2 && go mod download
INFO[0017] cmd: /bin/sh
INFO[0017] args: [-c cd v2 && go mod download]
INFO[0017] Running: [/bin/sh -c cd v2 && go mod download]
INFO[0018] Pushed image to 1 destinations
INFO[0140] Taking snapshot of full filesystem...
Killed
Version: gcr.io/kaniko-project/executor:v1.6.0-debug. Args: I added snapshotMode=redo and cache=true. Env: GKE 1.19, using Kubeflow Pipelines to run the kaniko containers.
I guess the root cause is actually insufficient memory, and when we do not allocate enough memory it freezes on "Taking snapshot of full filesystem..." as a symptom.
Edit: my guess was wrong. I reverted to kaniko:1.3.0-debug and added sufficient memory requests & limits, but I'm still observing the image build freezing problem from time to time.
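For context, the memory/CPU requests and limits mentioned above go on the kaniko container spec; a rough sketch with illustrative values only (the 6Gi limit is the one reported above as still being OOM-killed):

containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:v1.6.0-debug
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "2"
        memory: 6Gi   # still OOM-killed at this limit in the report above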
Hi @abhi1git, did you find a solution for your issue? I am facing the same.
The issue is still present for me too. Any updates?
Same issue here. The system has enough memory (not hitting any memory limits), --snapshotMode=redo and --use-new-run do not change the behavior at all, and I do not see any problems in the output when using trace verbosity. I am currently using 1.17.0-debug.
Hi @abhi1git, did you find a solution for your issue? I am facing the same.
For us, after investigation, we found that the WAF in front of our GitLab was blocking the requests. After whitelisting it, everything is working fine.
Still an issue, can you reopen @tejal29? Building an image like this shouldn't be OOMKilling or using GBs of RAM; it seems like a clear-cut bug to me.
Hi @abhi1git, did you find a solution for your issue? I am facing the same.
For us, after investigation, we found that the WAF in front of our GitLab was blocking the requests. After whitelisting it, everything is working fine.
What kind of whitelisting was required for this? Can you help me clarify how to set it up?
Hi @abhi1git, did you find a solution for your issue? I am facing the same.
For us, after investigation, we found that the WAF in front of our GitLab was blocking the requests. After whitelisting it, everything is working fine.
What kind of whitelisting was required for this? Can you help me clarify how to set it up?
If you have a WAF in front of GitLab, it would be good to check your logs first and confirm what kind of requests are being blocked.
Anyone tried with version 1.7.0?
Anyone tried with version 1.7.0?
v1.7.0 is about 4 months old, and had some showstopper auth issues, and :latest currently points to :v1.6.0, so I would guess that not many folks are using :v1.7.0
Instead, while we wait for v1.8.0 (#1871) you can try a commit-tagged image, the latest of which is currently :09e70e44d9e9a3fecfcf70cb809a654445837631
Thanks @imjasonh I'm going to try gcr.io/kaniko-project/executor:09e70e44d9e9a3fecfcf70cb809a654445837631-debug
I've tried gcr.io/kaniko-project/executor:09e70e44d9e9a3fecfcf70cb809a654445837631-debug with --snapshotMode=redo --use-new-run, and my pipeline is still stuck at INFO[0009] Taking snapshot of full filesystem...
Guess the only solution is waiting for another commit-tagged image or 1.8.0 to be released.
Guess the only solution is waiting for another commit-tagged image or 1.8.0 to be released
It sounds like whatever bug is causing that is still present, so it won't be fixed by releasing the latest image as v1.8.0. We just need someone to figure out why it gets stuck and fix it.
Unfortunately Kaniko is not really actively staffed at the moment, so it's probably going to fall to you or me or some other kind soul reading this to investigate and get us back on the track to solving this. Any takers?
It sounds like whatever bug is causing that is still present, so it won't be fixed by releasing the latest image as v1.8.0. We just need someone to figure out why it gets stuck and fix it.
Hold on a second, maybe I spoke too early!
My pipeline currently builds multiple images in parallel.
I hadn't realized that one of the images that previously got stuck taking a snapshot now goes through smoothly with --snapshotMode=redo --use-new-run and gcr.io/kaniko-project/executor:09e70e44d9e9a3fecfcf70cb809a654445837631-debug.
The images that actually get stuck are basically the same Postgres image built with different build-arg values, so this ends up building (and caching) the same layers in parallel.
I consequently removed this parallelism and tried to build these Postgres images in sequence. I then ended up with Postgres images stuck taking snapshots in parallel with a totally different NodeJS image, which was also stuck taking snapshots.
So from my tests it looks like, when images are built in parallel against the same registry mirror used as cache, an image that takes a snapshot at the same time as another gets stuck.
It may be a coincidence, maybe not. I repeat: this is from my tests, and it could be totally unrelated to the problem.
same issue:
containers:
- args:
- --dockerfile=/workspace/Dockerfile
- --context=dir:///workspace/
- --destination=xxxx/xxx/xxx:1.0.0
- --skip-tls-verify
- --verbosity=debug
- --build-arg="http_proxy='http://xxxx'"
- --build-arg="https_proxy='http://xxxx'"
- --build-arg="HTTP_PROXY='http://xxxx'"
- --build-arg="HTTPS_PROXY='http://xxxx'"
image: gcr.io/kaniko-project/executor:v1.7.0
imagePullPolicy: IfNotPresent
name: kaniko
volumeMounts:
- mountPath: /kaniko/.docker
name: secret
- mountPath: /workspace
name: code
Here are some logs, maybe useful:
......
DEBU[0021] Whiting out /usr/share/doc/linux-libc-dev/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/share/doc/make/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/share/doc/pkg-config/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/share/gdb/auto-load/lib/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/share/glib-2.0/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/share/perl5/Dpkg/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/share/pkgconfig/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0021] Whiting out /usr/local/go/.wh..wh..opq
DEBU[0021] not including whiteout files
DEBU[0030] Whiting out /go/.wh..wh..opq
DEBU[0030] not including whiteout files
INFO[0030] ENV GOPRIVATE "gitee.com/dmcca/*"
DEBU[0030] build: skipping snapshot for [ENV GOPRIVATE "gitee.com/dmcca/*"]
INFO[0030] ENV GOPROXY "https://goproxy.cn,direct"
DEBU[0030] build: skipping snapshot for [ENV GOPROXY "https://goproxy.cn,direct"]
DEBU[0030] Resolved ./.netrc to .netrc
DEBU[0030] Resolved /root/.netrc to /root/.netrc
DEBU[0030] Getting files and contents at root /workspace/ for /workspace/.netrc
DEBU[0030] Using files from context: [/workspace/.netrc]
INFO[0030] COPY ./.netrc /root/.netrc
DEBU[0030] Resolved ./.netrc to .netrc
DEBU[0030] Resolved /root/.netrc to /root/.netrc
DEBU[0030] Getting files and contents at root /workspace/ for /workspace/.netrc
DEBU[0030] Copying file /workspace/.netrc to /root/.netrc
INFO[0030] Taking snapshot of files...
DEBU[0030] Taking snapshot of files [/root/.netrc / /root]
INFO[0030] RUN chmod 600 /root/.netrc
INFO[0030] Taking snapshot of full filesystem...
Same issue here:
INFO[0163] Taking snapshot of full filesystem...
fatal error: runtime: out of memory
runtime stack:
runtime.throw({0x12f3614, 0x16})
/usr/local/go/src/runtime/panic.go:1198 +0x54
runtime.sysMap(0x4041c00000, 0x20000000, 0x220fdd0)
/usr/local/go/src/runtime/mem_linux.go:169 +0xbc
<...>
github.com/google/go-containerregistry/pkg/v1/tarball.WithCompressedCaching.func1()
/src/vendor/github.com/google/go-containerregistry/pkg/v1/tarball/layer.go:119 +0x6c fp=0x40005d3b10 sp=0x40005d3a80 pc=0xa6134c
github.com/google/go-containerregistry/pkg/v1/tarball.computeDigest(0x40008a5d70)
/src/vendor/github.com/google/go-containerregistry/pkg/v1/tarball/layer.go:278 +0x44 fp=0x40005d3b80 sp=0x40005d3b10 pc=0xa624e4
github.com/google/go-containerregistry/pkg/v1/tarball.LayerFromOpener(0x400000d2c0, {0x40005d3cf8, 0x1, 0x1})
/src/vendor/github.com/google/go-containerregistry/pkg/v1/tarball/layer.go:247 +0x3f4 fp=0x40005d3c20 sp=0x40005d3b80 pc=0xa62174
github.com/google/go-containerregistry/pkg/v1/tarball.LayerFromFile({0x4000a22018, 0x12}, {0x40005d3cf8, 0x1, 0x1})
/src/vendor/github.com/google/go-containerregistry/pkg/v1/tarball/layer.go:188 +0x8c fp=0x40005d3c70 sp=0x40005d3c20 pc=0xa61cbc
github.com/GoogleContainerTools/kaniko/pkg/executor.pushLayerToCache(0x21d93a0, {0x40008b75c0, 0x40}, {0x4000a22018, 0x12}, {0x400016d940, 0x3a})
/src/pkg/executor/push.go:295 +0x68 fp=0x40005d3ee0 sp=0x40005d3c70 pc=0xf1d4a8
github.com/GoogleContainerTools/kaniko/pkg/executor.(*stageBuilder).build.func3()
/src/pkg/executor/build.go:425 +0xa4 fp=0x40005d3f60 sp=0x40005d3ee0 pc=0xf16474
<...>
compress/gzip.(*Writer).Write(0x40006780b0, {0x40014f6000, 0x8000, 0x8000})
/usr/local/go/src/compress/gzip/gzip.go:196 +0x388
io.copyBuffer({0x1678960, 0x40006780b0}, {0x167dfe0, 0x40006163e8}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:425 +0x224
io.Copy(...)
/usr/local/go/src/io/io.go:382
github.com/google/go-containerregistry/internal/gzip.ReadCloserLevel.func1(0x400064be80, 0x1, 0x40006163f8, {0x16902e0, 0x40006163e8})
/src/vendor/github.com/google/go-containerregistry/internal/gzip/zip.go:60 +0xb4
created by github.com/google/go-containerregistry/internal/gzip.ReadCloserLevel
/src/vendor/github.com/google/go-containerregistry/internal/gzip/zip.go:52 +0x230
Docker works fine (yet requires privileged mode).
The stack traces I pasted are from 1.8.0.
@max-au yours looks like a different problem though
INFO[0163] Taking snapshot of full filesystem... fatal error: runtime: out of memory
This is an out-of-memory error, while the problem reported here is that the build just freezes and doesn't show any error or progress.
Maybe setting this to false will help:
https://github.com/GoogleContainerTools/kaniko#--compressed-caching
We could fix the GitLab CI/CD pipeline error "Taking snapshot of full filesystem.... Killed" with --compressed-caching=false and v1.8.0-debug. The image is around 2 GB; Alpine reported around 4 GB across roughly 100 packages.
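For anyone wondering where the flag goes, a sketch of a typical invocation with it added (paths and destination are placeholders; per the kaniko README, --compressed-caching=false trades higher disk usage for lower memory usage when caching layers):

/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" \
  --cache=true \
  --compressed-caching=false   # skip tar compression of cached layers to reduce memory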
Had the same issue when running on a small demo environment.
kubectl top pods showed 6633Mi of memory consumption.
The issue went away after running the build on a "real" cluster. I did not fiddle with compression parameters, but I do use caching.
I'm just curious why it failed with exit code 1 instead of showing the usual OOMKilled. That makes it really hard to find the root cause.
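If anyone else wants to confirm whether the pod was actually OOM-killed rather than kaniko exiting on its own, something like the following against the build/runner pod should show it (the pod name is a placeholder):

# reason of the last container termination, e.g. OOMKilled or Error
kubectl get pod <runner-pod> -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'

# or look for OOMKilled / exit code 137 in the container state details
kubectl describe pod <runner-pod> | grep -iA5 'last state'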
We could fix the GitLab CI/CD pipeline error "Taking snapshot of full filesystem.... Killed" with --compressed-caching=false and v1.8.0-debug. The image is around 2 GB; Alpine reported around 4 GB across roughly 100 packages.
Thanks @baslr, this worked for me.
We started to have this problem in the last few days within our GitLab CI. The workarounds did not work for us. After discovering the version tag syntax in the GitLab documentation (https://docs.gitlab.com/ee/ci/docker/using_kaniko.html) we switched to gcr.io/kaniko-project/executor:v1.8.0-debug and the problem effectively disappeared.
We started to have this problem in the last few days within our GitLab CI. The workarounds did not work for us. After discovering the version tag syntax in the GitLab documentation (https://docs.gitlab.com/ee/ci/docker/using_kaniko.html) we switched to gcr.io/kaniko-project/executor:v1.8.0-debug and the problem effectively disappeared.
Seems I was too fast. The problem persists (at least in some jobs).
We started to have this problem in the last few days within our GitLab CI. The workarounds did not work for us. After discovering the version tag syntax in the GitLab documentation (https://docs.gitlab.com/ee/ci/docker/using_kaniko.html) we switched to gcr.io/kaniko-project/executor:v1.8.0-debug and the problem effectively disappeared. Seems I was too fast. The problem persists (at least in some jobs).
Same problem: INFO[0170] Taking snapshot of full filesystem... ERROR: Job failed: pod "runner-xxxxxxx" status is "Failed"
But the problem disappears when I drop some of the poetry packages I need; when I add them again, the problem comes back. I tried deleting different poetry packages, and there is no particular dependency that makes it crash. If you manage to solve it, please let us know. Thanks.
We started to have this problem in the last few days within our GitLab CI. The workarounds did not work for us. After discovering the version tag syntax in the GitLab documentation (https://docs.gitlab.com/ee/ci/docker/using_kaniko.html) we switched to gcr.io/kaniko-project/executor:v1.8.0-debug and the problem effectively disappeared. Seems I was too fast. The problem persists (at least in some jobs).
@stranljip @max-au @irizzant @chenlein @pY4x3g
SOLVED PROBLEM!!!! This error is caused by the CI runner, because it doesn't have sufficient space (on the runner pod) to save its caches and other files while building the image. The space can be adjusted with the ephemeral storage parameter. You can read more about it here: https://docs.openshift.com/container-platform/4.7/storage/understanding-ephemeral-storage.html
I just increased it from 4GB to 6GB and all issues were gone. All pipelines succeeded!
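In plain Kubernetes terms this corresponds to the ephemeral-storage resource on the build pod; a minimal sketch with the sizes mentioned above (where exactly you set this depends on your runner/OpenShift setup):

resources:
  requests:
    ephemeral-storage: 4Gi
  limits:
    ephemeral-storage: 6Gi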
I am having the same problem when using kaniko in GitLab CI.
The workarounds didn't work; it times out (stuck at INFO[0744] Taking snapshot of full filesystem...).
This is the stage/job I'm using. As you can see, it uses version 1.8.0-debug and the --compressed-caching=false flag.
The .gitlab-ci.yml
stages:
- delivery
container_registry:
stage: delivery
image:
name: gcr.io/kaniko-project/executor:v1.8.0-debug
entrypoint: [""]
before_script:
- IMAGE_TAG=$CI_COMMIT_SHORT_SHA
- |-
cat << EOF > $CI_PROJECT_DIR/.dockerignore
bin/
obj/
EOF
- cat $CI_PROJECT_DIR/.dockerignore
- |-
cat << EOF > /kaniko/.docker/config.json
{
"auths": {
"$CI_REGISTRY": {
"username": "$CI_REGISTRY_USER",
"password": "$CI_REGISTRY_PASSWORD"
}
}
}
EOF
- cat /kaniko/.docker/config.json
script:
- /kaniko/executor
--context $CI_PROJECT_DIR
--dockerfile $CI_PROJECT_DIR/Dockerfile
--destination $CI_REGISTRY_IMAGE:latest
--destination $CI_REGISTRY_IMAGE:$IMAGE_TAG
--compressed-caching=false
--verbosity=debug
rules:
- if: $CI_COMMIT_TAG
when: never
- when: on_success
The specific Dockerfile I'm testing with is a large one (I don't know how to make it smaller!): it has an Ubuntu base OS with the .NET SDK, Android SDK, and JDK, plus some tools (to help me build .NET MAUI apps targeting Android).
ARG REPO=mcr.microsoft.com/dotnet/aspnet
FROM $REPO:7.0.1-jammy-amd64 AS platform
ENV \
# Unset ASPNETCORE_URLS from aspnet base image
ASPNETCORE_URLS= \
# Do not generate certificate
DOTNET_GENERATE_ASPNET_CERTIFICATE=false \
# Do not show first run text
DOTNET_NOLOGO=true \
# SDK version
DOTNET_SDK_VERSION=7.0.101 \
# Enable correct mode for dotnet watch (only mode supported in a container)
DOTNET_USE_POLLING_FILE_WATCHER=true \
# Skip extraction of XML docs - generally not useful within an image/container - helps performance
NUGET_XMLDOC_MODE=skip \
# PowerShell telemetry for docker image usage
POWERSHELL_DISTRIBUTION_CHANNEL=PSDocker-DotnetSDK-Ubuntu-22.04
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
curl \
git \
wget \
&& rm -rf /var/lib/apt/lists/*
# Install .NET SDK
RUN curl -fSL --output dotnet.tar.gz https://dotnetcli.azureedge.net/dotnet/Sdk/$DOTNET_SDK_VERSION/dotnet-sdk-$DOTNET_SDK_VERSION-linux-x64.tar.gz \
&& dotnet_sha512='cf289ad0e661c38dcda7f415b3078a224e8347528448429d62c0f354ee951f4e7bef9cceaf3db02fb52b5dd7be987b7a4327ca33fb9239b667dc1c41c678095c' \
&& echo "$dotnet_sha512 dotnet.tar.gz" | sha512sum -c - \
&& mkdir -p /usr/share/dotnet \
&& tar -oxzf dotnet.tar.gz -C /usr/share/dotnet ./packs ./sdk ./sdk-manifests ./templates ./LICENSE.txt ./ThirdPartyNotices.txt \
&& rm dotnet.tar.gz \
# Trigger first run experience by running arbitrary cmd
&& dotnet help
# Install PowerShell global tool
RUN powershell_version=7.3.0 \
&& curl -fSL --output PowerShell.Linux.x64.$powershell_version.nupkg https://pwshtool.blob.core.windows.net/tool/$powershell_version/PowerShell.Linux.x64.$powershell_version.nupkg \
&& powershell_sha512='c4a72142e2bfae0c2a64a662f1baa27940f1db8a09384c90843163e339581d8d41824145fb9f79c680f9b7906043365e870d48d751ab8809c15a590f47562ae6' \
&& echo "$powershell_sha512 PowerShell.Linux.x64.$powershell_version.nupkg" | sha512sum -c - \
&& mkdir -p /usr/share/powershell \
&& dotnet tool install --add-source / --tool-path /usr/share/powershell --version $powershell_version PowerShell.Linux.x64 \
&& dotnet nuget locals all --clear \
&& rm PowerShell.Linux.x64.$powershell_version.nupkg \
&& ln -s /usr/share/powershell/pwsh /usr/bin/pwsh \
&& chmod 755 /usr/share/powershell/pwsh \
# To reduce image size, remove the copy nupkg that nuget keeps.
&& find /usr/share/powershell -print | grep -i '.*[.]nupkg$' | xargs rm
# JAVA
RUN apt-get update && \
apt-get install -y openjdk-11-jdk && \
rm -rf /var/lib/apt/lists/*
ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
# Install workload maui
RUN dotnet workload install maui-android --ignore-failed-sources
# Utils
RUN apt-get update && apt-get install -y \
unzip \
jq \
bzip2 \
libzip4 \
libzip-dev && \
rm -rf /var/lib/apt/lists/*
# Install Android SDK
RUN mkdir -p /usr/lib/android-sdk/cmdline-tools/latest && \
curl -k "https://dl.google.com/android/repository/commandlinetools-linux-9123335_latest.zip" -o commandlinetools-linux.zip && \
unzip -q commandlinetools-linux.zip -d /usr/lib/android-sdk/tmp && \
mv /usr/lib/android-sdk/tmp/cmdline-tools/* /usr/lib/android-sdk/cmdline-tools/latest && \
rm -rf /usr/lib/android-sdk/tmp/ && \
rm commandlinetools-linux.zip
ENV ANDROID_SDK_ROOT=/usr/lib/android-sdk
ENV PATH=$ANDROID_SDK_ROOT/cmdline-tools/latest/bin:$PATH
RUN yes | sdkmanager --licenses && \
sdkmanager "platform-tools" && \
sdkmanager "ndk-bundle" && \
sdkmanager "build-tools;33.0.0" "platforms;android-33"
Is there anything else I could do?
It eventually fails (after 33 minutes or so) with a space-related error:
INFO[0744] Taking snapshot of full filesystem...
error building image: error building stage: failed to take snapshot: write /kaniko/323538799: no space left on device
Cleaning up project directory and file based variables 00:01
ERROR: Job failed: exit code 1
@diego-roundev, the 60 min timeout is the default time GitLab waits for a job execution; you don't need to play with that. If you have access to the GitLab runner settings, go and increase the available memory. To find out by how much, build your image locally with Docker and check the image size. Make sure the GitLab runner can use more MB/GB than your image size.
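A quick way to get that ballpark locally, assuming Docker is available (the image tag is a placeholder):

# build the same Dockerfile locally and check the resulting image size
docker build -t snapshot-size-probe .
docker image ls snapshot-size-probe --format '{{.Size}}'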
I have the same issue on Gitlab CI/CD but only when cache is set to true
I have this problem too in Gitlab CI/CD
Hello everyone! I found a solution here: https://stackoverflow.com/questions/67748472/can-kaniko-take-snapshots-by-each-stage-not-each-run-or-copy-operation — add the --single-snapshot option to kaniko:
/kaniko/executor --context "${CI_PROJECT_DIR}" --dockerfile "${CI_PROJECT_DIR}/Dockerfile" --destination "${YC_CI_REGISTRY}/${YC_CI_REGISTRY_ID}/${CI_PROJECT_PATH}:${CI_COMMIT_SHA}" --single-snapshot
I have this problem too in Gitlab CI/CD
Same for me too
If it doesn't work, you may try adding --use-new-run and --snapshot-mode=redo. All flags: https://github.com/GoogleContainerTools/kaniko/blob/main/README.md# For me it is working!
- mkdir -p /kaniko/.docker
- echo "{\"auths\":{\"${YC_CI_REGISTRY}\":{\"auth\":\"$(printf "%s:%s" "${YC_CI_REGISTRY_USER}" "${YC_CI_REGISTRY_PASSWORD}" | base64 | tr -d '\n')\"}}}" > /kaniko/.docker/config.json
- >-
/kaniko/executor
--context "${CI_PROJECT_DIR}"
--use-new-run
--snapshot-mode=redo
--dockerfile "${CI_PROJECT_DIR}/Dockerfile"
--destination "${YC_CI_REGISTRY}/${YC_CI_REGISTRY_ID}/${CI_PROJECT_PATH}:${CI_COMMIT_REF_SLUG}-${CI_COMMIT_SHA}"
I have the same issue. Is it a disk size issue?
We could fix the GitLab CI/CD pipeline error "Taking snapshot of full filesystem.... Killed" with --compressed-caching=false and v1.8.0-debug. The image is around 2 GB; Alpine reported around 4 GB across roughly 100 packages.
I see this answer getting lost in this thread, but it fixed the issue for me. Just pass this flag to kaniko: --compressed-caching=false
--compressed-caching=false is not available in the Skaffold schema for kaniko, so I am trying to understand the root cause of this issue.
Same here.
Same here.
Try adding the --single-snapshot flag.
I had this problem when trying to install Terraform in an Alpine Linux image following the recommendations from this page: https://www.hashicorp.com/blog/installing-hashicorp-tools-in-alpine-linux-containers
However, the apk del .deps command in the very last line triggered the issue. Presumably this changes a lot of files?
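For reference, the pattern from that HashiCorp post looks roughly like this (a trimmed sketch, not the exact Dockerfile): the build dependencies installed into the .deps virtual package are removed again at the end of the same RUN, so the following full-filesystem snapshot has many changed files to walk.

FROM alpine:3.14

# Build-time deps go into a named virtual package and are removed again
# in the same RUN; the final `apk del .deps` deletes many files, which the
# next "Taking snapshot of full filesystem..." step then has to account for.
RUN apk add --no-cache --virtual .deps gnupg curl unzip && \
    echo "download, verify and install the tool here" && \
    apk del .deps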
Actual behavior: While building an image using gcr.io/kaniko-project/executor:debug in a GitLab CI runner hosted on Kubernetes using the Helm chart, the image build process freezes on "Taking snapshot of full filesystem..." until the runner times out (1 hr). This behaviour is intermittent: for the same project, the image build stage sometimes works.
The issue arises with multistage as well as single-stage Dockerfiles.
Expected behavior: The image build should not freeze at "Taking snapshot of full filesystem..." and should succeed every time.
To Reproduce: As the behaviour is intermittent, I am not sure how it can be reproduced.
--cache flag @tejal29