slamer59 opened this issue 2 years ago
Maybe the memory limit is the main problem. I got this before too~
I'm having the same issue: the pipeline is constantly failing. I'm not sure, but I think the snapshot is taken in memory, and when running on Kubernetes that doesn't seem like the best idea, since memory is a limited resource, especially in this context.
In my case, I need to install some packages, and they amount to ~1.1GB on top of the already existing data (Alpine-based image). But considering several Node applications, 1GB unfortunately is not a lot for the final image.
Running on GKE with Autopilot and the GitLab Helm chart, memory is currently capped at 2Gi, which should be plenty for everything except this pipeline. Is it possible to disable the snapshot, or to make it use disk instead?
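For what it's worth, kaniko does not seem to have a flag that disables snapshotting outright, but it does expose snapshot-related flags that can lower memory pressure. A minimal sketch of a GitLab CI job passing them; the job name, image tag, and destination are placeholders, not values from this thread:

```yaml
docker-build:
  stage: docker-build
  image:
    name: gcr.io/kaniko-project/executor:v1.9.1-debug
    entrypoint: [""]
  script:
    # --snapshot-mode=redo compares file metadata (mtime, size, etc.)
    # instead of hashing full file contents.
    # --single-snapshot takes one snapshot at the end of the build
    # instead of one per layer.
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --snapshot-mode=redo
      --single-snapshot
```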
This seems related to: https://github.com/GoogleContainerTools/kaniko/issues/909
We're having this issue as well with 1.9.1-debug. The final size of the image should be ~9GB, but the kaniko build (on GKE) fails due to the memory limit. See the attached image to share in my agony.
I am having the same issue repeatedly (running on GitLab CI pipelines with EKS, memory limits in place). The thing is, no matter how much memory I give the job, it uses everything it gets.
Here are some screenshots of the same job with different reservations/limits:
At least the last one did not fail, but the other two failed while taking snapshots. The resulting container image size is approximately 280MB.
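In case it helps anyone reproduce the comparison above: with the GitLab Kubernetes executor, the reservation and limit can be varied per job through overwrite variables. This is a sketch with example values, and it only takes effect if the runner config permits the overwrite:

```yaml
docker-build:
  variables:
    # Honored only if the runner sets memory_request_overwrite_max_allowed /
    # memory_limit_overwrite_max_allowed in its [runners.kubernetes] config.
    KUBERNETES_MEMORY_REQUEST: "4Gi"
    KUBERNETES_MEMORY_LIMIT: "8Gi"
```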
After I ran into some further issues, I noticed this flag: https://github.com/GoogleContainerTools/kaniko#flag---compressed-caching, which needs to be set to 'false' on OOM errors. Setting that flag on my side has resulted in no OOM termination (yet), but it still scratches the maximum allocatable memory.
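To make that concrete, here is how the flag would be passed in a GitLab CI job (same placeholder job layout as the sketch above). Disabling compressed caching keeps cached layers uncompressed, trading peak memory for disk and network:

```yaml
docker-build:
  script:
    # --compressed-caching=false stores cached layers uncompressed,
    # reducing peak memory usage at the cost of disk/network.
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --compressed-caching=false
```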
**Actual behavior**
The kaniko build silently crashes after taking the full filesystem snapshot, with no useful error. It works fine with dind. Disabling the kaniko cache doesn't help.
Might be related to #2249 and #1333.
**Expected behavior**
Running GitLab CI with kaniko leads to this error:
I don't know how to keep the Job in k8s, so I cannot see what happened (I will look into how to keep the failing ones).
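For debugging outside the CI pipeline (where the runner cleans pods up), a failed pod from a plain Kubernetes Job is kept around for `kubectl logs` as long as the Job object itself is not deleted and no `ttlSecondsAfterFinished` is set. A minimal sketch; the name and context URL are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: kaniko-debug            # hypothetical name
spec:
  backoffLimit: 0               # no retries, so the single failed pod is kept
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: kaniko
          image: gcr.io/kaniko-project/executor:v1.9.1-debug
          args:
            - "--context=git://gitlab.example.com/group/project.git"  # placeholder repo
            - "--no-push"       # build only; enough to reproduce the OOM
          resources:
            limits:
              memory: 2Gi
```

After the Job fails, `kubectl logs job/kaniko-debug` shows the executor output, and `kubectl describe pod` shows whether the container was OOMKilled.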
**To Reproduce**
Steps to reproduce the behavior:

```yaml
docker-build:
  stage: docker-build
  rules:
```

This leads to a failure.
I can run the same configuration again and it works some of the time. Here is an example of the docker-build [production] job.
**Additional Information**
- Kaniko Image (fully qualified with digest)
**Triage Notes for the Maintainers**
- Please check if this error is seen when you use the `--cache` flag