Azure / ACS

Azure Container Service - Bug Tracker + Announcements
ACS runs out of disk inodes, but k8s GC is not working #21

Open guesslin opened 7 years ago

guesslin commented 7 years ago

Sometimes I get an error while k8s is pulling new images from the Docker registry, for example:

Failed to pull image "<IMAGE>:latest": failed to register layer: mkdir /var/lib/docker/overlay/545c2788d40c8155df608067746dd90f4e570cc5f406ce968b6e063a1a68917c/tmproot327958565/usr/local/go/test/fixedbugs/bug468.dir: no space left on device

The VM disk still has free space, but it has no free inodes.
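
One way to confirm this diagnosis is to compare block usage with inode usage on the filesystem that holds `/var/lib/docker`. The snippet below is a minimal sketch, not something ACS provides: the helper names `pct_used` and `usage_report` are made up for illustration, and it relies only on `os.statvfs` from the Python standard library (Unix only).

```python
import os


def pct_used(total, free):
    """Return the percentage used, or None when the total is reported as 0."""
    if total == 0:
        # Some filesystems do not report an inode total at all.
        return None
    return 100.0 * (total - free) / total


def usage_report(path):
    """Collect block and inode usage for the filesystem containing `path`."""
    st = os.statvfs(path)
    bytes_total = st.f_blocks * st.f_frsize
    # f_bavail / f_favail are what an unprivileged user can still allocate,
    # so reserved blocks count as "used" here.
    bytes_free = st.f_bavail * st.f_frsize
    return {
        "path": path,
        "bytes_total": bytes_total,
        "bytes_free": bytes_free,
        "bytes_used_pct": pct_used(bytes_total, bytes_free),
        "inodes_total": st.f_files,
        "inodes_free": st.f_favail,
        "inodes_used_pct": pct_used(st.f_files, st.f_favail),
    }
```

Calling `usage_report("/var/lib/docker")` on an affected node should show `inodes_free` at or near 0 while `bytes_free` is still large, which matches the "no space left on device" error above.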

This looks related to https://github.com/moby/moby/issues/10613

Does Azure provide any command for Kubernetes-type ACS clusters to quickly solve this problem?

JackQuincy commented 7 years ago

Sorry for the slow reply. If you look at the ACS-Engine issues, others are running into this same issue and we are working on a fix. To my knowledge, we don't have an easy out-of-the-box command; the only option is to SSH into the affected node and clean up the old images. Normally this happens when the kubelet enters a crash loop.
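
Before cleaning anything up on the node, it can help to see which directories actually hold the most entries. The following is a rough sketch under stated assumptions: `count_entries` and `top_inode_consumers` are hypothetical helper names, the count is only an approximation of inode usage (hard-linked files share one inode and are counted more than once), and reading `/var/lib/docker` normally requires root.

```python
import os


def count_entries(root):
    """Count files and directories beneath `root` (roughly one inode each)."""
    total = 0
    # Unreadable directories are skipped silently instead of aborting the walk.
    for _dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        total += len(dirnames) + len(filenames)
    return total


def top_inode_consumers(parent, limit=10):
    """Return up to `limit` (count, path) pairs for subdirectories of `parent`."""
    counts = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # +1 accounts for the subdirectory itself.
                counts.append((count_entries(entry.path) + 1, entry.path))
    counts.sort(reverse=True)
    return counts[:limit]
```

For example, `top_inode_consumers("/var/lib/docker/overlay")` would list the layer directories with the most files, which gives an idea of how much old image data is sitting on the node.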

IainColledge commented 7 years ago

In case anyone reads this: I use this workaround to avoid the problem, which is basically to increase the nodes' OS disk capacity to about 120 GB.

https://stackoverflow.com/questions/44915426/increasing-disk-space-for-agents-worker-nodes-in-azure-container-service