kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

Kubernetes node seems to report its condition "OutOfDisk" quite often #282

Closed mumoshu closed 8 years ago

mumoshu commented 8 years ago

Hi, thanks as always for developing and sharing minikube :)

Recently the Kubernetes node running on my minikube reports the "OutOfDisk" condition quite often. It's OK that I can't schedule any pods while the node is "OutOfDisk"; that is simply how Kubernetes works.

The problem is that (1) my deployment seems pretty common and (2) I can't work around it.

For (1), I'm running 12 pods including mysql, memcached, elasticmq, postfix, nginx, my PHP apps, etc., which I believe is a common deployment for a web app, so I'd expect it to work without any problem.

For (2), restarting the minikubeVM does fix the problem for a while (anywhere from several minutes to several hours), but not forever. Destroying and re-creating the minikubeVM also fixes the problem for several hours. Removing stopped containers, dangling Docker images, and old Docker container logs doesn't help.
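
For reference, the kind of cleanup I mean is roughly the following (a sketch; the exact invocations I used may have differed slightly):

# remove stopped containers and dangling images
docker rm $(docker ps -aq -f status=exited)
docker rmi $(docker images -qf dangling=true)
# truncate old container logs (json-file driver)
sudo sh -c 'for f in /var/lib/docker/containers/*/*-json.log; do : > "$f"; done'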


Now, it looks to me like the minikubeVM either has too small a root disk or is using a small tmpfs device to persist something (maybe big log/temp files?).

Would you mind increasing the default capacity, providing an option to customize it, or mounting another non-tmpfs device somewhere?
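
For illustration, the kind of knob I have in mind would look something like this (a hypothetical invocation; I don't know whether minikube actually exposes such a flag):

$ minikube start --disk-size 40g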


Here is the output from kubectl describe no, which does report the "OutOfDisk" condition:

Name:           127.0.0.1
Labels:         beta.kubernetes.io/arch=amd64
            beta.kubernetes.io/os=linux
            kubernetes.io/hostname=127.0.0.1
Taints:         <none>
CreationTimestamp:  Wed, 06 Jul 2016 12:38:55 +0900
Phase:
Conditions:
  Type          Status  LastHeartbeatTime           LastTransitionTime          Reason              Message
  ----          ------  -----------------           ------------------          ------              -------
  OutOfDisk         True    Thu, 07 Jul 2016 11:08:49 +0900     Thu, 07 Jul 2016 11:03:59 +0900     KubeletOutOfDisk        out of disk space
  MemoryPressure    False   Thu, 07 Jul 2016 11:08:49 +0900     Wed, 06 Jul 2016 12:38:55 +0900     KubeletHasSufficientMemory  kubelet has sufficient memory available
  Ready         True    Thu, 07 Jul 2016 11:08:49 +0900     Wed, 06 Jul 2016 12:38:55 +0900     KubeletReady            kubelet is posting ready status
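
(As a side note, a compact way to poll just this condition, assuming a reasonably recent kubectl with JSONPath support:)

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="OutOfDisk")].status}{"\n"}{end}'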

And the output from df -h:

docker@minikubeVM:~$ sudo df -h
Filesystem                Size      Used Available Use% Mounted on
tmpfs                   896.3M    652.4M    243.9M  73% /
tmpfs                   497.9M   1016.0K    496.9M   0% /dev/shm
/dev/sda1                17.9G     10.7G      6.2G  63% /mnt/sda1
cgroup                  497.9M         0    497.9M   0% /sys/fs/cgroup
Users                   464.8G    170.4G    294.4G  37% /Users
/dev/sda1                17.9G     10.7G      6.2G  63% /mnt/sda1/var/lib/docker/aufs

AFAIK, a Kubernetes node goes "OutOfDisk" when the remaining capacity of the root disk or the Docker disk drops below 256MB by default (the kubelet's default low-disk-space threshold):

The /dev/sda1 filesystem (17.9G with 6.2G available, mounted on /mnt/sda1) seems to be OK because that's far more than 256MB. The tmpfs root (896.3M with only 243.9M available, mounted on /) seems to be problematic, though.
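
A quick way to compare each filesystem's remaining space against that threshold in MB (a sketch; assumes the busybox df and awk on the boot2docker image behave as usual):

docker@minikubeVM:~$ sudo df -m / /mnt/sda1 | awk 'NR>1 { print $6 ": " $4 " MB available" }'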

Looking into /var/log revealed that my localkube.out is nearly 40MB:

docker@minikubeVM:~$ ls -lah /var/log
total 40832
drwxrwxr-x    3 root     staff        420 Jul  6 03:40 ./
drwxrwxr-x    8 root     staff        180 Jul  6 03:39 ../
-rw-r--r--    1 root     root           0 Jul  6 03:38 autologin
-rw-r--r--    1 root     root        5.8K Jul  6 03:38 boot2docker.log
drwxr-xr-x    2 root     root         600 Jul  7 03:12 containers/
lrwxrwxrwx    1 root     root          31 Jul  6 03:39 docker.log -> /var/lib/boot2docker/docker.log
-rw-r--r--    1 root     root       66.8K Jul  7 02:04 gcp-containers.log.pos
-rw-r--r--    1 root     root          54 Jul  6 03:40 gcp-docker.log.pos
-rw-r--r--    1 root     root          52 Jul  6 03:40 gcp-etcd.log.pos
-rw-r--r--    1 root     root          62 Jul  6 03:40 gcp-kube-apiserver.log.pos
-rw-r--r--    1 root     root          71 Jul  6 03:40 gcp-kube-controller-manager.log.pos
-rw-r--r--    1 root     root          62 Jul  6 03:40 gcp-kube-scheduler.log.pos
-rw-r--r--    1 root     root          55 Jul  6 03:40 gcp-kubelet.log.pos
-rw-r--r--    1 root     root          55 Jul  6 03:40 gcp-salt.pos
-rw-r--r--    1 root     root          61 Jul  6 03:40 gcp-startupscript.log.pos
-rw-r--r--    1 docker   staff        146 Jul  6 03:38 localkube.err
-rw-r--r--    1 docker   staff      39.7M Jul  7 05:07 localkube.out
-rw-rw-rw-    1 root     root           0 Jul  6 03:38 parallels.log
-rw-r--r--    1 root     root        6.0K Jul  7 05:06 udhcp.log
-rw-r--r--    1 root     root          28 Jul  6 03:39 userdata.log
-rw-rw-r--    1 root     staff       3.4K Jul  7 04:30 wtmp

So mounting another device at /var/log would free up / a bit?
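
Roughly what I have in mind, as an untested sketch (assumes the bind-mount behaviour of the busybox mount on the boot2docker image):

# one-off: truncate the big log to reclaim tmpfs space right away
docker@minikubeVM:~$ sudo sh -c ': > /var/log/localkube.out'
# longer term: back /var/log with the persistent disk instead of tmpfs
docker@minikubeVM:~$ sudo mkdir -p /mnt/sda1/var/log
docker@minikubeVM:~$ sudo cp -a /var/log/. /mnt/sda1/var/log/
docker@minikubeVM:~$ sudo mount -o bind /mnt/sda1/var/log /var/log
# note: processes that already hold the old files open keep writing to the
# tmpfs copies until they are restarted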

mumoshu commented 8 years ago

/var/lib/localkube seems to be eating 500MB:

docker@minikubeVM:~$ sudo ls -1d /var/lib/localkube | grep -v 'mnt\|proc\|Users\|dev\|sys' | sudo xargs -I{} du {} -sh -d 3
122.1M  /var/lib/localkube/dns/member/wal
1.4M    /var/lib/localkube/dns/member/snap
123.5M  /var/lib/localkube/dns/member
123.5M  /var/lib/localkube/dns
366.2M  /var/lib/localkube/etcd/member/wal
11.0M   /var/lib/localkube/etcd/member/snap
377.2M  /var/lib/localkube/etcd/member
377.2M  /var/lib/localkube/etcd
8.0K    /var/lib/localkube/certs
500.6M  /var/lib/localkube
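
(A shorter way to get much the same breakdown, assuming the busybox du and sort on the image accept -m and numeric reverse sort:)

docker@minikubeVM:~$ sudo du -x -m -d 3 /var/lib/localkube | sort -rn | head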

FYI, here is a collection of outputs from df & du on my minikubeVM: https://gist.github.com/mumoshu/e110a94f6b407671a888611ed50fee5f#file-minikube-282-2-txt

mumoshu commented 8 years ago

Btw, I'm using minikube v0.4.0. I will try v0.5.0 anyway, but this issue doesn't seem to be addressed in v0.5.0 according to the release notes.

dlorenc commented 8 years ago

Hey,

We did fix an issue causing the logs to grow very quickly in 0.5. It's mentioned in the release notes: "Fixed a bug causing the minikube logs to fill up rapidly."

We should still consider logging to a location that's not mounted in tmpfs though.

Thanks for the report! Please let me know if 0.5 fixes this issue for you.

jimmidyson commented 8 years ago

I wonder if we also need to make the root partition bigger? If people are using volumes and writing to them, I assume that will go into /var/lib/localkube somewhere, which is on the very small root partition.

jimmidyson commented 8 years ago

Actually, I just noticed that the localkube dir is a symlink into /mnt/sda1, so that's fine.

Guess we could just move the log files into the localkube dir too?

dlorenc commented 8 years ago

Yeah, we should move the log files into that dir.
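
A minimal sketch of what that could look like inside the VM, assuming the localkube log files can simply be relocated and symlinked (paths are illustrative, and this is best done while localkube is stopped):

sudo mkdir -p /mnt/sda1/var/lib/localkube/logs
sudo mv /var/log/localkube.out /var/log/localkube.err /mnt/sda1/var/lib/localkube/logs/
sudo ln -s /mnt/sda1/var/lib/localkube/logs/localkube.out /var/log/localkube.out
sudo ln -s /mnt/sda1/var/lib/localkube/logs/localkube.err /var/log/localkube.err
# restart localkube afterwards so it reopens the relocated files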

mumoshu commented 8 years ago

@jimmidyson Thank you for pointing out the location of the localkube dir. Yes, it seems fine!

$ ls -lah /var/lib
total 0
drwxrwxr-x    5 root     staff        160 Jul 11 07:06 ./
drwxrwxr-x    8 root     staff        180 Jul 11 07:07 ../
lrwxrwxrwx    1 root     root          29 Jul 11 07:07 boot2docker -> /mnt/sda1/var/lib/boot2docker/
lrwxrwxrwx    1 root     root          24 Jul 11 07:07 docker -> /mnt/sda1/var/lib/docker/
drwxr-x---    4 root     root          80 Jul 11 07:06 kubelet/
lrwxrwxrwx    1 root     root          27 Jul 11 07:07 localkube -> /mnt/sda1/var/lib/localkube/
drwxr-xr-x    4 root     root         160 Jul 11 07:07 nfs/
drwxr-xr-x    2 root     root          40 Jul 11 07:07 sshd/

@dlorenc It has been almost a week since I switched to v0.5.0, and the issue seems to have gone away. Thanks!

$ sudo df -h | head
Filesystem                Size      Used Available Use% Mounted on
tmpfs                   896.3M    336.7M    559.6M  38% /
tmpfs                   497.9M      1.3M    496.6M   0% /dev/shm
/dev/sda1                17.9G      5.6G     11.3G  33% /mnt/sda1
cgroup                  497.9M         0    497.9M   0% /sys/fs/cgroup
Users                   464.8G    192.7G    272.1G  41% /Users
/dev/sda1                17.9G      5.6G     11.3G  33% /mnt/sda1/var/lib/docker/aufs
none                     17.9G      5.6G     11.3G  33% /mnt/sda1/var/lib/docker/aufs/mnt/3a4413068cc1f8f4ee27eb5cb0658e2daeb9186b9fde6b2cb2f93ea2af06a3c1
shm                      64.0M         0     64.0M   0% /mnt/sda1/var/lib/docker/containers/13eda3b51b2d295753422a5511bb9116e3130ff024e80a5393a6c0002b9da841/shm
none                     17.9G      5.6G     11.3G  33% /mnt/sda1/var/lib/docker/aufs/mnt/7196af14d2c5228366626201cd6ed5ca2570a81632d9c5fc35b501899b0bdadf