cloudfoundry-incubator / kubo-deployment

Contains manifests used to deploy Cloud Foundry Container Runtime
https://www.cloudfoundry.org/container-runtime/
Apache License 2.0
275 stars 114 forks

Wrong filesystem capacity detected #351

Closed · hugochinchilla closed this issue 5 years ago

hugochinchilla commented 5 years ago

What happened:

I'm having problems with my workers reporting the wrong amount of disk capacity on the VM: I'm getting pod evictions while there is plenty of free space on the ephemeral disk. The VM has two disks, sda1 with the system install and sdb2 for ephemeral data.

kubelet seems to detect the size of sda1 as the amount of space available for ephemeral storage.

$ kubectl describe node
...
Capacity:
 cpu:                4
 ephemeral-storage:  3030944Ki
 hugepages-2Mi:      0
 memory:             32940952Ki
 pods:               110
...

Take the 3030944Ki and convert it to bytes: you get 3103686656. Searching for this number in the kubelet logs, I can see it is the exact size of the system partition (sda1). Docker is running with --graph /var/vcap/data/docker/docker, which is on sdb2, not sda1.
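The conversion can be checked directly; all numbers below are taken from the output in this report:

```python
# Sanity check: convert the kubectl-reported ephemeral-storage (in Ki)
# to bytes and compare against the partition capacities from the kubelet log.
reported_ki = 3030944               # ephemeral-storage from `kubectl describe node`
sda1_bytes = 3103686656             # /dev/sda1 capacity from the kubelet log
sdb2_bytes = 20999434240            # /dev/sdb2 capacity from the kubelet log

reported_bytes = reported_ki * 1024  # 1Ki = 1024 bytes

print(reported_bytes == sda1_bytes)  # True: kubelet is measuring the root partition
print(reported_bytes == sdb2_bytes)  # False: not the ephemeral-data partition
```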

$ ps aux | grep dockerd
dockerd --bridge=cni0 --debug=false --default-ulimit=nofile=65536 --group vcap --graph /var/vcap/data/docker/docker --host unix:///var/vcap/sys/run/docker/docker.sock --icc=true --ip-forward=true --ip-masq=false --iptables=false --ipv6=false --log-level=error --log-opt=max-size=128m --log-opt=max-file=2 --mtu=1450 --pidfile /var/vcap/sys/run/docker/docker.pid --selinux-enabled=false --storage-driver=overlay2 --host tcp://127.0.0.1:4243 --userland-proxy=true

Here is the output of df (redacted):

$ df 

Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1        3030944 2468276    391096  87% /
/dev/sdb2       20507260 4782664  14659844  25% /var/vcap/data

And the relevant section from the kubelet log:

I1016 08:14:33.044417    7365 server.go:526] Successfully initialized cloud provider: "vsphere" from the config file: ""
I1016 08:14:33.044431    7365 server.go:772] cloud provider determined current node name to be 312c6d1e-82f3-470d-bf10-b5a04a66a0f4
I1016 08:14:33.047753    7365 manager.go:154] cAdvisor running in container: "/sys/fs/cgroup/cpu,cpuacct"
I1016 08:14:33.050821    7365 fs.go:142] Filesystem UUIDs: map[516a9d28-f7e2-4e30-b208-8b3473bcb46e:/dev/sda1 88894553-2ba1-45ba-8b20-9c93f150a743:/dev/sdb1 ca728db0-2dcc-4000-ace7-32d408f150f9:/dev/sdb2]
I1016 08:14:33.050843    7365 fs.go:143] Filesystem partitions: map[/dev/sda1:{mountpoint:/ major:8 minor:1 fsType:ext4 blockSize:0} /dev/sdb2:{mountpoint:/var/vcap/data major:8 minor:18 fsType:ext4 blockSize:0} tmpfs:{mountpoint:/run major:0 minor:22 fsType:tmpfs blockSize:0}]
I1016 08:14:33.054683    7365 manager.go:227] Machine: {NumCores:4 CpuFrequency:2799999 MemoryCapacity:33731534848 HugePages:[{PageSize:2048 NumPages:0}] MachineID:d546192840e7281040ccf3722d167fb7 SystemUUID:4213FECA-BE7C-8906-717A-8938D6F23FA0 BootID:ff5f408d-70dd-49e4-94aa-31ad22282f5a Filesystems:[{Device:tmpfs DeviceMajor:0 DeviceMinor:22 Capacity:3373154304 Type:vfs Inodes:4117619 HasInodes:true} {Device:/dev/sda1 DeviceMajor:8 DeviceMinor:1 Capacity:3103686656 Type:vfs Inodes:195840 HasInodes:true} {Device:/dev/sdb2 DeviceMajor:8 DeviceMinor:18 Capacity:20999434240 Type:vfs Inodes:1310720 HasInodes:true}] DiskMap:map[8:0:{Name:sda Major:8 Minor:0 Size:3221225472 Scheduler:cfq} 8:16:{Name:sdb Major:8 Minor:16 Size:21474836480 Scheduler:cfq}] NetworkDevices:[{Name:eth0 MacAddress:00:50:56:93:f7:5a Speed:10000 Mtu:1500}] Topology:[{Id:0 Memory:33731534848 Cores:[{Id:0 Threads:[0] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2}]}] Caches:[{Size:26214400 Type:Unified Level:3}]} {Id:2 Memory:0 Cores:[{Id:0 Threads:[1] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2}]}] Caches:[{Size:26214400 Type:Unified Level:3}]} {Id:4 Memory:0 Cores:[{Id:0 Threads:[2] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2}]}] Caches:[{Size:26214400 Type:Unified Level:3}]} {Id:6 Memory:0 Cores:[{Id:0 Threads:[3] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2}]}] Caches:[{Size:26214400 Type:Unified Level:3}]}] CloudProvider:Unknown InstanceType:Unknown InstanceID:None}

What you expected to happen:

I expected kubelet to detect sdb2 as the storage available for ephemeral data.

How to reproduce it (as minimally and precisely as possible):

Deploy cfcr on a vsphere cluster:

bosh -e bosh-1 deploy -d cfcr manifests/cfcr.yml \
  -o manifests/ops-files/misc/single-master.yml \
  -o manifests/ops-files/enable-bbr.yml \
  -o manifests/ops-files/add-hostname-to-master-certificate.yml \
  -o manifests/ops-files/iaas/vsphere/use-vm-extensions.yml \
  -o manifests/ops-files/iaas/vsphere/cloud-provider.yml \
  -o manifests/ops-files/disable-security-context-deny.yml \
  -v api-hostname=ss.kube.habitissimo.net \
  -v vcenter_master_user=admin \
  -v vcenter_master_password=jyKgcs4O \
  -v vcenter_ip=10.58.39.2 \
  -v vcenter_dc="Interxion MAD2" \
  -v vcenter_ds=habitissimo_premium \
  -v vcenter_vms=bosh-1-vms \
  -v director_uuid=9e8ffae0-1a12-400a-8346-486e7b2e08be

bosh -e bosh-1 -d cfcr run-errand apply-specs
bosh -e bosh-1 -d cfcr run-errand smoke-tests

Get the description of a worker node with kubectl describe node, search for ephemeral-storage under Capacity.
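To pull just that field without scanning the full `describe` output, a custom-columns query works (sketch only; it assumes a configured kubectl context pointed at the cluster):

```shell
# Print each node's name alongside its reported ephemeral-storage capacity.
kubectl get nodes -o custom-columns='NAME:.metadata.name,EPHEMERAL:.status.capacity.ephemeral-storage'
```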

Anything else we need to know?:

Environment:

Name  Release(s)       Stemcell(s)                                     Config(s)        Team(s)  
cfcr  bosh-dns/1.10.0  bosh-vsphere-esxi-ubuntu-xenial-go_agent/97.18  4 cloud/default  -  
      bpm/0.12.3                                                                          
      cfcr-etcd/1.5.0                                                                     
      docker/32.0.0                                                                       
      kubo/0.22.0   
Name      bosh-1  
UUID      9e8ffae0-1a12-400a-8346-486e7b2e08be  
Version   268.0.1 (00000000)  
CPI       vsphere_cpi  
Features  compiled_package_cache: disabled  
          config_server: enabled  
          dns: disabled  
          snapshots: disabled  
User      admin  
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"archive", BuildDate:"2018-08-20T08:45:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T17:53:03Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
cf-gitbot commented 5 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/161254182

The labels on this github issue will be updated when the story is started.

hugochinchilla commented 5 years ago

may be related to kubernetes/kubernetes#66961

hugochinchilla commented 5 years ago

Ok, I think I've found the problem.

Kubelet is running with /var/lib/kubelet as its root dir; I think it should be using /var/vcap/data/kubelet instead.
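The flag kubelet uses for this is --root-dir, which defaults to /var/lib/kubelet, so on a BOSH-deployed VM its state (and therefore its filesystem accounting) lands on the small root partition. A sketch of the intended invocation; the actual wiring lives in the kubo-release job templates, so the line below is illustrative only:

```shell
# Illustrative only: move kubelet's root dir onto the ephemeral-data disk
# so its storage accounting tracks sdb2 instead of the root partition.
kubelet --root-dir=/var/vcap/data/kubelet  # remaining flags unchanged
```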

seanos11 commented 5 years ago

closing based on resolution of https://github.com/cloudfoundry-incubator/kubo-release/pull/259

cf-gitbot commented 5 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/162153525

The labels on this github issue will be updated when the story is started.

seanos11 commented 5 years ago

Reopened as it is still under review.

seanos11 commented 5 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/162153525

The labels on this github issue will be updated when the story is started.

My close/reopen created a duplicate tracker item; I have deleted 162153525. https://www.pivotaltracker.com/story/show/161254182 is the one to track.

instantlinux commented 5 years ago

I would like this error message improved:

Jan  8 01:07:30 vinson kubelet[1514]: I0108 01:07:30.586617    1514 image_gc_manager.go:300] [imageGCManager]: Disk usage on image filesystem is at 85% which is over the high threshold (85%). Trying to free 481610956 bytes down to the low threshold (80%).

It doesn't give a device name or mount point, so I can't figure out which filesystem it's complaining about (there is plenty of space on the mounted volumes, so perhaps it's looking at the thinpool LVM which I'm using for image storage).
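The log line does leak one clue about which filesystem it is. A rough sketch, under the assumption that the image GC frees space down to the low threshold (bytes to free = usage − low-threshold fraction × capacity, which at exactly the high threshold is (high − low) × capacity):

```python
# Back-of-the-envelope estimate of the filesystem size kubelet is measuring,
# from the numbers in the log line above. Assumes usage sat right at the
# high threshold when the message fired.
bytes_to_free = 481610956   # "Trying to free 481610956 bytes"
high, low = 0.85, 0.80      # thresholds from the log line

approx_capacity = bytes_to_free / (high - low)
print(round(approx_capacity / 1e9, 1))  # 9.6 (GB, decimal)
```

A ~9.6 GB device may help narrow down which filesystem the GC is watching.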

Also, the garbage collector barfs when it encounters statically launched images started by docker run rather than by k8s.

tvs commented 5 years ago

@instantlinux That seems more within the purview of the Kubernetes community. Please raise an issue there.

@hugochinchilla This should be fixed in the default manifest as of CFCR v0.31.0 (Kubelet's root-dir is set to /var/vcap/data/kubelet)

instantlinux commented 5 years ago

Sure thing @tvs, thanks for reminding me. I've reported there as issue #75708.

hugochinchilla commented 5 years ago

Thanks for the update @tvs