openshift / os


Missing /sys/fs/cgroup/cpuacct,cpu #145

Closed ashcrow closed 6 years ago

ashcrow commented 6 years ago

@crawford found during testing that /sys/fs/cgroup/cpuacct,cpu is expected, but RHCOS provides /sys/fs/cgroup/cpu,cpuacct.

https://github.com/kubernetes/kubernetes/issues/32728#issuecomment-252469277 describes a similar issue. The workaround is to set up a link from one path to the other.
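
For illustration, a minimal sketch of that workaround (assuming the tmpfs at /sys/fs/cgroup is writable, which may require a remount; this is not an official fix):

$ sudo mount -o remount,rw /sys/fs/cgroup                            # only needed if the tmpfs is mounted read-only
$ sudo ln -sfn /sys/fs/cgroup/cpu,cpuacct /sys/fs/cgroup/cpuacct,cpu # compatibility alias for the reversed name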

crawford commented 6 years ago

Note that this reversal only happens within docker. Outside of docker, I see /sys/fs/cgroup/cpuacct,cpu.
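
For anyone reproducing this, one quick way to compare the two views (the image name below is only a placeholder) is to list the controller directory on the host and again inside a container that bind-mounts /sys:

# on the RHCOS host
$ ls -d /sys/fs/cgroup/cpu*
# inside a container with /sys bind-mounted, as the kubelet invocation does
$ sudo docker run --rm --volume /sys:/sys:ro some/image ls -d /sys/fs/cgroup/cpu*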

ashcrow commented 6 years ago

@crawford interesting. Thanks for adding that.

@derekwaynecarr does this truly look like the same issue you had seen before?

ashcrow commented 6 years ago

@crawford / @derekwaynecarr:

Do we have a good understanding of how hard this will be to fix? I know @derekwaynecarr noted he has looked at this before and thought it had already been fixed.

ashcrow commented 6 years ago

To notify those following this issue: work on identifying the root cause has started.

Bubblemelon commented 6 years ago

Here's what I found so far with Docker 1.13.1 on RHEL 7, to check whether the error persists there:

Setup

$ vagrant box add --name RHCOS rhcos-vagrant-libvirt.box 
$ mkdir rhcos && cd rhcos && vagrant init RHCOS && vagrant up
$ vagrant ssh

Link to Vagrant box binary: http://aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com/rhcos/images/cloud/latest/

RPM Overlaying

$ sudo ostree admin unlock --hotfix
$ rpm -qa | grep docker 

Docker version 1.13.1-70 RHEL7

$ sudo rpm-ostree override replace *.rpm  
$ sudo rpm-ostree status -v
$ sudo reboot
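
After the reboot, a quick sanity check (not part of the original steps) to confirm the overlay took effect:

$ rpm -q docker          # should report the overlaid 1.13.1-70 build
$ sudo rpm-ostree status # the override should show in the active deployment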

Ran the following commands to start the Kubelet:

Commands source

/usr/bin/docker \
    run \
      --rm \
      --net host \
      --pid host \
      --privileged \
      --volume /dev:/dev:rw \
      --volume /sys:/sys:ro \
      --volume /var/run:/var/run:rw \
      --volume /var/lib/cni/:/var/lib/cni:rw \
      --volume /var/lib/docker/:/var/lib/docker:rw \
      --volume /var/lib/kubelet/:/var/lib/kubelet:shared \
      --volume /var/log:/var/log:shared \
      --volume /etc/kubernetes:/etc/kubernetes:ro \
      --entrypoint /usr/bin/hyperkube \
    "openshift/origin-node" \
      kubelet \
        --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
        --kubeconfig=/var/lib/kubelet/kubeconfig \
        --rotate-certificates \
        --cni-conf-dir=/etc/kubernetes/cni/net.d \
        --cni-bin-dir=/var/lib/cni/bin \
        --network-plugin=cni \
        --lock-file=/var/run/lock/kubelet.lock \
        --exit-on-lock-contention \
        --pod-manifest-path=/etc/kubernetes/manifests \
        --allow-privileged \
        --node-labels=node-role.kubernetes.io/master \
        --minimum-container-ttl-duration=6m0s \
        --cluster-dns=10.3.0.10 \
        --cluster-domain=cluster.local \
        --client-ca-file=/etc/kubernetes/ca.crt \
        --anonymous-auth=false \
        --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \

Which gave me the following output:

Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --allow-privileged has been deprecated, will be removed in a future version
Flag --minimum-container-ttl-duration has been deprecated, Use --eviction-hard or --eviction-soft instead. Will be removed in a future version.                                                                   
Flag --cluster-dns has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --cluster-domain has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Flag --anonymous-auth has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
I0705 23:32:18.768616    2186 feature_gate.go:230] feature gates: &{map[]}
I0705 23:32:18.768902    2186 feature_gate.go:230] feature gates: &{map[]}
I0705 23:32:19.036018    2186 server.go:415] Version: v1.11.0+d4cacc0
I0705 23:32:19.036062    2186 feature_gate.go:230] feature gates: &{map[]}
I0705 23:32:19.036110    2186 feature_gate.go:230] feature gates: &{map[]}
I0705 23:32:19.036123    2186 server.go:493] acquiring file lock on "/var/run/lock/kubelet.lock"
I0705 23:32:19.036150    2186 server.go:498] watching for inotify events for: /var/run/lock/kubelet.lock
I0705 23:32:19.036262    2186 plugins.go:97] No cloud provider specified.
W0705 23:32:19.036290    2186 server.go:556] standalone mode, no API client
F0705 23:32:19.036300    2186 server.go:262] failed to run Kubelet: No authentication method configured

So it looks like the cgroup error isn't showing up with this Docker version, unless my reproduction steps are incorrect.

Updated: see the comments below; the tests described in this comment were insufficient to identify the problem.

Bubblemelon commented 6 years ago

I've encountered an error when mounting NFS shared folders (i.e. at /vagrant), and running exportfs -a -v doesn't change anything. @cgwalters may have already fixed this error, as suggested by @peterbaouoft.

The full error log in this gist.

ashcrow commented 6 years ago

@Bubblemelon this makes me wonder if the fix was applied at build time via a patch. It may be worth using rpmdev-extract to take a look at the contents of the SRPM and see what patches (if any) are applied.

Bubblemelon commented 6 years ago

Using this Libvirt how-to guide to verify the assumptions in my comment above about Docker's cgroup driver:

Master Node Info

RHCOS version: source

[core@coreos-220-master-0 ~]$ rpm-ostree status -v
State: idle; auto updates disabled
Deployments:
● ostree://rhcos:openshift/3.10/x86_64/os
                   Version: 3.10-7.5.235 (2018-07-06 22:41:39)
                    Commit: f51faab9a702e0d85905f3edc81641a63c9ec3c8acf0319e52d03de03de67e5f
                            └─ atomic-centos-continuous (2018-07-06 20:45:09)
                            └─ dustymabe-ignition (2018-07-03 00:29:34)
                            └─ rhcos-continuous (2018-07-06 19:25:38)
                            └─ rhel-7.5-server (2018-05-02 10:10:39)
                            └─ rhel-7.5-server-optional (2018-05-02 10:06:54)
                            └─ rhel-7.5-server-extras (2018-05-02 13:57:35)
                            └─ rhel-7.5-atomic (2017-07-11 17:45:34)
                            └─ openshift (2018-07-06 21:46:14)
                    Staged: no
                 StateRoot: rhcos

Docker Version: 2018-04-30 15:56:58

[core@coreos-220-master-0 ~]$ rpm -qa | grep docker
docker-client-1.13.1-63.git94f4240.el7.x86_64
docker-rhel-push-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-common-1.13.1-63.git94f4240.el7.x86_64
docker-1.13.1-63.git94f4240.el7.x86_64
docker-novolume-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-lvm-plugin-1.13.1-63.git94f4240.el7.x86_64

Output from $ journalctl -u docker:

Jul 09 17:51:20 coreos-220-master-0 dockerd-current[1145]: F0709 17:51:20.232817   25049 server.go:262] failed to run Kubelet: 
failed to create kubelet: misconfiguration: 
kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"                                                                                                                                                                        

Jul 09 17:51:20 coreos-220-master-0 dockerd-current[1145]: time="2018-07-09T17:51:20.267459141Z" level=error msg="containerd: 
deleting container" error="exit status 1: \"container b85300b2eee4b379bec5753361f37e1
1bcb8cacdd7c4aa6c9179d62eb93ab001 does not exist\\none or more of the container deletions failed\\n\""                                                                                                             

Jul 09 17:51:20 coreos-220-master-0 dockerd-current[1145]: time="2018-07-09T17:51:20.298990686Z" level=warning msg="b85300b2eee4b379bec5753361f37e11bcb8cacdd7c4aa6c9179d62eb93ab001 cleanup: failed to unmount sec
rets: invalid argument"

In trying to resolve the kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd" error, I found openshift issue #18776, which suggests placing

ExecStart=/usr/bin/dockerd \
          --exec-opt native.cgroupdriver=systemd 

within docker.service. However, the /usr directory is read-only, and docker.service already contains the following:

[core@coreos-220-master-0 system]$ cat docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target rhel-push-plugin.socket registries.service
Wants=docker-storage-setup.service
Requires=rhel-push-plugin.socket registries.service
Requires=docker-cleanup.timer

[Service]
Type=notify
NotifyAccess=all
EnvironmentFile=-/run/containers/registries.conf
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
Environment=GOTRACEBACK=crash
Environment=DOCKER_HTTP_HOST_COMPAT=1
Environment=PATH=/usr/libexec/docker:/usr/bin:/usr/sbin
ExecStart=/usr/bin/dockerd-current \
          --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current \
          --default-runtime=docker-runc \
          --authorization-plugin=rhel-push-plugin \
          --exec-opt native.cgroupdriver=systemd \         <--------------
          --userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
          --init-path=/usr/libexec/docker/docker-init-current \
          --seccomp-profile=/etc/docker/seccomp.json \
          $OPTIONS \
          $DOCKER_STORAGE_OPTIONS \
          $DOCKER_NETWORK_OPTIONS \
          $ADD_REGISTRY \
          $BLOCK_REGISTRY \
          $INSECURE_REGISTRY \
          $REGISTRIES
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=0
Restart=on-abnormal
KillMode=process

[Install]
WantedBy=multi-user.target
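
As an aside (a quick check, not part of the original debugging): since the unit above already passes --exec-opt native.cgroupdriver=systemd, the daemon's active driver can be confirmed with docker info:

$ sudo docker info 2>/dev/null | grep -i 'cgroup driver'
# expected to print "Cgroup Driver: systemd", matching the --exec-opt above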
cgwalters commented 6 years ago

Related: https://github.com/coreos/bugs/issues/1435

Bubblemelon commented 6 years ago

The error above

failed to create kubelet: misconfiguration: 
kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd" 

can be resolved by adding --cgroup-driver=systemd \ to kubelet.service:

[Unit]
Description=Kubernetes Kubelet
...

[Service]
...

ExecStart=/usr/bin/docker \
  run \
    .
    .
  "openshift/origin-node:latest" \
    kubelet \
      .
      .
      . 
      --cgroup-driver=systemd \

After running sudo systemctl daemon-reload && sudo systemctl restart kubelet:

journalctl -u docker and journalctl -u kubelet show the same output:

kubelet.go:1769] skipping pod synchronization - [container runtime is down]                                                       
kubelet_node_status.go:269] Setting node annotation to enable volume controller attach/detach                                     
kubelet.go:1312] Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/cpuacct,cpu: no such file or directory                 
kubelet.service: main process exited, code=exited, status=255/n/a                                                                                                 
Unit kubelet.service entered failed state.
kubelet.service failed.
kubelet_node_status.go:79] Attempting to register node coreos-220-master-0                                                       
kubelet.go:1312] Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/cpuacct,cpu: no such file or directory                
kubelet.service: main process exited, code=exited, status=255/n/a                                                                                                 
Unit kubelet.service entered failed state.
kubelet.service failed.
$ rpm -qa | grep docker
docker-client-1.13.1-63.git94f4240.el7.x86_64
docker-rhel-push-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-common-1.13.1-63.git94f4240.el7.x86_64
docker-1.13.1-63.git94f4240.el7.x86_64
docker-novolume-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-lvm-plugin-1.13.1-63.git94f4240.el7.x86_64
ashcrow commented 6 years ago

Great work debugging @Bubblemelon!

Bubblemelon commented 6 years ago

Also thank you @crawford for helping me!

Just to clarify, something on the kubelet side is causing the Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/cpuacct,cpu: no such file or directory error.

I've also tried it out with this docker version: source - Sun, 08 Jul 2018 09:39:40 UT

docker-1.13.1-72.git6f36bd4.el7.x86_64
docker-rhel-push-plugin-1.13.1-72.git6f36bd4.el7.x86_64
docker-client-1.13.1-72.git6f36bd4.el7.x86_64
docker-lvm-plugin-1.13.1-72.git6f36bd4.el7.x86_64
docker-common-1.13.1-72.git6f36bd4.el7.x86_64
docker-novolume-plugin-1.13.1-72.git6f36bd4.el7.x86_64

Which gave the same error.

Bubblemelon commented 6 years ago

I'd like to note that openshift/origin-node:latest, i.e. openshift v3.11.0-alpha.0+90e2736-260, is running Kubernetes v1.11.0+d4cacc0.

That version of the kubelet should include this fix.

Bubblemelon commented 6 years ago

@derekwaynecarr what are your thoughts on this?

mrunalp commented 6 years ago

cAdvisor doesn't like /sys:/sys:ro. See https://github.com/google/cadvisor/issues/1843

Bubblemelon commented 6 years ago

This same error,

kubelet.go:1312] Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/cpuacct,cpu: no such file or directory

still occurs when /sys is mounted read-write in the kubelet.service file.

.
.
ExecStart=/usr/bin/docker \
  run \
 .
 .
 --volume /sys:/sys:rw \
.

Note that on RHCOS, the path is named /sys/fs/cgroup/cpu,cpuacct.

If both of these were added under ExecStart=/usr/bin/docker \

    --volume /sys:/sys:rw \
    --volume=/sys/fs/cgroup/cpu,cpuacct:/sys/fs/cgroup/cpuacct,cpu:rw \

This error would occur:

kubelet.service holdoff time over, scheduling restart.
Starting Kubernetes Kubelet...
Started Kubernetes Kubelet.
container_linux.go:247: starting container process caused "process_linux.go:364: container init caused 
\"rootfs_linux.go:54: mounting \\\"/sys/fs/cgroup/cpu,cpuacct\\\" to rootfs 
\\\"/var/lib/docker/overlay2/8c95a16f4cad1f014091093c62248c6c0f27bcde879606cef6220f7db4521708/
merged\\\" at \\\"/var/lib/docker/overlay2/8c95a16f4cad1f014091093c62248c6c0f27bcde879606cef6220f7db4521708/
merged/sys/fs/cgroup/cpuacct,cpu\\\" caused \\\"no space left on device\\\"\""
 /usr/bin/docker-current: Error response from daemon: oci runtime error: Failed to remove paths: 
map[cpu:/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-afc3a2d6c323ed28a6c7e6586239cb4db8b79b591513eb229ca6fa1eb0bead3b.scope 
cpuacct:/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-afc3a2d6c323ed28a6c7e6586239cb4db8b79b591513eb229ca6fa1eb0bead3b.scope].
ashcrow commented 6 years ago

@crawford do you mind stating what priority you think this should have? Or whether the workaround in use should be applied in the RHCOS spins themselves? This would clarify whether @Bubblemelon and @mrunalp should keep digging on this specific issue.

crawford commented 6 years ago

This needs to be fixed in the Kubelet. If the OS team is going to tackle that, then I think this bug should stay. Otherwise, let's close this and let @derekwaynecarr and his team tackle the issue. Either way, this is a low priority. I have a workaround (it's ugly, but it works).

ashcrow commented 6 years ago

Since this is kubelet related we should pass it over to @derekwaynecarr's team and link back to this issue so they don't have to re-do all of the good debugging done so far.

Bubblemelon commented 6 years ago

Moved this issue over to openshift/origin

ashcrow commented 6 years ago

Closing since the fix must be done in another codebase.