kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

Set the --force-systemd true or false automatically (by detecting the cgroups) #8348

Open · priyawadhwa opened this issue 4 years ago

priyawadhwa commented 4 years ago

Look into whether we should be setting --force-systemd=true by default, and whether this results in any performance improvement.

The documentation says to use the same cgroup driver as your system: if your system uses systemd, you should use systemd.
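
For reference, a rough sketch of that general recommendation (as in the Docker/kubeadm docs, not minikube-specific behaviour): check which cgroup driver the host's Docker daemon is using, and switch it to systemd via /etc/docker/daemon.json if you want to match a systemd host. The restart command depends on your init system, and the heredoc assumes daemon.json has no other settings you'd be overwriting.

# show the cgroup driver the local Docker daemon is using (cgroupfs or systemd)
docker info --format '{{.CgroupDriver}}'

# on a systemd host, the documented way to make Docker use the systemd cgroup driver
# (assumes an otherwise empty /etc/docker/daemon.json)
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo service docker restart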

paddy-hack commented 4 years ago

How will this affect those Linux distributions that do not support/require systemd?

I, for one, changed distros just to get rid of systemd; I'm currently using Devuan with the docker-ce packages for the corresponding Debian distribution (buster).

This seems to work fine as long as I don't try to use Docker to run anything else after a minikube start. If I do, I see something like the following:

$ docker run --rm -it alpine:3.12 /bin/sh
docker: Error response from daemon: cgroups: cannot find cgroup mount destination: unknown.

Before running minikube start the above docker invocation works just fine.

Actually, I cannot even stop and start minikube again :unamused:
Current work-around is a reboot.

For the record, I have cgroupfs-mount installed.

priyawadhwa commented 4 years ago

Hey @paddy-hack that's an interesting setup and would be important to explore before we set --force-systemd=true by default.

Just to clarify, this sets docker within the minikube VM to use systemd as cgroup manager (we already have systemd running in minikube).

Does running:

minikube start --force-systemd

work on your machine? And could you provide the output of docker info?

medyagh commented 4 years ago

@paddy-hack I agree with @priyawadhwa that this would be for the systemd inside minikube, but that is still a good point: we need to ensure minikube is capable of running with that cgroup setup as well.

Is there a way you can try it and see? If it doesn't work for you, we can handle it in minikube.

paddy-hack commented 4 years ago

Replying to #6954, I had already gone through a

minikube start
minikube status
minikube stop
minikube start

but after that I got

paddy-hack@boson:~$ minikube start --force-systemd
😄  minikube v1.11.0 on Debian 10.0
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🔄  Restarting existing docker container for "minikube" ...
🤦  StartHost failed, but will try again: driver start: start: docker start minikube: exit status 1
stdout:

stderr:
Error response from daemon: OCI runtime create failed: container with id exists: 53ac2f88bff8b8ea2db5cd4e9a3133ea9637cc8bd2e59c550008fba242ed74a7: unknown
Error: failed to start containers: minikube

🔄  Restarting existing docker container for "minikube" ...
😿  Failed to start docker container. "minikube start" may fix it: driver start: start: docker start minikube: exit status 1
stdout:

stderr:
Error response from daemon: OCI runtime create failed: container with id exists: 53ac2f88bff8b8ea2db5cd4e9a3133ea9637cc8bd2e59c550008fba242ed74a7: unknown
Error: failed to start containers: minikube

💣  error provisioning host: Failed to start host: driver start: start: docker start minikube: exit status 1
stdout:

stderr:
Error response from daemon: OCI runtime create failed: container with id exists: 53ac2f88bff8b8ea2db5cd4e9a3133ea9637cc8bd2e59c550008fba242ed74a7: unknown
Error: failed to start containers: minikube

😿  minikube is exiting due to an error. If the above message is not useful, open an issue:
👉  https://github.com/kubernetes/minikube/issues/new/choose

Restarting the docker service does not change this. I'll see what I get after a reboot to get things back to working order :nauseated_face:

Here's the docker info output.

paddy-hack@boson:~$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 39
 Server Version: 19.03.11
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.19.0-9-amd64
 Operating System: Devuan GNU/Linux 3 (beowulf)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.608GiB
 Name: boson
 ID: FFPZ:6IG2:WOZN:WC5L:ZZWQ:4VUO:BNKJ:UX6G:SYNW:ASKJ:GBCJ:VF5K
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

paddy-hack commented 4 years ago

Rebooted and tried again

paddy-hack@boson:~$ minikube start --force-systemd
😄  minikube v1.11.0 on Debian 10.0
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🔄  Restarting existing docker container for "minikube" ...
🐳  Preparing Kubernetes v1.18.3 on Docker 19.03.2 ...
    ▪ kubeadm.pod-network-cidr=10.244.0.0/16
🔎  Verifying Kubernetes components...
🌟  Enabled addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube"
💡  For best results, install kubectl: https://kubernetes.io/docs/tasks/tools/install-kubectl/
paddy-hack@boson:~$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

paddy-hack@boson:~$ minikube stop
✋  Stopping "minikube" in docker ...
🛑  Powering off "minikube" via SSH ...
🛑  Node "minikube" stopped.
paddy-hack@boson:~$ minikube start --force-systemd
😄  minikube v1.11.0 on Debian 10.0
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🔄  Restarting existing docker container for "minikube" ...
🤦  StartHost failed, but will try again: driver start: start: docker start minikube: exit status 1
stdout:

stderr:
Error response from daemon: cgroups: cannot find cgroup mount destination: unknown
Error: failed to start containers: minikube

🔄  Restarting existing docker container for "minikube" ...
😿  Failed to start docker container. "minikube start" may fix it: driver start: start: docker start minikube: exit status 1
stdout:

stderr:
Error response from daemon: OCI runtime create failed: container with id exists: 53ac2f88bff8b8ea2db5cd4e9a3133ea9637cc8bd2e59c550008fba242ed74a7: unknown
Error: failed to start containers: minikube

💣  error provisioning host: Failed to start host: driver start: start: docker start minikube: exit status 1
stdout:

stderr:
Error response from daemon: OCI runtime create failed: container with id exists: 53ac2f88bff8b8ea2db5cd4e9a3133ea9637cc8bd2e59c550008fba242ed74a7: unknown
Error: failed to start containers: minikube

😿  minikube is exiting due to an error. If the above message is not useful, open an issue:
👉  https://github.com/kubernetes/minikube/issues/new/choose

paddy-hack commented 4 years ago

But getting back to this reliance on systemd, I ditched Debian (after two decades) and moved to Devuan to get rid of systemd. Seeing that minikube uses systemd, even considers forcing it upon me, makes me rethink whether I should be using minikube in the first place :thinking:

priyawadhwa commented 4 years ago

Hey @paddy-hack -- just to clarify, minikube does use systemd but only within the running VM or container (you don't need systemd on your machine).

The --force-systemd flag is used to make docker within the VM use systemd as the cgroup manager, as opposed to cgroupfs (systemd is running with or without that flag, that's how k8s comes up). Enabling the flag actually makes the kubernetes cluster more stable, as described in the k8s documentation here, which is why we are considering setting it as the default.

In terms of the error you're getting from docker, it's a known docker issue on Linux:

https://github.com/docker/for-linux/issues/219

with a temporary solution mentioned in this comment:

https://github.com/docker/for-linux/issues/219#issuecomment-375160449

afbjorklund commented 4 years ago

> The --force-systemd flag is used to make docker within the VM use systemd as the cgroup manager, as opposed to cgroupfs (systemd is running with or without that flag, that's how k8s comes up). Enabling the flag actually makes the kubernetes cluster more stable, as described in the k8s documentation here which is why we are considering setting it as the default.

The key thing here is to use the same cgroup driver. The minikube VM is using systemd, so then it makes sense to have Docker use systemd. If the host OS is using cgroupfs (not systemd), then it makes sense to have Docker use cgroupfs. The minikube settings are supposed to pick up whichever is in use, and forward this preference to the kubelet (since 0e83dd4b4e048d5c1c1b255476ccadd34ba8e3d2). So either should be fine...
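
A quick way to sanity-check that the drivers actually line up is to compare the host's Docker daemon with the one inside the minikube node. A rough sketch; the minikube ssh invocation assumes the docker driver is in use:

# cgroup driver of the Docker daemon on the host
docker info --format '{{.CgroupDriver}}'

# cgroup driver of the Docker daemon inside the minikube node
minikube ssh -- docker info --format '{{.CgroupDriver}}'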

afbjorklund commented 4 years ago

Also, currently systemd-in-systemd is broken in podman so it has no choice but to run cgroupfs...

// from pkg/drivers/kic/oci/oci.go
// to run nested container from privileged container in podman https://bugzilla.redhat.com/show_bug.cgi?id=1687713
// only add when running locally (linux), when running remotely it needs to be configured on server in libpod.conf
if ociBin == Podman && runtime.GOOS == "linux" {
        args = append(args, "--cgroup-manager", "cgroupfs")
}

afbjorklund commented 4 years ago

I tested with Devuan Beowulf.

Can confirm that trying to start minikube with the docker driver messes up docker (like above).

Probably something with the entrypoint's handling of /sys that destroys some cgroup settings for cgroupfs?

devuan@devuan:~$ docker logs minikube
INFO: ensuring we can execute /bin/mount even with userns-remap
INFO: remounting /sys read-only
INFO: making mounts shared
INFO: fix cgroup mounts for all subsystems
INFO: clearing and regenerating /etc/machine-id
Initializing machine ID from random generator.
INFO: faking /sys/class/dmi/id/product_name to be "kind"
INFO: faking /sys/class/dmi/id/product_uuid to be random
INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
INFO: setting iptables to detected mode: legacy
Inserted module 'autofs4'
systemd 242 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Detected virtualization docker.
Detected architecture x86-64.
Failed to create symlink /sys/fs/cgroup/cpu: File exists
Failed to create symlink /sys/fs/cgroup/cpuacct: File exists
Failed to create symlink /sys/fs/cgroup/net_cls: File exists
Failed to create symlink /sys/fs/cgroup/net_prio: File exists

Welcome to Ubuntu 19.10!

Set hostname to <minikube>.

Run: docker run -d -t --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run -v /lib/modules:/lib/modules:ro --hostname minikube --name minikube --label created_by.minikube.sigs.k8s.io=true --label name.minikube.sigs.k8s.io=minikube --label role.minikube.sigs.k8s.io= --label mode.minikube.sigs.k8s.io=minikube --volume minikube:/var --cpus=2 --memory=2200mb -e container=docker --expose 8443 --publish=127.0.0.1::8443 --publish=127.0.0.1::22 --publish=127.0.0.1::2376 --publish=127.0.0.1::5000 gcr.io/k8s-minikube/kicbase:v0.0.10@sha256:f58e0c4662bac8a9b5dda7984b185bad8502ade5d9fa364bf2755d636ab51438


Beyond the extra "docker" layer, we also have some cgroups v2 compat created:

/sys/fs/cgroup/unified/init.scope

Anyway, since kicbase uses systemd (through KIND) it seems it fails on cgroupfs. Previously only cgroupfs-on-systemd was tested, not this systemd-on-cgroupfs...

As the article above implies, mixing and matching different init systems is asking for trouble. And I don't think we will provide a minikube.iso or a kicbase image without systemd.

So these systems (Devuan) will need to use --vm.

Or with a dedicated VM for it, maybe --driver none.
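
In other words, on a non-systemd host the fallback would look roughly like this (flag names as of minikube v1.11; kvm2 is just one example of a VM driver):

# use a VM-based driver instead of the docker driver
minikube start --vm=true --driver=kvm2

# or run the components directly on a dedicated VM you manage yourself
minikube start --driver=none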

afbjorklund commented 4 years ago

If anyone wants to look into this further, the message is from containerd on /proc/self/mountinfo:

https://github.com/containerd/cgroups/blob/master/utils.go#L340

It doesn't seem too happy about the new "name=systemd" cgroup from /proc/self/cgroup?

Similar to https://github.com/moby/moby/issues/38822
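
To see what containerd is tripping over, you can compare the two files it parses; a rough diagnostic to run on the affected host after minikube start. The expectation (an assumption based on the error message, not verified here) is that a name=systemd entry shows up in /proc/self/cgroup with no matching mount in mountinfo:

# cgroup hierarchies the current process belongs to; look for a name=systemd entry
grep systemd /proc/self/cgroup

# mounted cgroup hierarchies; the corresponding systemd mount is presumably missing here
grep cgroup /proc/self/mountinfo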


This also means that this is the workaround, to get Docker back (without reboot):

sudo mkdir /sys/fs/cgroup/systemd
sudo mount -t cgroup -o none,name=systemd,xattr cgroup /sys/fs/cgroup/systemd

If this is acceptable, then this is the way to run minikube with Docker-in-Docker
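
After remounting, a quick way to confirm Docker is usable again (reusing the command that failed earlier in this thread):

# verify the name=systemd hierarchy is mounted, then try a plain container again
mount | grep name=systemd
docker run --rm -it alpine:3.12 /bin/sh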

paddy-hack commented 4 years ago

Guess I'll be using qemu and minikube --vm then for the time being.

afbjorklund commented 4 years ago

@priyawadhwa :

> just to clarify, minikube does use systemd but only within the running VM or container (you don't need systemd on your machine).

We should add a solution message when trying to use the docker driver without the systemd cgroup.

The user doesn't actually need to run systemd as their PID 1, or any daemons or units, though.
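
A hypothetical sketch of what such a preflight message could key off; this is not actual minikube code, just one possible heuristic based on the discussion above:

# is a name=systemd cgroup hierarchy mounted on the host? (systemd hosts have one, cgroupfs-only hosts usually don't)
if ! grep -q 'name=systemd' /proc/self/mountinfo; then
  echo "no systemd cgroup hierarchy mounted; the docker driver may leave the host dockerd broken"
  echo "possible workaround (from this thread): mount a name=systemd cgroup under /sys/fs/cgroup/systemd"
fi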

afbjorklund commented 4 years ago

> Guess I'll be using qemu and minikube --vm then for the time being.

That should still work; note that you need libvirt (and not just QEMU/KVM).
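
On a Debian-based system (including Devuan beowulf) that means roughly the following; the package names are the usual Debian buster ones and may differ on other releases:

# install QEMU/KVM plus libvirt, and let your user talk to libvirtd
sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients
sudo adduser "$USER" libvirt

# then start minikube with the libvirt-based driver
minikube start --driver=kvm2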

paddy-hack commented 4 years ago

Thanks for the heads up on libvirt :bow:

afbjorklund commented 4 years ago

> Thanks for the heads up on libvirt

At one point we considered renaming the driver from docker-machine-kvm to docker-machine-libvirt-driver, but at that point it was probably "too late" and the historical name won. Now forked as kvm2.

The qemu (with kvm) driver has some issues with creating the networks for kubernetes, so it works better in a simpler docker context. That's why we are using the (system) libvirt wrapper instead...

https://libvirt.org/drvqemu.html

The "qemu:///system" family of URIs connect to a libvirtd instance running as the privileged system account 'root'. Thus the QEMU instances spawned from this driver may have much higher privileges than the client application managing them. The intended use case for this driver is server virtualization, where the virtual machines may need to be connected to host resources (block, PCI, USB, network devices) whose access requires elevated privileges.

It should warn about it. (#5617)

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

medyagh commented 3 years ago

We need to figure out the best default for macOS users by finding out how Docker implements their VM: whether their VM is using cgroupfs or systemd.

And for GitHub Actions, minikube should autodetect that it is running on GitHub Actions (there is an environment variable).
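
For reference, GitHub Actions exposes this in the runner environment as GITHUB_ACTIONS=true (alongside CI=true), so a detection could look roughly like:

# inside a GitHub Actions job this prints "running on GitHub Actions"
if [ "$GITHUB_ACTIONS" = "true" ]; then
  echo "running on GitHub Actions"
fi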

medyagh commented 3 years ago

@afbjorklund suggests enabling Kubernetes in Docker Desktop and seeing what they are using.

medyagh commented 3 years ago

Maybe we can exec into the Docker machine created by Docker Desktop and see what cgroup manager it uses:

https://gist.github.com/BretFisher/5e1a0c7bcca4c735e716abf62afad389
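
Since the Docker Desktop CLI already talks to the daemon inside that VM, the cgroup driver can be read without a shell in the VM at all; if a closer look is needed, a privileged --pid=host container is the usual trick for getting into the VM. A rough sketch; the justincormack/nsenter1 image is a commonly used helper for this, not something specified in this thread:

# cgroup driver of the dockerd running inside the Docker Desktop VM
docker info --format '{{.CgroupDriver}}'

# or get a shell inside the VM itself for a closer look
docker run -it --rm --privileged --pid=host justincormack/nsenter1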

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

prezha commented 3 years ago

/remove-lifecycle stale

sharifelgamal commented 3 years ago

@govargo would you be interested in looking at this?

govargo commented 3 years ago

It may take some time because I'm not very familiar with this area, but I'll try to look into it starting tomorrow.

jiangxiaobin96 commented 1 year ago

Hi, I want to ask: when I use systemd.SdNotify to confirm whether we're running on a systemd system, should minikube start --force --driver=docker --force-systemd work and return without error? I tried it and found that it does not work, so what do I need to set for systemd.SdNotify to pass?