coreos / coreos-kubernetes

CoreOS Container Linux+Kubernetes documentation & Vagrant installers
https://coreos.com/kubernetes/docs/latest/

Kubelet dependencies on host CLI tools #287

Open · robszumski opened this issue 8 years ago

robszumski commented 8 years ago

The kubelet calls out to various CLI tools that aren't shipped in CoreOS, such as rbd.

Open questions:

From discussion on IRC, the conformance tests don't really touch this topic. Do we need to do some work upstream on that as well?

cc: @pbx0 @mikedanese

mikedanese commented 8 years ago

rbd and iscsiadm seem like the big ones. cc @saad-ali

peebs commented 8 years ago

It seems like we'll need to ship these with our kubelet containers, since we don't want OS auto-updates to upgrade these packages underneath the kubelet. How does Kubernetes currently test these packages? Also, for a given Kubernetes release, how would one find out which versions of these CLI tools were tested?

We try to keep our hyperkube builds as close to upstream as possible. Right now we just add a /kubelet symlink pointing at /hyperkube, so invoking /kubelet behaves like running hyperkube kubelet and the image works like a plain kubelet container.
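For illustration, that symlink step amounts to something like the following during the image build (a minimal sketch, not our actual build script; it relies on hyperkube choosing its subcommand from the name it was invoked as):

# make /kubelet an alias for /hyperkube; hyperkube dispatches
# on argv[0], so running /kubelet acts like "hyperkube kubelet"
ln -sf /hyperkube /kubelet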

Mike, you mentioned it would be hacky to ship the kubelet in a container but not use --containerized. We are currently doing exactly that. What might the consequences be?

Also, in IRC it was mentioned that https://github.com/kubernetes/kubernetes/pull/10176 is related.

saad-ali commented 8 years ago

I'm not sure about non-volume plugin CLI tools, but regarding CLI tool dependencies for volume plugins:

I don't think it makes sense for the underlying OS to ship with all possible binaries required for all volume plugins that k8s supports. At the moment, any volume-plugin-specific dependencies are the responsibility of the cluster admin to install on the nodes. If a cluster admin wants to enable rbd volume plugin support, for example, the instructions are to first install Ceph on the nodes (see the sketch below). We'd like to improve the UX around this, but I don't think the right answer is for the underlying OS to ship with every possible volume plugin CLI tool.
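To make that concrete, on a Debian/Ubuntu node the admin-side install looks roughly like this (the package names are the usual Debian ones and are an assumption here; Container Linux has no package manager, which is exactly the gap this issue is about):

# ceph-common ships the rbd CLI; open-iscsi ships iscsiadm
sudo apt-get update
sudo apt-get install -y ceph-common open-iscsi
# sanity-check that the tools are on the kubelet's PATH
for t in rbd iscsiadm; do command -v "$t" || echo "missing: $t"; done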

aaronlevy commented 8 years ago

FWIW, when we talk about running the kubelet in a container, we might be doing this slightly differently than others, namely by using rkt-fly: more or less running the kubelet in an unconstrained chroot, but shipped via an ACI / Docker image. See: https://github.com/coreos/coreos-overlay/blob/master/app-admin/kubelet-wrapper/files/kubelet-wrapper (we don't use --containerized, nor do we mount / -> /rootfs).

So we should be able to include any CLI tools in the hyperkube image if that seems reasonable. But it sounds like maybe (in the case of volume plugins) it should still be left up to cluster admins? If that's the case, we may want to demonstrate how this could be accomplished on CoreOS.
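For example, baking the tools into a derived image could look roughly like this (an untested sketch; the base tag and package names are assumptions, and it presumes the hyperkube image of that era is Debian-based with apt available):

cat > Dockerfile <<'EOF'
FROM gcr.io/google_containers/hyperkube:v1.2.3
# add the CLI tools the rbd and iscsi volume plugins shell out to
RUN apt-get update && \
    apt-get install -y ceph-common open-iscsi && \
    rm -rf /var/lib/apt/lists/*
EOF
docker build -t example.com/hyperkube-storage:v1.2.3 .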

mikedanese commented 8 years ago

I don't really have any context on rkt, but the kubelet swaps out its mounter and writer if it's told it's running in a container. I haven't done a full audit of their usage, but the main issue is mounting volumes: mounts need to propagate from the kubelet's mount namespace to the host, and then back to other containers. If the secrets e2e test is working, then you should be fine. Does the kubelet run in the host pid and net namespaces? Does the rkt API support mount propagation/shared subtrees?

https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/server.go#L134-L138
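For anyone checking the propagation side on a host, the state of the root mount can be inspected and flipped with plain util-linux commands (nothing kubelet-specific, just a quick sketch):

# show the propagation flags on the root mount
findmnt -o TARGET,PROPAGATION /
# make / a shared subtree so mounts created in a container's
# mount namespace can propagate back to the host and onward
sudo mount --make-rshared /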

mikedanese commented 8 years ago

Or is it running in the host mnt namespace?

mischief commented 8 years ago

cc @alban, any idea about mount propagation?

mikedanese commented 8 years ago

Also, the kubelet needs the host's view of /dev (and not a snapshot like docker run --device=[]), as block devices will be hotplugged.

alban commented 8 years ago

The kubelet started by rkt-fly (kubelet-wrapper) runs in the host namespaces (host mnt, host pid, host net, etc.) but in a chroot. If kubelet-wrapper were using a --volume for / -> /rootfs, rkt-fly would set up the /rootfs volume as shared-recursive (but still in the host mnt namespace). So rbd, iscsiadm, and others would be able to mount things under /rootfs, and the mounts would be propagated to the real / outside the chroot.
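Concretely, that would mean adding a pair of flags like these to the wrapper's rkt invocation (an untested sketch; the recursive volume option follows rkt's volume flag syntax of that era):

# expose the host's / at /rootfs with recursive (shared) bind semantics
--volume rootfs,kind=host,source=/,recursive=true \
--mount volume=rootfs,target=/rootfs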

The kubelet started by rkt-fly also has the host's view of /dev because, despite being in a chroot, the host's /dev is bind mounted at the chroot's /dev (see main.go#L256).

/cc @steveeJ

thereallukl commented 8 years ago

There's also rbdnamer for Ceph storage (https://github.com/ceph/ceph-docker/tree/master/examples/coreos/rbdmap).

thereallukl commented 8 years ago

Is there any additional dependency for running rkt (with different stage1 images) instead of Docker as the container runtime for the kubelet?

robszumski commented 8 years ago

@lleszczu Here is the getting started guide, but keep in mind there is still some work to do before reaching feature parity.

thereallukl commented 8 years ago

@robszumski I get the general idea, but since there is an open discussion about putting the kubelet inside a rkt container (https://github.com/coreos/bugs/issues/1051), there might be some additional dependencies to consider.

mikedanese commented 8 years ago

@alban sounds like it should work.

Just noticed there are similar requirements for network plugins. If you want to support the various CNI plugins, it would be good to plop https://storage.googleapis.com/kubernetes-release/network-plugins/cni-09214926.tar.gz into /opt/bin/cni inside the kubelet image.
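For example, fetching that tarball into place (whether baked into the image or dropped onto the host) would look roughly like this; that the binaries sit at the top level of the tarball is an assumption:

mkdir -p /opt/bin/cni
curl -sSL https://storage.googleapis.com/kubernetes-release/network-plugins/cni-09214926.tar.gz \
  | tar -xz -C /opt/bin/cni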

robszumski commented 8 years ago

Kubernetes Slack user @alvin ran into this while trying to mount a Gluster volume.

jimmycuadra commented 8 years ago

/proc seems like another path required by the kubelet that is not currently accounted for in kubelet-wrapper. Here's an excerpt of the kubelet's logs when I tried to run it with rkt fly:

[  382.749400] hyperkube[4]: I0510 01:19:49.767214       4 server.go:683] Watching apiserver
[  382.761948] hyperkube[4]: W0510 01:19:49.779739       4 plugins.go:156] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory
...
[  383.034925] hyperkube[4]: E0510 01:19:50.029177       4 kubelet.go:1016] Failed to start ContainerManager [open /proc/sys/vm/overcommit_memory: read-only file system, open /proc/sys/kernel/panic: read-only file system]
[  383.035156] hyperkube[4]: I0510 01:19:50.029190       4 manager.go:123] Starting to sync pod status with apiserver
[  383.035328] hyperkube[4]: I0510 01:19:50.029206       4 kubelet.go:2356] Starting kubelet main sync loop.
[  383.035485] hyperkube[4]: I0510 01:19:50.029216       4 kubelet.go:2365] skipping pod synchronization - [Failed to start ContainerManager [open /proc/sys/vm/overcommit_memory: read-only file system, open /proc/sys/kernel/panic: read-only file system] container runtime is down]

Here's the exact command I'm running:

rkt --insecure-options image run \
  --volume etc-kubernetes,kind=host,source=/etc/kubernetes \
  --volume etc-ssl-certs,kind=host,source=/usr/share/ca-certificates \
  --volume var-lib-docker,kind=host,source=/var/lib/docker \
  --volume var-lib-kubelet,kind=host,source=/var/lib/kubelet \
  --volume os-release,kind=host,source=/usr/lib/os-release \
  --volume run,kind=host,source=/run \
  --mount volume=etc-kubernetes,target=/etc/kubernetes \
  --mount volume=etc-ssl-certs,target=/etc/ssl/certs \
  --mount volume=var-lib-docker,target=/var/lib/docker \
  --mount volume=var-lib-kubelet,target=/var/lib/kubelet \
  --mount volume=os-release,target=/etc/os-release \
  --mount volume=run,target=/run \
  docker://gcr.io/google_containers/hyperkube:v1.2.3 \
  --exec /hyperkube -- kubelet \
  --allow-privileged=true \
  --api-servers=http://127.0.0.1:8080 \
  --cadvisor-port=0 \
  --cluster-dns=10.3.0.10 \
  --cluster-domain=cluster.local \
  --config=/etc/kubernetes/manifests \
  --hostname-override=10.0.1.122 \
  --logtostderr=true \
  --register-schedulable=false \
  --v=2

Note that I'm running this directly, not via kubelet-wrapper, because I want to use the official gcr.io hyperkube image rather than the CoreOS-specific one on quay.io. All the rkt options in the above command are taken from kubelet-wrapper, though.

robszumski commented 8 years ago

Related upstream issue about using the kernel-level RBD features instead of shelling out to the CLI tools: https://github.com/kubernetes/kubernetes/issues/23518

untoreh commented 8 years ago

The image still needs modprobe, though? https://github.com/kubernetes/kubernetes/issues/23924

edevil commented 7 years ago

I'm using the coreos/hyperkube:v1.8.0_coreos.0 image and have run into the same problem while running the proxy:

time="2017-10-02T14:49:56Z" level=warning msg="Running modprobe ip_vs failed with message: ``, error: exec: \"modprobe\": executable file not found in $PATH"
time="2017-10-02T14:49:56Z" level=error msg="Could not get ipvs family information from the kernel. It is possible that ipvs is not enabled in your kernel. Native loadbalancing will not work until this is fixed."

Are you guys planning to add "modprobe" to the image, or should I just bind mount the host's /sbin/modprobe (roughly as sketched below)?
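For reference, the bind mount I have in mind would go through kubelet-wrapper's RKT_OPTS hook, roughly like this (an untested sketch; the wrapper path and host paths are what I'd expect on Container Linux):

# e.g. in the kubelet.service environment, before invoking the wrapper
export RKT_OPTS="--volume modprobe,kind=host,source=/usr/sbin/modprobe \
  --mount volume=modprobe,target=/usr/sbin/modprobe \
  --volume lib-modules,kind=host,source=/lib/modules \
  --mount volume=lib-modules,target=/lib/modules"
exec /usr/lib/coreos/kubelet-wrapper --v=2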

zxpower commented 6 years ago

I have the same issue: modprobe isn't found in the hyperkube container. I tried to bind mount it by adding it to kubelet.service as a mount, but it didn't help.

edevil commented 6 years ago

I just ran into another problem related to not having "modprobe" in the image: https://github.com/kubernetes/kubernetes/issues/53396.

I've also tried using kubelet-wrapper with the official hyperkube images from gcr.io, but I still have that problem. Somewhere between v1.8.0-alpha.2 and v1.8.0-alpha.3, the modprobe binary disappeared...