coreos / bugs

Issue tracker for CoreOS Container Linux
https://coreos.com/os/eol/

kubelet-wrapper error message regarding missing shared library #1712

Open edevil opened 7 years ago

edevil commented 7 years ago

Issue Report

Bug

CoreOS Version

NAME=CoreOS
ID=coreos
VERSION=1185.5.0
VERSION_ID=1185.5.0
BUILD_ID=2016-12-07-0937
PRETTY_NAME="CoreOS 1185.5.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

Environment

Azure Standard_A1

Expected Behavior

No errors.

Actual Behavior

Dec 15 11:22:04 node-1-vm kubelet-wrapper[932]: E1215 11:22:04.355622     932 kubenet_linux.go:803] Failed to flush dedup chain: Failed to flush filter chain KUBE-DEDUP: exit status 127, output: ebtables: error while loading shared libraries: libebtc.so: cannot open shared object file: No such file or directory
Dec 15 11:22:04 node-1-vm kubelet-wrapper[932]: W1215 11:22:04.356897     932 kubenet_linux.go:808] Failed to get ebtables version. Skip syncing ebtables dedup rules: exit status 127

Reproduction Steps

  1. Set up Kubernetes 1.5 using kubelet-wrapper

peebs commented 7 years ago

@edevil Thanks for the report, I will look into this. Can you share what guide or configuration you followed to set up Kubernetes with the kubelet-wrapper? In particular, I am interested in what the RKT_OPTS env for the kubelet wrapper was set to.

It appears that ebtables exists in the hyperkube image and dynamically links to /lib/ebtables/libebtc.so, among other libraries. It seems likely that the path to that library is getting clobbered by bind mounts of the host filesystem into the kubelet's container.
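
A quick way to confirm which libraries the image's ebtables expects (the image name, tag, and rkt invocation here are illustrative, not taken from this thread; the Debian-based hyperkube images of that era shipped /bin/sh, which, ldd, etc.):

sudo rkt run --interactive quay.io/coreos/hyperkube:v1.5.1_coreos.0 --exec=/bin/sh
# then, inside the container:
ldd "$(which ebtables)"   # libebtc.so should resolve under /lib/ebtables/ in the image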

edevil commented 7 years ago

I followed CoreOS's guide. But I had to manually add 2 mount points for https://github.com/coreos/coreos-kubernetes/issues/322 and https://github.com/coreos/bugs/issues/1229.

So my final RKT_OPTS is:

Environment="RKT_OPTS=--volume var-log,kind=host,source=/var/log --mount volume=var-log,target=/var/log --volume=usr-sbin,kind=host,source=/usr/sbin --mount=volume=usr-sbin,target=/usr/sbin"
peebs commented 7 years ago

@edevil On CoreOS, ebtables exists at /usr/sbin/ebtables. This mount point seems like the culprit: I suspect ebtables on the host is not playing nicely with the ebtables dynamic libs found in the hyperkube rootfs.

When adding host mounts to the hyperkube container, it is best to be as specific as possible about what you are mounting in. Mounting in dynamically linked binaries is fraught with peril and best avoided altogether. If you absolutely can't avoid it, you must make sure the binary has access to the libraries it needs.

What binaries in sbin do you need to mount into the hyperkube container? You might be able to get away with simply being more specific about what you mount under sbin.
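
If it turns out only one or two binaries are needed, a more surgical mount is possible. A sketch, with a placeholder volume name and binary, assuming a rkt version that supports file-backed host volumes:

# Mount a single host binary instead of all of /usr/sbin
--volume one-bin,kind=host,source=/usr/sbin/some-binary \
--mount volume=one-bin,target=/usr/sbin/some-binary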

Additionally, you may want to consider using the generic scripts in coreos-kubernetes, which tend to be more up to date than those guides. We also have a tool for installing self-hosted clusters called bootkube and a tool for launching clusters on AWS, kube-aws.

edevil commented 7 years ago

The executable in question is brctl, and the issue in question is https://github.com/coreos/bugs/issues/1229. Kubelet needs it.

edevil commented 7 years ago

OK, it seems either brctl is now supplied in the image or the kubelet doesn't need it anymore. I removed the mount point; the cbr0 bridge was still created, but I now get another error message regarding the KUBE-DEDUP chain:

Dec 16 11:17:43 node-0-vm kubelet-wrapper[2107]: E1216 11:17:43.067602    2107 kubenet_linux.go:803] Failed to flush dedup chain: Failed to flush filter chain KUBE-DEDUP: exit status 255, output: modprobe: ERROR: ../libkmod/libkmod.c:557 kmod_search_moddep() could not open moddep file '/lib/modules/4.7.3-coreos-r3/modules.dep.bin'
Dec 16 11:17:43 node-0-vm kubelet-wrapper[2107]: The kernel doesn't support the ebtables 'filter' table.
Dec 16 11:17:43 node-0-vm kubelet-wrapper[2107]: E1216 11:17:43.163882    2107 kubenet_linux.go:815] Failed to ensure filter chain KUBE-DEDUP
peebs commented 7 years ago

@edevil: You should be able to use this mountpoint to fix the modprobe error:

--volume lib-modules,kind=host,source=/lib/modules \
--mount volume=lib-modules,target=/lib/modules \
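
As a host-side sanity check (standard tooling only), the dep files modprobe is looking for should exist for the running kernel:

ls /lib/modules/$(uname -r)/modules.dep*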

This is something we should consider adding to the kubelet-wrapper by default. I think this issue has gone unnoticed because we normally use CNI for the network plugin rather than kubenet. /lib/modules is a good candidate for being a permanent mount point since it's coupled to the underlying kernel to some degree.

@euank Any thoughts on adding /lib/modules to kubelet-wrapper?

euank commented 7 years ago

@pbx0 I can't think of a reason it would be a problem

edevil commented 7 years ago

@pbx0 Thanks. I feel like I'm getting closer, but I still get errors that may be related to missing mountpoints:

Dec 16 23:41:48 node-1-vm kubelet-wrapper[922]: 2016/12/16 23:41:48 Error retriving last reserved ip: Failed to retrieve last reserved ip: open /var/lib/cni/networks/kubenet/last_reserved_ip: no such file or directory
Dec 16 23:41:49 node-1-vm kubelet-wrapper[922]: E1216 23:41:49.385553     922 kubenet_linux.go:803] Failed to flush dedup chain: Failed to flush filter chain KUBE-DEDUP: exit status 255, output: Chain 'KUBE-DEDUP' doesn't exist.

Can I do something about these?

peebs commented 7 years ago

@edevil: I'm not sure why you're seeing this failure. I'm not convinced it's an issue with what you are bind-mounting in, because /var/lib/cni/networks... is, I think, actually created by the kubenet plugin itself.

Why have you chosen to run the kubenet plugin? Presumably you are running on GCE and don't want/need to use flannel?

edevil commented 7 years ago

I'm running on Azure and so don't need to use flannel.

Well, if those two errors are not CoreOS specific then I'll just take it up with the kubernetes guys. Thank you and feel free to close this ticket.

edevil commented 7 years ago

It seems to me kubenet expects /var/lib/cni to persist between reboots, since it stores some information there, like the last reserved IP. I've added an additional mount point:

--volume var-cni,kind=host,source=/var/lib/cni \
--mount volume=var-cni,target=/var/lib/cni

The error message about the CNI data is gone. The one about the ebtables chain not being present seems harmless, since that is to be expected after a restart and the chain is created afterwards.
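
For reference, folding the changes from this thread together, the RKT_OPTS line ends up roughly as follows, assuming the /usr/sbin mount is dropped as discussed above:

Environment="RKT_OPTS=--volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  --volume lib-modules,kind=host,source=/lib/modules \
  --mount volume=lib-modules,target=/lib/modules \
  --volume var-cni,kind=host,source=/var/lib/cni \
  --mount volume=var-cni,target=/var/lib/cni"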

peebs commented 7 years ago

Nice work figuring that out. I'll keep in mind that kubenet needs that dir to persist. Is everything coming up now? If so, I'll close this out.

edevil commented 7 years ago

Yes, these "errors" seem to be either mitigated or not important. This can be closed.

I would, however, like to see /var/lib/cni and /lib/modules added to kubelet-wrapper.

peebs commented 7 years ago

cc @crawford ^^