edevil opened this issue 7 years ago
@edevil Thanks for the report, I will look into this. Can you share with me what guide or configuration you followed to set up Kubernetes with the kubelet-wrapper? In particular, I am interested in what the RKT_OPTS env for the kubelet-wrapper was set to.
It appears that ebtables exists in the hyperkube image and dynamically links to /lib/ebtables/libebtc.so, among other libraries. It seems likely that the path to that library is getting clobbered by bind mounts to the host fs from the kubelet's container.
I followed CoreOS's guide. But I had to manually add 2 mount points for https://github.com/coreos/coreos-kubernetes/issues/322 and https://github.com/coreos/bugs/issues/1229.
So my final RKT_OPTS is:
Environment="RKT_OPTS=--volume var-log,kind=host,source=/var/log --mount volume=var-log,target=/var/log --volume=usr-sbin,kind=host,source=/usr/sbin --mount=volume=usr-sbin,target=/usr/sbin"
@edevil On CoreOS, ebtables exists at /usr/sbin/ebtables. This mount point seems like the culprit; I suspect ebtables on the host is not playing nicely with the ebtables dynamic libs found in the hyperkube rootfs.
When adding host mounts to the hyperkube container it is best to be as specific as possible about what you are mounting in. Mounting in dynamically linked binaries is fraught with peril and best avoided altogether. If you absolutely can't avoid it, you must make sure the binary has access to the libraries it needs.
What binaries in sbin do you need to mount into the hyperkube container? You might be able to get away with simply being more specific about what you mount under sbin.
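For example, a file-level host volume keeps the rest of the image's /usr/sbin (and its libraries) intact. A sketch, using brctl (the binary named later in this thread) and assuming rkt accepts a single file as a host volume source:

# Mount only the one binary instead of all of /usr/sbin
--volume brctl,kind=host,source=/usr/sbin/brctl \
--mount volume=brctl,target=/usr/sbin/brctl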
Additionally, you may want to consider using the generic scripts in coreos-kubernetes, which tend to be more up to date than those guides. We also have a tool for installing self-hosted clusters called bootkube and a tool for launching clusters on AWS, kube-aws.
The executable in question is brctl, and the issue in question is https://github.com/coreos/bugs/issues/1229. Kubelet needs it.
OK, it seems that either brctl is now supplied in the image or kubelet no longer needs it. I removed the mountpoint and the cbr0 bridge was still created, but now I get another error message regarding DEDUP:
Dec 16 11:17:43 node-0-vm kubelet-wrapper[2107]: E1216 11:17:43.067602 2107 kubenet_linux.go:803] Failed to flush dedup chain: Failed to flush filter chain KUBE-DEDUP: exit status 255, output: modprobe: ERROR: ../libkmod/libkmod.c:557 kmod_search_moddep() could not open moddep file '/lib/modules/4.7.3-coreos-r3/modules.dep.bin'
Dec 16 11:17:43 node-0-vm kubelet-wrapper[2107]: The kernel doesn't support the ebtables 'filter' table.
Dec 16 11:17:43 node-0-vm kubelet-wrapper[2107]: E1216 11:17:43.163882 2107 kubenet_linux.go:815] Failed to ensure filter chain KUBE-DEDUP
@edevil: You should be able to use this mountpoint to fix the modprobe error:
--volume lib-modules,kind=host,source=/lib/modules \
--mount volume=lib-modules,target=/lib/modules \
This is something we should consider adding to the kubelet-wrapper by default. I think this issue has gone unnoticed because we normally use CNI for the network plugin rather than kubenet. /lib/modules is a good candidate for being a permanent mountpoint since it's coupled to the underlying kernel to some degree.
@euank Any thoughts on adding /lib/modules to kubelet-wrapper?
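For reference, a hypothetical sketch of what that default could look like inside kubelet-wrapper; the variable names here are assumptions and may not match the real script:

# Hypothetical addition to kubelet-wrapper: always expose the host's kernel modules
# so modprobe inside the hyperkube container can find modules.dep for the running kernel.
RKT_RUN_ARGS="${RKT_RUN_ARGS} \
  --volume lib-modules,kind=host,source=/lib/modules,readOnly=true \
  --mount volume=lib-modules,target=/lib/modules"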
@pbx0 I can't think of a reason it would be a problem
@pbx0 Thanks. I feel like I'm getting closer, but I still get errors that may be related to missing mountpoints:
Dec 16 23:41:48 node-1-vm kubelet-wrapper[922]: 2016/12/16 23:41:48 Error retriving last reserved ip: Failed to retrieve last reserved ip: open /var/lib/cni/networks/kubenet/last_reserved_ip: no such file or directory
Dec 16 23:41:49 node-1-vm kubelet-wrapper[922]: E1216 23:41:49.385553 922 kubenet_linux.go:803] Failed to flush dedup chain: Failed to flush filter chain KUBE-DEDUP: exit status 255, output: Chain 'KUBE-DEDUP' doesn't exist.
Can I do something about these?
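One way to sanity-check the second error from the node is to list the ebtables filter table once the kubelet has been running for a while; if kubenet has programmed it, the KUBE-DEDUP chain will show up. A diagnostic sketch, run on the host:

# List all chains in the ebtables filter table; KUBE-DEDUP should appear once kubenet creates it
sudo ebtables -t filter -L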
@edevil: I'm not sure why you're seeing this failure. I'm not convinced it's an issue with what you are bind-mounting in, because I think /var/lib/cni/networks... might actually be created by the kubenet plugin.
Why have you chosen to run the kubenet plugin? Presumably you are running on GCE and don't want/need to use flannel?
I'm running on Azure and so don't need to use flannel.
Well, if those two errors are not CoreOS specific then I'll just take it up with the kubernetes guys. Thank you and feel free to close this ticket.
It seems to me kubenet is expecting /var/lib/cni to be persistent between reboots, since it stores some information there like the last reserved IP. I've added an additional mountpoint:
--volume var-cni,kind=host,source=/var/lib/cni --mount volume=var-cni,target=/var/lib/cni
The error message about the CNI data is gone. The one about the ebtables chain not being present seems harmless, since that is to be expected after a restart and the chain is created afterwards.
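Putting the thread together, the extra mounts that ended up in use here (with the /usr/sbin mount dropped, as noted earlier) look roughly like this; a sketch assembled from the flags above, not copied verbatim from a working config:

Environment="RKT_OPTS=--volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log \
  --volume lib-modules,kind=host,source=/lib/modules \
  --mount volume=lib-modules,target=/lib/modules \
  --volume var-cni,kind=host,source=/var/lib/cni \
  --mount volume=var-cni,target=/var/lib/cni"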
Nice work figuring that out. I'll keep in mind that kubenet needs that dir to persist. Is everything coming up now? If so, I'll close this out.
Yes, these "errors" seem to be either mitigated or not important. This can be closed.
I would, however, like to see /var/lib/cni and /lib/modules added to kubelet-wrapper.
cc @crawford ^^
Issue Report
Bug
CoreOS Version
Environment
Azure Standard_A1
Expected Behavior
No errors.
Actual Behavior
Reproduction Steps