This is probably not a Heapster issue. Are other pods working properly on your cluster? Heapster is using the standard DNS code from the Kubernetes libraries here. Are you actually injecting cluster DNS?
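A quick way to check that cluster DNS is actually injected and healthy (a sketch; the kube-dns label and the busybox image are typical kubeadm defaults, not verified here):
# Is kube-dns up, and what is its Service IP?
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
kubectl -n kube-system get svc kube-dns
# Does a throwaway pod get that nameserver injected, and can it resolve cluster names?
kubectl run dns-check --rm -ti --restart=Never --image=busybox:1.27 -- nslookup kubernetes.default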
@DirectXMan12 the other pods are working properly on our cluster, we are running many different ones.
I have just tried heapster v1.4.0 with K8s v1.7.5 and it has the same issue.
I am nearly out of ideas, will have to dig deeper probably.
Meanwhile, if you have any ideas about what I could try, please let me know.
Update 1
This is how we run Heapster: kubectl apply the following manifests (see the commands sketched after the list):
1. https://github.com/kubernetes/heapster/blob/1308dd71f0ba343895456b46d1bbf3238800b6f3/deploy/kube-config/rbac/heapster-rbac.yaml
2. https://github.com/kubernetes/heapster/blob/480a1ca21fe65882dd41f28cd9296a825fbbb602/deploy/kube-config/influxdb/heapster.yaml
3. https://github.com/kubernetes/heapster/blob/480a1ca21fe65882dd41f28cd9296a825fbbb602/deploy/kube-config/influxdb/influxdb.yaml
4. https://github.com/kubernetes/heapster/blob/69a4067c236032770c9145bba6634795454b88b1/deploy/kube-config/influxdb/grafana.yaml
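For reference, the corresponding apply commands (using the raw.githubusercontent.com form of each URL above; that form is an assumption about how the blob links are fetched):
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/1308dd71f0ba343895456b46d1bbf3238800b6f3/deploy/kube-config/rbac/heapster-rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/480a1ca21fe65882dd41f28cd9296a825fbbb602/deploy/kube-config/influxdb/heapster.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/480a1ca21fe65882dd41f28cd9296a825fbbb602/deploy/kube-config/influxdb/influxdb.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/69a4067c236032770c9145bba6634795454b88b1/deploy/kube-config/influxdb/grafana.yaml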
Update 2
I have just double-checked this on K8s 1.7.5, CentOS 7.3, docker 1.13.1 for all the available heapster docker images; the results are as follows:
v1.3.0-alpha.1, v1.3.0-beta.1 are working:
# kubectl -n kube-system logs --tail=1 -f heapster-1029728424-jnd84
I0915 13:44:59.804446 5 heapster.go:105] Starting heapster on port 8082
I0915 13:45:05.117265 5 influxdb.go:215] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc:8086"
v1.3.0, v1.4.0-beta.0, v1.4.0, v1.4.1, v1.4.2, v1.5.0-beta.0 are not.
md5sum binary_path version commit
26a76e1037c987766c480ae07a713a07 /mnt/docker/overlay/e6068f9abe9d3f6935d7555be29c75d6417f44131254bcd877c8d126ad747e26/root/heapster v1.3.0-alpha.1 da3aab91
ca883b67c48fcedb7d69165e78edf9f0 /mnt/docker/overlay/3f7ce0183025b48aa926caee676eebb5de6993e99c1064d9cf70139875e66d43/root/heapster v1.3.0-beta.1 4bf87de3
--- versions below are not working for me ---
01e20c88c736070d57392ddcf0efc1ae /mnt/docker/overlay/044ba8b80c29fea0c4c8afb88a272c1fa735dd0a1c41ed51fa53a36ae799053e/root/heapster v1.3.0 abbfe6e0
3886e3f3b4f2b843e08067fd92a2bca1 /mnt/docker/overlay/a1b912c5ea9b6e63579e187f169fb336002bd9bcf9c0ffbc8b5e8fbf0a975afe/root/heapster v1.4.0-beta.0 a729275e
038f85fd510a45dab4bbe4cfe547b0b8 /mnt/docker/overlay/92b867dd8f82beb48bbefda7e20557ffc4d5f85c7a2b5c99e4cf13b4e7c75865/root/heapster v1.4.0 0f6d09f4
22a3cff26dfcfef5674c9d51336f9248 /mnt/docker/overlay/d52c27f15b0e053b8fbe9f069f66194fd1858be5d37dee6468e29951f1fbda4b/root/heapster v1.4.1 d7df1c0f
53b67e2f14099c20520bfae80a087977 /mnt/docker/overlay/04bf13900b2a372be91f3e5af6069f18c8e0ab15ac88d766e8e01bac770977f1/root/heapster v1.4.2 ae12a18a
191700a644713f0e33d1fc97f961a79c /mnt/docker/overlay/37ea86525baca321fc1c118a3bf34d07be142b91ad5021e671ee3463fc262153/root/heapster v1.5.0-beta.0 859ccd72
So I guess the issue or its trigger lies somewhere in git log 4bf87de3...abbfe6e0 --stat -p.
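For reference, a rough way to inspect that commit range locally (the commit hashes are taken from the table above; nothing else is assumed):
git clone https://github.com/kubernetes/heapster.git
cd heapster
git log --oneline 4bf87de3...abbfe6e0
git diff --stat 4bf87de3 abbfe6e0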
Update 3
Would it make sense to update vendor/k8s.io/apimachinery? On July 28, 2017, IPv6 support for ChooseHostInterface was added (part 3 of 3) in https://github.com/kubernetes/apimachinery/commit/e97046a677c003086fbaac6222ae330b6cee12e8
@andrey01 what does the /etc/resolv.conf contain?
@andyxning regarding /etc/resolv.conf, please refer to the outputs I have just prepared below.
I slightly modified https://github.com/kubernetes/heapster/blob/480a1ca21fe65882dd41f28cd9296a825fbbb602/deploy/kube-config/influxdb/heapster.yaml to include a centos container running in the same pod as heapster, which is useful for additional debugging since that container shares the same namespaces.
spec:
  serviceAccountName: heapster
  containers:
  - name: heapster
    image: gcr.io/google_containers/heapster-amd64:v1.3.0
    imagePullPolicy: IfNotPresent
    command:
    - /heapster
    - --source=kubernetes:https://kubernetes.default
    - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
  - name: centos
    image: centos:7
    imagePullPolicy: IfNotPresent
    command:
    - sleep
    - infinity
Just to make sure:
[root@myserver1:~] docker ps -a |grep -E "centos|heapster"
488d43ce3e83 centos@sha256:eba772bac22c86d7d6e72421b4700c3f894ab6e35475a34014ff8de74c10872e "sleep infinity" 8 minutes ago Up 8 minutes k8s_centos_heapster-107544395-wzx51_kube-system_976900bf-9a21-11e7-bc9b-fa163e533806_0
4047d471bcc6 gcr.io/google_containers/heapster-amd64@sha256:3dff9b2425a196aa51df0cebde0f8b427388425ba84568721acf416fa003cd5c "/heapster --sourc..." 8 minutes ago Up 8 minutes k8s_heapster_heapster-107544395-wzx51_kube-system_976900bf-9a21-11e7-bc9b-fa163e533806_0
[root@myserver1:~] docker inspect k8s_heapster_heapster-107544395-wzx51_kube-system_976900bf-9a21-11e7-bc9b-fa163e533806_0 --format '{{.State.Pid}} - {{.HostConfig.PidMode}}'
32341 - container:011aab52b74600416ac3cfba43e33c372b8c00b1e42a6f1ae892603cd5ce8f5c
[root@myserver1:~] docker inspect k8s_centos_heapster-107544395-wzx51_kube-system_976900bf-9a21-11e7-bc9b-fa163e533806_0 --format '{{.State.Pid}} - {{.HostConfig.PidMode}}'
32407 - container:011aab52b74600416ac3cfba43e33c372b8c00b1e42a6f1ae892603cd5ce8f5c
[root@myserver1:~] ps -ef|grep -Ew "32341|32407"
nfsnobo+ 32341 32324 0 16:24 ? 00:00:03 /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
root 32407 32383 0 16:24 ? 00:00:00 sleep infinity
[root@myserver1:~] kubectl get pods --all-namespaces -o wide |grep -i heap
kube-system heapster-107544395-wzx51 2/2 Running 0 1m 10.32.0.3 myserver1
[root@myserver1:~] kubectl -n kube-system exec -ti -c centos heapster-107544395-wzx51 bash
[root@heapster-107544395-wzx51 /]# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 14:24 ? 00:00:00 /pause
65534 5 0 0 14:24 ? 00:00:00 /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
root 13 0 0 14:24 ? 00:00:00 sleep infinity
root 20 0 0 14:26 ? 00:00:00 bash
root 24 0 0 14:26 ? 00:00:00 bash
root 38 24 0 14:26 ? 00:00:00 ps -ef
[root@heapster-107544395-wzx51 /]# TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
[root@heapster-107544395-wzx51 /]# curl -sS -k -H "Authorization: Bearer ${TOKEN}" https://kubernetes.default/api/v1/nodes?resourceVersion=0 |head
{
"kind": "NodeList",
"apiVersion": "v1",
"metadata": {
"selfLink": "/api/v1/nodes",
...
As expected
[root@heapster-107544395-wzx51 /]# ping6 kubernetes.default
ping: kubernetes.default: Name or service not known
[root@heapster-107544395-wzx51 /]# yum install -y net-tools bind-utils
[root@heapster-107544395-wzx51 /]# nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Non-authoritative answer:
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
[root@heapster-107544395-wzx51 /]# cat /etc/resolv.conf
nameserver 10.96.0.10
search kube-system.svc.cluster.local svc.cluster.local cluster.local novalocal int.na.intgdc.com na.intgdc.com
options ndots:5
Apparently heapster is constantly trying to use IPv6, despite the fact that it is not working. I think heapster should then fall back to IPv4, unless something is broken inside the heapster container itself.
@andyxning is there a way to see which /etc/resolv.conf the heapster container itself uses? I've noticed that the image is pretty slim and does not have sh / bash / cat tools inside. Edit: okay, I found it, see below in this comment. :-)
[root@heapster-107544395-wzx51 /]# yum install -y iproute tcpdump
[root@heapster-107544395-wzx51 /]# tcpdump -qenn -i any
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
14:45:35.489651 In 00:00:00:00:00:00 127.0.0.1.51835 > 127.0.0.1.53: UDP, length 36
14:45:35.489672 In 00:00:00:00:00:00 127.0.0.1 > 127.0.0.1: ICMP 127.0.0.1 udp port 53 unreachable, length 72
14:45:35.489788 In 00:00:00:00:00:00 ::1.49641 > ::1.53: UDP, length 36
14:45:35.489800 In 00:00:00:00:00:00 ::1 > ::1: ICMP6, destination unreachable, unreachable port, ::1 udp port 53, length 92
14:45:35.489902 In 00:00:00:00:00:00 127.0.0.1.43764 > 127.0.0.1.53: UDP, length 36
14:45:35.489908 In 00:00:00:00:00:00 127.0.0.1 > 127.0.0.1: ICMP 127.0.0.1 udp port 53 unreachable, length 72
14:45:35.489944 In 00:00:00:00:00:00 127.0.0.1.38169 > 127.0.0.1.53: UDP, length 36
14:45:35.877095 In 00:00:00:00:00:00 127.0.0.1.46479 > 127.0.0.1.53: UDP, length 36
14:45:35.877095 In 00:00:00:00:00:00 127.0.0.1.38827 > 127.0.0.1.53: UDP, length 36
14:45:35.877110 In 00:00:00:00:00:00 127.0.0.1 > 127.0.0.1: ICMP 127.0.0.1 udp port 53 unreachable, length 72
14:45:35.877110 In 00:00:00:00:00:00 127.0.0.1 > 127.0.0.1: ICMP 127.0.0.1 udp port 53 unreachable, length 72
Update 1
Apparently IPv6 isn't fully disabled, from the container's PoV:
[root@heapster-107544395-wzx51 /]# ping6 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.034 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.033 ms
^C
--- ::1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.033/0.033/0.034/0.005 ms
[root@heapster-107544395-wzx51 /]# ping6 localhost
PING localhost(localhost (::1)) 56 data bytes
64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.459 ms
^C
--- localhost ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.032/0.245/0.459/0.214 ms
Whilst on the host it is not, because the host's /etc/hosts does not have the following line set: ::1 localhost ip6-localhost ip6-loopback
Edit: but adding that line, restarting dockerd and recreating the heapster container did not help.
[root@myserver1:~] ping6 localhost
ping: localhost: Name or service not known
This is due to net.ipv6.conf.all.disable_ipv6=1:
[root@myserver1:~] ping6 ::1
PING ::1(::1) 56 data bytes
ping: sendmsg: Network is unreachable
ping: sendmsg: Network is unreachable
^C
Update 2
Okay, here is the /etc/resolv.conf from the heapster container:
[root@myserver1:~] docker inspect k8s_centos_heapster-107544395-wzx51_kube-system_976900bf-9a21-11e7-bc9b-fa163e533806_0 --format '{{.ResolvConfPath}}'
/mnt/docker/containers/011aab52b74600416ac3cfba43e33c372b8c00b1e42a6f1ae892603cd5ce8f5c/resolv.conf
[root@myserver1:~] cat /mnt/docker/containers/011aab52b74600416ac3cfba43e33c372b8c00b1e42a6f1ae892603cd5ce8f5c/resolv.conf
nameserver 10.96.0.10
search kube-system.svc.cluster.local svc.cluster.local cluster.local novalocal int.na.ourdomainXYZ.com na.ourdomainXYZ.com
options ndots:5
And the /etc/hosts
[root@myserver1:~] cat /mnt/docker/containers/011aab52b74600416ac3cfba43e33c372b8c00b1e42a6f1ae892603cd5ce8f5c/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Just to double-check that we are looking at the right path:
[root@myserver1:~] cat /mnt/docker/containers/011aab52b74600416ac3cfba43e33c372b8c00b1e42a6f1ae892603cd5ce8f5c/hostname
heapster-107544395-wzx51
Update 3
tcpdump'ing on the host
The failed IPv6 attempts (::1 > ::1: ICMP6, destination unreachable, unreachable port, ::1 udp port 53) seen inside the heapster pod (from the CentOS container) never reach the host. So they are stopped within the heapster pod without going anywhere outside.
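A host-side capture along the following lines is enough to confirm that; the filter is my choice here, not necessarily the exact command that was used:
# On the node running the heapster pod: watch for DNS traffic and ICMPv6 errors on all interfaces
tcpdump -qenn -i any 'port 53 or icmp6'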
Update 4
I tried stracing heapster. According to what it shows, it appears to query a local DNS server over 53/UDP, first via IPv4 (AF_INET) 127.0.0.1, then via IPv6 (AF_INET6) ::1, every second: https://gist.github.com/andrey01/936a5577ecf9aaa8af4a66570cc8f5cc
All attempts fail.
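Roughly how the running process can be traced from the host (the pidof lookup is an assumption about how the PID was obtained; the gist above has the actual output):
# Trace only network-related syscalls of the heapster process, with timestamps
strace -f -tt -e trace=network -p "$(pidof heapster)"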
tcpdump with more detail (-AA) shows that it is asking for kubernetes.default:
[root@heapster-107544395-wzx51 /]# tcpdump -qenn -i any -AA
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
15:23:25.505865 In 00:00:00:00:00:00 127.0.0.1.51113 > 127.0.0.1.53: UDP, length 36
................E..@".@.@..............5.,.?M...........
kubernetes.default.....................
15:23:25.505877 In 00:00:00:00:00:00 127.0.0.1 > 127.0.0.1: ICMP 127.0.0.1 udp port 53 unreachable, length 72
................E..\>...@.=P...........\....E..@".@.@..............5.,.?M...........
kubernetes.default.....................
15:23:25.505964 In 00:00:00:00:00:00 ::1.59300 > ::1.53: UDP, length 36
................`....,.@...................................5.,.?............
kubernetes.default.....................
15:23:25.505975 In 00:00:00:00:00:00 ::1 > ::1: ICMP6, destination unreachable, unreachable port, ::1 udp port 53, length 92
................`....\:@...................................Y....`....,.@...................................5.,.?............
kubernetes.default.....................
15:23:25.506049 In 00:00:00:00:00:00 127.0.0.1.49516 > 127.0.0.1.53: UDP, length 36
................E..@".@.@............l.5.,.?m...........
kubernetes.default.....................
We are running dockerd with --iptables=false. I have also tried setting it to true and restarting dockerd, but that did not make any difference. Neither did adding ::1 localhost ip6-localhost ip6-loopback to /etc/hosts on the main host.
@andrey01 IIUC, when querying DNS, the client should not query 127.0.0.1:53; it should consult the configured nameservers directly. Seems something is wrong.
@andrey01 What is the result of nslookup monitoring-influxdb.kube-system.svc inside the heapster pod?
What is the result of nslookup monitoring-influxdb.kube-system.svc.cluster.local. inside the heapster pod, with the fully qualified domain name (note the trailing dot after local)?
It seems that it cannot do a DNS query for monitoring-influxdb.kube-system.svc.
@andrey01
is there a way to see which /etc/resolv.conf the heapster container itself uses?
I am afraid that you can only docker inspect the heapster pod's non-infra container and search for the ResolvConfPath mapping to get the content of resolv.conf.
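For example (the container name placeholder is mine; use whatever docker ps shows for the heapster container):
docker inspect --format '{{.ResolvConfPath}}' <heapster_container>
cat "$(docker inspect --format '{{.ResolvConfPath}}' <heapster_container>)"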
@andyxning here you go:
[root@heapster-487402730-frx1w /]# nslookup monitoring-influxdb.kube-system.svc
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: monitoring-influxdb.kube-system.svc.cluster.local
Address: 10.102.222.93
[root@heapster-487402730-frx1w /]# nslookup monitoring-influxdb.kube-system.svc.cluster.local.
Server: 10.96.0.10
Address: 10.96.0.10#53
Non-authoritative answer:
Name: monitoring-influxdb.kube-system.svc.cluster.local
Address: 10.102.222.93
[root@myserver1:~] docker inspect k8s_heapster_heapster-487402730-frx1w_kube-system_1ad7d84a-9bc3-11e7-b167-fa163e533806_0 |grep -i resolv
"ResolvConfPath": "/mnt/docker/containers/b66a55cc326bc07635fb71dde2138f3eddc95d672bd0c037b6b74926d5804709/resolv.conf",
[root@myserver1:~] cat /mnt/docker/containers/b66a55cc326bc07635fb71dde2138f3eddc95d672bd0c037b6b74926d5804709/resolv.conf
nameserver 10.96.0.10
search kube-system.svc.cluster.local svc.cluster.local cluster.local novalocal int.na.mydomainXYZ.com na.mydomainXYZ.com
options ndots:5
[root@myserver1:~] kubectl -n kube-system exec -ti -c centos heapster-487402730-frx1w bash
[root@heapster-487402730-frx1w /]# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 16:13 ? 00:00:00 /pause
65534 7 0 0 16:13 ? 00:00:00 /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
root 15 0 0 16:13 ? 00:00:00 sleep infinity
root 21 0 0 16:14 ? 00:00:00 bash
root 37 21 0 16:14 ? 00:00:00 ps -ef
[root@heapster-487402730-frx1w /]# cat /etc/resolv.conf
nameserver 10.96.0.10
search kube-system.svc.cluster.local svc.cluster.local cluster.local novalocal int.na.mydomainXYZ.com na.mydomainXYZ.com
options ndots:5
I think the problem is that heapster is performing its DNS queries against 127.0.0.1 or ::1 instead of using the nameserver 10.96.0.10 from /etc/resolv.conf. tcpdump confirms this from both the container network and the VM host network.
Update 1
I can see there were many changes starting with heapster 1.3.0. Not sure whether this could be related, probably not; nevertheless, just to mention it, ResolvConfDefault = "/etc/resolv.conf" was deleted in https://github.com/kubernetes/heapster/pull/1537/files#diff-c2a7e6ffb19bafecc85ea9765043d927L21
Update 2
@andyxning
IIUC, when querying DNS, the client should not query 127.0.0.1:53; it should consult the configured nameservers directly. Seems something is wrong.
Yes, absolutely. So far we have experienced this issue only with heapster v1.3.0 and higher.
Normally it queries the nameserver found in /etc/resolv.conf, as expected: 10.96.0.10:53 (kube-dns).
[root@heapster-487402730-frx1w /]# dig A monitoring-influxdb.kube-system.svc.cluster.local. | grep -EA1 'ANSWER SECTION|SERVER'
;; ANSWER SECTION:
monitoring-influxdb.kube-system.svc.cluster.local. 3 IN A 10.102.222.93
--
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Sun Sep 17 16:38:49 UTC 2017
Here it fails, as it is supposed to:
[root@heapster-487402730-frx1w /]# dig A monitoring-influxdb.kube-system.svc.cluster.local. @127.0.0.1 | grep -EA1 'ANSWER SECTION|SERVER'
[root@heapster-487402730-frx1w /]# echo $?
1
The question is why heapster uses 127.0.0.1 instead of the nameserver.
Okay, moving the heapster binary into another base image such as CentOS 7 seems to have worked around the issue.
Dockerfile:
FROM centos:7
COPY heapster-1.3.0 /heapster
docker build -t local/heapster-amd64:v1.3.0-centos .
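One way to point the existing deployment at the locally built image (deployment and container names are assumed to match the stock heapster.yaml; the node must already have the image, since it is not pushed to a registry):
kubectl -n kube-system set image deployment/heapster heapster=local/heapster-amd64:v1.3.0-centos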
[root@myserver1:~] kubectl -n kube-system logs -c heapster -f heapster-3709176945-v26zc
I0917 16:52:03.751478 5 heapster.go:72] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
I0917 16:52:03.751672 5 heapster.go:73] Heapster version v1.3.0
I0917 16:52:03.752099 5 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0917 16:52:03.752136 5 configs.go:62] Using kubelet port 10255
I0917 16:52:03.810163 5 influxdb.go:252] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc:8086 user:root db:k8s
I0917 16:52:03.810186 5 heapster.go:196] Starting with InfluxDB Sink
I0917 16:52:03.810190 5 heapster.go:196] Starting with Metric Sink
I0917 16:52:03.833103 5 heapster.go:106] Starting heapster on port 8082
@andyxning how is the heapster docker image getting created?
Update 1
Yet another workaround was to start heapster with UID 0 (root):
spec:
  securityContext:
    runAsUser: 0
Then everything is working.
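A minimal sketch of applying the same workaround to an already-deployed heapster via a patch, assuming the pod-level securityContext:
kubectl -n kube-system patch deployment heapster \
  -p '{"spec":{"template":{"spec":{"securityContext":{"runAsUser":0}}}}}'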
I am starting to think that heapster has some issue accessing the /etc/resolv.conf file when running from the heapster 1.3.0+ docker images.
I have checked the internals of the images; they look pretty simple. The major difference is that starting with heapster 1.3.0, the directive USER 65534:65534 is used.
Here are the permissions down to /etc/resolv.conf:
[root@myserver1:~] ls -lad /mnt/docker/containers/0496e74cb50f6b508917641a0e6f88020fb5c5a7b974531b02ff39c4d3bf9c16/resolv.conf /mnt/docker/containers/0496e74cb50f6b508917641a0e6f88020fb5c5a7b974531b02ff39c4d3bf9c16 /mnt/docker/containers /mnt/docker /mnt /
dr-xr-xr-x. 18 root root 4096 Sep 15 14:50 /
drwxr-xr-x. 7 root root 4096 Sep 15 14:46 /mnt
drwx--x--x. 11 root root 4096 Sep 15 14:48 /mnt/docker
drwxr-x---+ 86 root root 24576 Sep 17 19:09 /mnt/docker/containers
drwxr-x---+ 4 root root 4096 Sep 17 19:09 /mnt/docker/containers/0496e74cb50f6b508917641a0e6f88020fb5c5a7b974531b02ff39c4d3bf9c16
-rw-r--r--+ 1 root root 149 Sep 17 19:09 /mnt/docker/containers/0496e74cb50f6b508917641a0e6f88020fb5c5a7b974531b02ff39c4d3bf9c16/resolv.conf
Update 2
So, when I set USER 65534 in the Dockerfile (to simulate the difference observed in the heapster 1.3.0+ docker images), /etc/resolv.conf becomes unreadable for that user.
[root@myserver1:~] docker run --rm -ti --entrypoint bash local/heapster-amd64:v1.3.0-centos
bash-4.2$ id
uid=65534 gid=0(root) groups=0(root)
bash-4.2$ ping google.com
ping: google.com: Name or service not known
bash-4.2$ cat /etc/resolv.conf
cat: /etc/resolv.conf: Permission denied
bash-4.2$ ls -lad / /etc /etc/resolv.conf
drwxr-xr-x. 1 root root 4096 Sep 17 18:03 /
drwxr-xr-x. 1 root root 4096 Sep 17 18:03 /etc
-rw-r--r--+ 1 root root 179 Sep 17 18:03 /etc/resolv.conf
Interestingly, the same is not observed when setting the user to, say, 99.
EDIT: the reason it was not observed with 99 is that UID 99 has a record in the image's /etc/passwd file; for more details please refer to https://github.com/kubernetes/heapster/issues/1658#issuecomment-331196813 or https://github.com/kubernetes/heapster/pull/1817
[root@myserver1:~] grep -Ew "12345|65534|99" /etc/passwd
nobody:x:99:99:Nobody:/:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
[root@myserver1:~] docker run --rm -ti -u 65534 --entrypoint bash local/heapster-amd64:v1.3.0-centos -c "cat /etc/resolv.conf"
cat: /etc/resolv.conf: Permission denied
[root@myserver1:~] docker run --rm -ti -u 12345 --entrypoint bash local/heapster-amd64:v1.3.0-centos -c "cat /etc/resolv.conf"
cat: /etc/resolv.conf: Permission denied
[root@myserver1:~] docker run --rm -ti -u 99 --entrypoint bash local/heapster-amd64:v1.3.0-centos -c "cat /etc/resolv.conf"
; generated by /usr/sbin/dhclient-script
search [...redacted]
nameserver [...redacted]
Looks like the issue is local and not heapster related.
] docker run --rm -ti -u 65534 --entrypoint bash centos:7 -c "cat /etc/resolv.conf"
cat: /etc/resolv.conf: Permission denied
The thing is that heapster has been running under UID 65534 since v1.3.0, which is what triggered this issue.
I will try to figure it out and will close this issue.
Yes, this was our local issue, caused by over-complicated ACLs...
To those who will wonder: when setting default ACLs (-d with setfacl), do not forget to explicitly grant the root group read access. E.g., instead of:
/usr/bin/setfacl -Rdm group:splunk:rx,other:r /mnt/docker/containers
use:
/usr/bin/setfacl -Rdm group:root:r,group:splunk:rx,other:r /mnt/docker/containers
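To verify what ended up applied (a sketch; the container name placeholder is mine, the resolv.conf path is taken via docker inspect as shown earlier):
# Show the effective ACL and the default ACL that newly created files will inherit
getfacl /mnt/docker/containers
getfacl -d /mnt/docker/containers
getfacl "$(docker inspect --format '{{.ResolvConfPath}}' <heapster_container>)"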
Edit 1
So we must ensure group:root:r is additionally set when setting the default ACLs (the setfacl command with the -d flag).
This restores the default Docker behavior, where a docker container process running under a non-root user is able to read the /etc/resolv.conf file.
Interestingly, the user with UID 99 (nobody, in CentOS 7.3) does not experience the same issue.
@andrey01 IIUC, you mean that the root cause of heapster performing a DNS query against 127.0.0.1 is that heapster cannot read the /etc/resolv.conf file?
@andrey01 The Dockerfile used to build heapster is here.
FROM scratch
COPY heapster eventer /
COPY ca-certificates.crt /etc/ssl/certs/
# nobody:nobody
USER 65534:65534
ENTRYPOINT ["/heapster"]
Actually, according to the comment next to that setting, it is expected that 65534 is the UID of nobody. Maybe this is not true across all Linux distributions.
IIUC, you mean that the root cause of heapster performing a DNS query against 127.0.0.1 is that heapster cannot read the /etc/resolv.conf file?
... it is expected that 65534 is the UID of nobody. Maybe this is not true across all Linux distributions.
@andyxning you are right:
Ubuntu xenial
$ id nobody
uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup)
CentOS 7
# id nobody
uid=99(nobody) gid=99(nobody) groups=99(nobody)
# id nfsnobody
uid=65534(nfsnobody) gid=65534(nfsnobody) groups=65534(nfsnobody)
But the real issue lies somewhere else... where /etc/resolv.conf is unreadable to others. I have tried completely removing the /mnt/docker/containers directory, restarting dockerd and applying our old ACL setting which has other:r, and I cannot reproduce the issue, since /etc/resolv.conf is readable with 12345:12345 (a non-existing user/group) and with 65534:65534 (nfsnobody:nfsnobody in CentOS 7).
Edit 1
I will try to put back our old deployment, in order to reproduce it one more time.
Ok, I am able to reproduce the issue:
setfacl --recursive --remove-all -- /mnt/docker/containers
setfacl -Rdm group:splunk:rx,other:r -- /mnt/docker/containers
setfacl -Rm group:splunk:rx -- /mnt/docker/containers
After that, no matter if one restarts the dockerd or reloads the overlay FS, the issue occurs again:
~] kubectl -n kube-system logs -f heapster-1428236806-xs2kj
I0918 07:54:11.813694 5 heapster.go:72] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
I0918 07:54:11.813786 5 heapster.go:73] Heapster version v1.4.0
I0918 07:54:11.814064 5 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0918 07:54:11.814079 5 configs.go:62] Using kubelet port 10255
E0918 07:54:11.816125 5 kubelet.go:334] Failed to load nodes: Get https://kubernetes.default/api/v1/nodes: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:51759->[::1]:53: read: connection refused
E0918 07:54:11.817474 5 influxdb.go:264] issues while creating an InfluxDB sink: failed to ping InfluxDB server at "monitoring-influxdb.kube-system.svc:8086" - Get http://monitoring-influxdb.kube-system.svc:8086/ping: dial tcp: lookup monitoring-influxdb.kube-system.svc on [::1]:53: read udp [::1]:49628->[::1]:53: read: connection refused, will retry on use
I0918 07:54:11.817486 5 influxdb.go:278] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc:8086 user:root db:k8s
I0918 07:54:11.817540 5 heapster.go:196] Starting with InfluxDB Sink
I0918 07:54:11.817549 5 heapster.go:196] Starting with Metric Sink
E0918 07:54:11.819666 5 reflector.go:190] k8s.io/heapster/metrics/heapster.go:322: Failed to list *v1.Pod: Get https://kubernetes.default/api/v1/pods?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:54248->[::1]:53: read: connection refused
E0918 07:54:11.819755 5 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:54248->[::1]:53: read: connection refused
E0918 07:54:11.819839 5 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:54248->[::1]:53: read: connection refused
E0918 07:54:11.821335 5 reflector.go:190] k8s.io/heapster/metrics/processors/namespace_based_enricher.go:84: Failed to list *v1.Namespace: Get https://kubernetes.default/api/v1/namespaces?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:48313->[::1]:53: read: connection refused
E0918 07:54:11.823103 5 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:51373->[::1]:53: read: connection refused
I0918 07:54:11.909308 5 heapster.go:106] Starting heapster on port 8082
E0918 07:54:12.820936 5 reflector.go:190] k8s.io/heapster/metrics/heapster.go:322: Failed to list *v1.Pod: Get https://kubernetes.default/api/v1/pods?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:56828->[::1]:53: read: connection refused
E0918 07:54:12.824619 5 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:57378->[::1]:53: read: connection refused
E0918 07:54:12.824669 5 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:57378->[::1]:53: read: connection refused
E0918 07:54:12.904243 5 reflector.go:190] k8s.io/heapster/metrics/processors/namespace_based_enricher.go:84: Failed to list *v1.Namespace: Get https://kubernetes.default/api/v1/namespaces?resourceVersion=0: dial tcp: lookup kubernetes.default on [::1]:53: read udp [::1]:42402->[::1]:53: read: connection refused
As a remedy, one needs to explicitly set group:root:r in the 2nd setfacl command:
setfacl --recursive --remove-all -- /mnt/docker/containers
setfacl -Rdm group:root:r,group:splunk:rx,other:r -- /mnt/docker/containers
setfacl -Rm group:splunk:rx -- /mnt/docker/containers
Strangely, this affects only the heapster 1.3.0+ containers, even though I removed the image right before removing the heapster deployment, restarted dockerd and reloaded the overlay FS.
Running some other docker image, e.g. centos:7 or alpine:3.6, did not show this problem at all:
centos7] docker run --rm -ti -u 65534:65534 --entrypoint bash centos:7 -c "id; ls -la /etc/resolv.conf /etc/shells /etc/libaudit.conf; wc -l /etc/resolv.conf /etc/shells /etc/libaudit.conf;"
uid=65534 gid=65534 groups=65534
-rw-r-----. 1 root root 191 Apr 19 12:56 /etc/libaudit.conf
-rw-r--r--+ 1 root root 179 Sep 18 07:57 /etc/resolv.conf
-rw-r--r--. 1 root root 76 Jun 7 2013 /etc/shells
6 /etc/resolv.conf <<<<< OK, readable as expected
6 /etc/shells
wc: /etc/libaudit.conf: Permission denied <<<<<< OK as expected
12 total
Update 1
I have compared the working and the non-working configuration, checking content permissions under the overlay GraphDriver's LowerDir, MergedDir, UpperDir and WorkDir, and under the ResolvConfPath (and its parent directories), and found a single difference, in the latter (the directory containing ResolvConfPath):
(not working) without group:root:r in the aforementioned ACLs:
4196597 8 drwx------ 4 root root 4096 Sep 18 10:16 /mnt/docker/containers/9494f5755bc1045b4f98c69dc05ae16d6a1c162299d1e40e4c246079b3ab9e1b/
(working) with group:root:r:
4196505 8 drwxr-x--- 4 root root 4096 Sep 18 10:11 /mnt/docker/containers/d8f29a5c7efdea458ac988d1e99e0976b97e760cc473927f6e1d23c44755ab0c/
I still cannot work out why the issue affects only the heapster 1.3.0+ containers, but not centos:7 or alpine:3.6 when running with -u 65534:65534 (or any other random UID/GID). I have even tried building a new image based on a simple Dockerfile with USER 65534:65534 set, though I have not tried a FROM scratch image yet.
@andyxning okay, I think I have got really close to the actual issue.
It appears that the GID is not actually 65534 (nobody) but 0 (root).
This is exactly why our default ACL "fix" works: what it does is explicitly allow group:root:r.
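This can also be confirmed from the host without entering the container (a sketch; it assumes pidof heapster resolves to the right process):
# Show the real uid/gid the heapster process runs with
ps -o pid,uid,gid,comm -p "$(pidof heapster)"
grep -E '^(Uid|Gid):' /proc/"$(pidof heapster)"/status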
So I have finally managed to access the heapster container and see what is going on from inside.
Images built FROM scratch can be a pain to debug, so here are the steps I prepared to ease it:
# Grab a statically linked busybox on the host and copy it into the heapster container's
# filesystem via the process's cwd (which is / for heapster)
wget https://busybox.net/downloads/binaries/1.27.1-i686/busybox
chmod +x busybox
cp busybox /proc/$(pidof heapster)/cwd/busybox
# Create the usual bin directories inside the container and install the busybox applets there
mkdir -p /proc/$(pidof heapster)/cwd/{bin,sbin,usr/bin,usr/sbin}
nsenter --mount=/proc/$(pidof heapster)/task/$(pidof heapster)/ns/mnt -- /busybox --install
] kubectl -n kube-system exec -ti heapster-1428236806-lt5wd sh
/ $ id
uid=65534 gid=0
/ $ ps -ef
PID USER TIME COMMAND
1 0 0:00 /pause
5 65534 0:09 /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
29 65534 0:00 /busybox sh
47 65534 0:00 sh
52 65534 0:00 ps -ef
/ $ stat / /etc /etc/resolv.conf |grep -E 'File:|^Access: \('
File: /
Access: (0755/drwxr-xr-x) Uid: ( 0/ UNKNOWN) Gid: ( 0/ UNKNOWN)
File: /etc
Access: (0755/drwxr-xr-x) Uid: ( 0/ UNKNOWN) Gid: ( 0/ UNKNOWN)
File: /etc/resolv.conf
Access: (0654/-rw-r-xr--) Uid: ( 0/ UNKNOWN) Gid: ( 0/ UNKNOWN)
/ $ id
uid=65534 gid=0
/ $ ls -lad / /etc /etc/resolv.conf
drwxr-xr-x 1 0 0 4096 Sep 18 09:20 /
drwxr-xr-x 1 0 0 4096 Sep 18 09:19 /etc
-rw-r--r-- 1 0 0 149 Sep 18 09:19 /etc/resolv.conf
/ $ cat /etc/resolv.conf
cat: can't open '/etc/resolv.conf': Permission denied
So you can see that, judging by the file modes, the root group and others should be able to read /, /etc and /etc/resolv.conf, yet resolv.conf cannot actually be read.
So our default ACLs (without the group:root:r "fix") are not permitting the root group to read resolv.conf.
[root@myserver1:~] kubectl -n kube-system exec -ti heapster-1428236806-p59ff -- sh -c "cat /etc/resolv.conf"
cat: can't open '/etc/resolv.conf': Permission denied
[root@myserver1:~] getfacl /mnt/docker/containers/0273110ec9b94f1d7ece40be481b67170d9739e21aa9be7f0e4ccdb9d8ff4928/resolv.conf
getfacl: Removing leading '/' from absolute path names
# file: mnt/docker/containers/0273110ec9b94f1d7ece40be481b67170d9739e21aa9be7f0e4ccdb9d8ff4928/resolv.conf
# owner: root
# group: root
user::rw-
group::---
group:splunk:r-x
mask::r-x
other::r--
[root@myserver1:~] setfacl -m group:root:r,group:splunk:rx -- /mnt/docker/containers/0273110ec9b94f1d7ece40be481b67170d9739e21aa9be7f0e4ccdb9d8ff4928/resolv.conf
[root@myserver1:~] kubectl -n kube-system exec -ti heapster-1428236806-p59ff -- sh -c "cat /etc/resolv.conf"
nameserver 10.96.0.10
search kube-system.svc.cluster.local svc.cluster.local cluster.local novalocal int.na.mydomainXYZ.com na.mydomainXYZ.com
options ndots:5
[root@myserver1:~] getfacl /mnt/docker/containers/0273110ec9b94f1d7ece40be481b67170d9739e21aa9be7f0e4ccdb9d8ff4928/resolv.conf
getfacl: Removing leading '/' from absolute path names
# file: mnt/docker/containers/0273110ec9b94f1d7ece40be481b67170d9739e21aa9be7f0e4ccdb9d8ff4928/resolv.conf
# owner: root
# group: root
user::rw-
group::---
group:root:r--
group:splunk:r-x
mask::r-x
other::r--
The real question now is why we are getting gid=0, whilst it should be 65534 as per the Dockerfile. Maybe this is something Kubernetes-specific? https://github.com/kubernetes/kubernetes/issues/22179
@andrey01 According to the USER Dockerfile doc:
Warning: When the user doesn't have a primary group then the image (or the next instructions) will be run with the root group.
When there is no primary group, the heapster command will be run with the root group. You need to check the primary group of the nobody user.
@andyxning that is interesting. I have created PR https://github.com/kubernetes/heapster/pull/1817, which intends to restore the default Docker behavior and preserve the GID of 65534 (nobody), since by default Kubernetes does not respect the image's USER uid:gid set in the Dockerfile. Not sure whether there is an open issue/PR for Kubernetes to address this.
Prepare a Dockerfile with the user 65534:65534 (nobody:nobody)
FROM scratch
# wget https://busybox.net/downloads/binaries/1.27.1-i686/busybox
# chmod +x busybox
COPY busybox /busybox
USER 65534:65534
ENTRYPOINT ["/busybox"]
CMD ["id"]
Build the image
docker build -t scratch .
Running the image shows uid:gid as expected - 65534:65534.
docker run --rm -ti scratch
uid=65534 gid=65534
When the uid is explicitly set to 65534 on the command line, the gid does not get picked up, because the scratch image has no /etc/passwd file.
docker run --rm -u 65534 -ti scratch
uid=65534 gid=0
docker run --rm -u nobody -ti scratch
docker: Error response from daemon: linux spec user: unable to find user nobody: no matching entries in passwd file.
docker run --rm -ti scratch cat /etc/passwd
cat: can't open '/etc/passwd': No such file or directory
As a remedy, let's add an /etc/passwd file to the scratch image:
echo 'nobody:x:65534:65534:nobody:/home:/bin/false' > passwd
Dockerfile:
FROM scratch
# wget https://busybox.net/downloads/binaries/1.27.1-i686/busybox
# chmod +x busybox
COPY busybox /busybox
COPY passwd /etc/passwd
USER 65534:65534
ENTRYPOINT ["/busybox"]
CMD ["id"]
The gid is getting picked up now:
docker run --rm -ti -u nobody scratch
uid=65534(nobody) gid=65534
docker run --rm -ti -u 65534 scratch
uid=65534(nobody) gid=65534
docker build -t andrey01/scratch:passwd .
docker build -t andrey01/scratch:nopasswd .
docker push andrey01/scratch:passwd
docker push andrey01/scratch:nopasswd
k8s-job-scratch-passwd.yml:
kind: Job
apiVersion: batch/v1
metadata:
  name: scratch-passwd
spec:
  template:
    metadata:
      labels:
        app: scratch-passwd
    spec:
      restartPolicy: OnFailure
      containers:
      - name: scratch-passwd
        image: andrey01/scratch:passwd
        imagePullPolicy: Always
k8s-job-scratch-nopasswd.yml:
kind: Job
apiVersion: batch/v1
metadata:
  name: scratch-nopasswd
spec:
  template:
    metadata:
      labels:
        app: scratch-nopasswd
    spec:
      restartPolicy: OnFailure
      containers:
      - name: scratch-nopasswd
        image: andrey01/scratch:nopasswd
        imagePullPolicy: Always
kubectl create -f k8s-job-scratch-nopasswd.yml
kubectl create -f k8s-job-scratch-passwd.yml
$ kubectl get jobs
NAME DESIRED SUCCESSFUL AGE
scratch-nopasswd 1 1 19s
scratch-passwd 1 1 16s
$ kubectl logs $(kubectl get pods --show-all -l app=scratch-passwd --output jsonpath='{.items[0].metadata.name}')
uid=65534(nobody) gid=65534
$ kubectl logs $(kubectl get pods --show-all -l app=scratch-nopasswd --output jsonpath='{.items[0].metadata.name}')
uid=65534 gid=0
As you can see, the gid is 0 for the scratch container that does not have an /etc/passwd file containing the relevant uid:gid pair.
kubectl delete job scratch-nopasswd scratch-passwd
Hello,
in our environment we are setting sysctl net.ipv6.conf.all.disable_ipv6=1.
Having this, we observed that gcr.io/google_containers/heapster-amd64:v1.3.0 is constantly trying to use IPv6 for communication with the cluster DNS 10.96.0.10 53/udp.
Though, with gcr.io/google_containers/heapster-amd64:v1.3.0-beta.1 (an image two months older than v1.3.0) we do not observe such behavior.
Please let me know if you have some clues or ideas about what could be done.
Docker v1.13.1, Kubernetes v1.6.4 (via kubeadm)
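For completeness, a sketch of how that sysctl is typically applied and persisted (the exact file name below is my assumption, not necessarily what our environment uses):
# Apply immediately
sysctl -w net.ipv6.conf.all.disable_ipv6=1
# Persist across reboots
echo 'net.ipv6.conf.all.disable_ipv6 = 1' > /etc/sysctl.d/99-disable-ipv6.conf
sysctl --system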