kelseyhightower / kubernetes-the-hard-way

Bootstrap Kubernetes the hard way. No scripts.
Apache License 2.0

Kubernetes service IP not created #725

Open sven-borkert opened 1 year ago

sven-borkert commented 1 year ago

Hi,

I followed the tutorial until "Deploying the DNS Cluster Add-on", but the coredns pods never become ready and keep restarting. The logs say that the pod cannot connect to the Kubernetes service IP:

$ kubectl get pods -n kube-system
NAME                       READY   STATUS             RESTARTS         AGE
coredns-7c9cfc6995-6d7fw   0/1     CrashLoopBackOff   98 (3m26s ago)   6d8h
coredns-7c9cfc6995-h8cj4   0/1     CrashLoopBackOff   98 (3m31s ago)   6d8h

plugin/kubernetes: Get "https://10.32.0.1:443/version?timeout=32s": dial tcp 10.32.0.1:443: i/o timeout

The service 'kubernetes' looks like it should provide this cluster IP:

$ kubectl describe service kubernetes
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.32.0.1
IPs:               10.32.0.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         10.240.0.10:6443,10.240.0.11:6443,10.240.0.12:6443
Session Affinity:  None
Events:            <none>
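
(For completeness, a quick way to separate "the ClusterIP is broken" from "the API servers are unreachable" is to test both from a worker node. These are just illustrative commands; -k only skips certificate verification for the reachability test.)

$ curl -k https://10.240.0.10:6443/version   # one API server endpoint directly
$ curl -k https://10.32.0.1:443/version      # the service ClusterIP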

But kube-proxy on the worker nodes does not seem to have created any rules for this cluster IP:

root@worker-1:~# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N KUBE-EXTERNAL-SERVICES
-N KUBE-FIREWALL
-N KUBE-FORWARD
-N KUBE-KUBELET-CANARY
-N KUBE-NODEPORTS
-N KUBE-PROXY-CANARY
-N KUBE-PROXY-FIREWALL
-N KUBE-SERVICES
-A INPUT -j KUBE-FIREWALL
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes load balancer firewall" -j KUBE-PROXY-FIREWALL
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-SERVICES -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics has no endpoints" -m tcp --dport 9153 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.32.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp --dport 53 -j REJECT --reject-with icmp-port-unreachable

It at least knows the 10.32.0.10 address of the not-yet-running kube-dns service, but it has not created any rules for 10.32.0.1.

I don't see any issues in the logs of kube-proxy:

-- Boot d2bf70eb95194e7191e336d03f8512d5 --
Dez 14 21:51:08 worker-1 systemd[1]: Started Kubernetes Kube Proxy.
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.470030     794 node.go:163] Successfully retrieved node IP: 192.168.0.221
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.471807     794 server_others.go:138] "Detected node IP" address="192.168.0.221"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.553540     794 server_others.go:206] "Using iptables Proxier"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.553633     794 server_others.go:213] "kube-proxy running in dual-stack mode" ipFamily=IPv4
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.553651     794 server_others.go:214] "Creating dualStackProxier for iptables"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.553687     794 server_others.go:512] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.559655     794 proxier.go:262] "Setting route_localnet=1, use nodePortAddresses to filter loopback addresses for NodePorts to skip it https://issues.k8s.io/90259"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.560240     794 proxier.go:262] "Setting route_localnet=1, use nodePortAddresses to filter loopback addresses for NodePorts to skip it https://issues.k8s.io/90259"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.560801     794 server.go:661] "Version info" version="v1.25.4"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.560861     794 server.go:663] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.562937     794 conntrack.go:100] "Set sysctl" entry="net/netfilter/nf_conntrack_max" value=131072
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.563447     794 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=131072
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.563545     794 conntrack.go:100] "Set sysctl" entry="net/netfilter/nf_conntrack_tcp_timeout_close_wait" value=3600
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.572917     794 config.go:317] "Starting service config controller"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.577613     794 shared_informer.go:255] Waiting for caches to sync for service config
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.577788     794 config.go:226] "Starting endpoint slice config controller"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.577811     794 shared_informer.go:255] Waiting for caches to sync for endpoint slice config
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.585026     794 config.go:444] "Starting node config controller"
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.587955     794 shared_informer.go:255] Waiting for caches to sync for node config
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.693844     794 shared_informer.go:262] Caches are synced for endpoint slice config
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.697194     794 shared_informer.go:262] Caches are synced for service config
Dez 14 21:51:10 worker-1 kube-proxy[794]: I1214 21:51:10.703662     794 shared_informer.go:262] Caches are synced for node config
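
(As an extra sanity check, kube-proxy's local health and mode endpoints can be queried on the worker — assuming the default ports 10256 and 10249:)

root@worker-1:~# curl -s http://localhost:10256/healthz
root@worker-1:~# curl -s http://localhost:10249/proxyMode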

Any idea what's wrong here?

Regards, Sven

guillaumelauzier commented 1 year ago

It looks like the coredns pods are failing to start because they are unable to connect to the Kubernetes API server. This could be due to a network issue, or an issue with the configuration of the coredns pods.

One possible solution is to check the logs of the coredns pods to see if there is more detailed information about the error. You can do this by running the following command:

kubectl logs -n kube-system coredns-<POD-ID>

Replace <POD-ID> with the actual ID of the coredns pod that is failing. This should give you more information about the cause of the error.

Additionally, you can try restarting the coredns pods to see if that fixes the issue. You can do this by running the following command:

kubectl delete pod -n kube-system coredns-<POD-ID>

Again, replace <POD-ID> with the actual ID of the coredns pod. This will delete the failing pod, and the Kubernetes cluster will automatically create a new one in its place.
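
You could also restart the whole CoreDNS deployment in one go and verify that the kubernetes service actually has endpoints (assuming the deployment is named coredns, as in the default manifest):

kubectl -n kube-system rollout restart deployment coredns
kubectl get endpoints kubernetes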

sven-borkert commented 1 year ago

Hi,

yes, the coredns pods are starting, but never become "ready" because they cannot reach the cluster IP of the Kubernetes API server. I deleted one of the pods and checked the logs of the newly created pod:

$ kubectl logs coredns-7c9cfc6995-snvgp -n kube-system -f
plugin/kubernetes: Get "https://10.32.0.1:443/version?timeout=32s": dial tcp 10.32.0.1:443: i/o timeout

My networking seems to be broken, and I don't see the 10.32.0.1 in the iptables rules on the worker nodes. I think that kube-proxy should create a rule that catches connections to that virtual IP and forwards them to the controller nodes running the kube-apiservers, right? I checked the logs of kube-proxy on the nodes and I don't see any errors.

In the iptables rules on the workers I can see that kube-proxy has created rules for kube-dns, but these only reject traffic, since the service has no endpoints yet (the coredns pods don't start correctly):

-A KUBE-SERVICES -d 10.32.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp --dport 53 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 10.32.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics has no endpoints" -m tcp --dport 9153 -j REJECT --reject-with icmp-port-unreachable

From my understanding, I would have expected a rule here that filters on destination 10.32.0.1:443, right? But for some reason there is no rule like that, so the service is not reachable.
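
(One thing I realize while writing this: plain iptables -S only dumps the filter table, and kube-proxy programs the ClusterIP DNAT rules in the nat table, so that is probably the right place to look as well. On a working node I would expect something roughly like the following — the KUBE-SVC chain name here is just illustrative, kube-proxy generates a hash per service:)

root@worker-1:~# iptables -t nat -S KUBE-SERVICES | grep "10.32.0.1/32"
-A KUBE-SERVICES -d 10.32.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-XXXXXXXXXXXXXXXX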

Besides that, my container networking seems to be completely broken. I have gone through the tutorial multiple times but have not found the error yet. I have installed the cni-plugins-linux binaries to /opt/cni/bin/ and created /etc/cni/net.d/10-bridge.conf and 99-loopback.conf:

root@worker-0:/etc/cni/net.d# ls -l
total 8
-rw-r--r-- 1 root root 303 Dez  1 19:31 10-bridge.conf
-rw-r--r-- 1 root root  72 Dez  1 19:32 99-loopback.conf
root@worker-0:/etc/cni/net.d# cat *
{
    "cniVersion": "0.4.0",
    "name": "bridge",
    "type": "bridge",
    "bridge": "cnio0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "ranges": [
          [{"subnet": "10.200.0.0/24"}]
        ],
        "routes": [{"dst": "0.0.0.0/0"}]
    }
}
{
    "cniVersion": "0.4.0",
    "name": "lo",
    "type": "loopback"
}

I verified that each worker has its own subnet: 10.200.0.0/24, 10.200.1.0/24 and 10.200.2.0/24.
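
(A quick cross-check of the per-node pod CIDRs against what the cluster has assigned — assuming the controller manager allocates node CIDRs as in the tutorial:)

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR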

I can see the expected bridge interface cnio0 on the worker nodes:

root@worker-0:~# ifconfig 
cnio0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.200.0.1  netmask 255.255.255.0  broadcast 10.200.0.255
        inet6 fe80::f0aa:c0ff:fe70:2040  prefixlen 64  scopeid 0x20<link>
        ether 4a:55:d3:bc:d7:b6  txqueuelen 1000  (Ethernet)
        RX packets 230  bytes 13102 (13.1 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 30  bytes 1588 (1.5 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.220  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::250:56ff:fe3b:abcb  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:3b:ab:cb  txqueuelen 1000  (Ethernet)
        RX packets 8906  bytes 2294242 (2.2 MB)
        RX errors 0  dropped 1840  overruns 0  frame 0
        TX packets 4248  bytes 616743 (616.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens34: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.240.0.20  netmask 255.255.255.0  broadcast 10.240.0.255
        inet6 fe80::250:56ff:fe38:d826  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:38:d8:26  txqueuelen 1000  (Ethernet)
        RX packets 725  bytes 91678 (91.6 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1005  bytes 84992 (84.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 2930  bytes 196926 (196.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2930  bytes 196926 (196.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth74645b5c: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::7c22:c3ff:fe57:87ee  prefixlen 64  scopeid 0x20<link>
        ether 6a:70:f9:6f:be:48  txqueuelen 0  (Ethernet)
        RX packets 19  bytes 1424 (1.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 40  bytes 3076 (3.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth857e8004: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::7cfb:c3ff:fe59:9c08  prefixlen 64  scopeid 0x20<link>
        ether 12:30:8f:a8:f2:3c  txqueuelen 0  (Ethernet)
        RX packets 120  bytes 8408 (8.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 48  bytes 3180 (3.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

The routes on the worker look like this:

root@worker-0:~# ip route
default via 192.168.0.1 dev ens33 proto dhcp src 192.168.0.220 metric 100 
10.200.0.0/24 via 10.240.0.20 dev ens34 proto static 
10.200.0.0/24 dev cnio0 proto kernel scope link src 10.200.0.1 
10.200.1.0/24 via 10.240.0.21 dev ens34 proto static 
10.200.2.0/24 via 10.240.0.22 dev ens34 proto static 
10.240.0.0/24 dev ens34 proto kernel scope link src 10.240.0.20 
192.168.0.0/24 dev ens33 proto kernel scope link src 192.168.0.220 metric 100 
192.168.0.1 dev ens33 proto dhcp scope link src 192.168.0.220 metric 100 

I started a "busybox" pod on worker-0 to check the network connectivity. It has its interface and an IP from the correct subnet, but it cannot even ping the gateway, let alone anything else:

$ kubectl exec -ti busybox -- /bin/sh
/ # ifconfig 
eth0      Link encap:Ethernet  HWaddr BA:96:88:8C:34:EE  
          inet addr:10.200.0.55  Bcast:10.200.0.255  Mask:255.255.255.0
          inet6 addr: fe80::b896:88ff:fe8c:34ee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:41 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3118 (3.0 KiB)  TX bytes:1662 (1.6 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

/ # ip route
default via 10.200.0.1 dev eth0 
10.200.0.0/24 dev eth0 scope link  src 10.200.0.55 
/ # ping 10.200.0.1
PING 10.200.0.1 (10.200.0.1): 56 data bytes
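
(The next thing I will try is to check whether the ICMP packets even reach the bridge, e.g. by running tcpdump on the worker while pinging from the pod:)

root@worker-0:~# tcpdump -ni cnio0 icmp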

Thank you for any hints that might help me understand this. Regards, Sven

sven-borkert commented 1 year ago

Aaaaaah! Sometimes it helps to write the details down for someone else. My routes were wrong. I fixed the routing and the pods went healthy. Not sure if everything works now, but I'm one step further. :)
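
For anyone hitting the same thing: in a bare-metal setup like mine, each worker only needs static routes for the other workers' pod CIDRs; the node's own pod CIDR is already handled by the cnio0 bridge route, so there should be no extra "10.200.0.0/24 via 10.240.0.20"-style route for the local subnet. Roughly, for worker-0 (illustrative, adjust IPs to your environment):

root@worker-0:~# ip route add 10.200.1.0/24 via 10.240.0.21
root@worker-0:~# ip route add 10.200.2.0/24 via 10.240.0.22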

saeed0808 commented 1 year ago

Perfect, you are rocking.

sven-borkert commented 1 year ago

All the tests from the tutorial are working and the containers can reach each other, nice.

I did this installation on Ubuntu 22.04. It seems to be a good idea (and easier) to use the containerd and runc that come with this Ubuntu version; the manually installed versions from this tutorial seem to be unhappy with cgroup v2. (I know this tutorial is meant for an older Ubuntu version.)
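
(If anyone wants to go the same route, the distro packages can be installed roughly like this — standard Ubuntu package names, sketch only:)

$ sudo apt-get update
$ sudo apt-get install -y containerd runc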

CoreDNS does resolve the name "kubernetes", and after I added a forward to its configuration it also resolves external names. But it does not seem to resolve any pod names. Shouldn't it?

In other tutorials I always read that I would need a CNI provider like "Calico" for the networking between the pods. The cni-plugins-linux package is not Calico as far as I understand, so this tutorial does not install Calico. What would I need Calico for, then?

Regards, Sven

justizin commented 1 year ago

Ran into this in the current version of the tutorial as of today. I was able to resolve it by following the instructions to modify the coredns config to point at 1.8, but also running kubectl apply -f deployments/kube-dns.yaml. Not sure if this puts things in an optimal state, but I wasn't able to resolve it with only one of the two applied; with both it works fine.
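
For reference, this is roughly what I mean (the coredns-1.8.yaml URL is the one from the tutorial's DNS add-on lab, if I recall correctly, and deployments/kube-dns.yaml is the file from this repo — adjust paths to your checkout):

kubectl apply -f https://storage.googleapis.com/kubernetes-the-hard-way/coredns-1.8.yaml
kubectl apply -f deployments/kube-dns.yaml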