canonical / microk8s

MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.
https://microk8s.io
Apache License 2.0

LoadBalancer unreachable outside of cluster #3034

Closed: sebastienle14 closed this issue 9 months ago

sebastienle14 commented 2 years ago

Greetings,

My setup is as follows: on the 192.168.100.0/24 network, I have an Ubuntu 21.10 host:

6: net100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 6e:43:9d:8e:6b:c2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.56/24 brd 192.168.100.255 scope global net100
       valid_lft forever preferred_lft forever
    inet6 fe80::6c43:9dff:fe8e:6bc2/64 scope link
       valid_lft forever preferred_lft forever

This host runs 4 VMs:

 Id   Name           State
------------------------------
 11   eck-worker01   running
 12   eck-worker02   running
 13   eck-worker03   running
 14   eck-master     running

All VMs use a bridged interface to net100, so they are accessible from 192.168.100.0/24 without routing/NAT.

eck-master uses 192.168.100.70
eck-worker01 uses 192.168.100.71
eck-worker02 uses 192.168.100.72
eck-worker03 uses 192.168.100.73

I have enabled the dns, storage and metallb addons on a 1.23.5 release (running on Ubuntu 21.10). The MetalLB range is 192.168.100.100 to 192.168.100.109.
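
For reference, the addons were enabled roughly like this (same microk8s enable syntax that appears for the POC cluster further down; the address range is the one quoted above):

microk8s enable dns storage
microk8s enable metallb:192.168.100.100-192.168.100.109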

Within the cluster, everything seems fine:

root@eck-master:~# curl https://192.168.100.100/app/home#/ --insecure
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>

Outside of it, there is no route to host.

root@host~# curl https://192.168.100.100/app/home#/ --insecure
curl: (7) Failed to connect to 192.168.100.100 port 443: No route to host
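
A quick first datapoint here: MetalLB in layer-2 mode answers ARP for the LoadBalancer IP from one of the node NICs, so it is worth checking whether the host resolves that IP to a MAC at all; a FAILED or INCOMPLETE entry would explain the "No route to host". A minimal check, assuming the host interface shown above:

root@host:~# ip neigh show 192.168.100.100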

That curl was run from the host itself (the machine hosting the VMs).

The iptables rules on the host are:

root@host:~# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
LIBVIRT_INP  all  --  anywhere             anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
LIBVIRT_FWX  all  --  anywhere             anywhere
LIBVIRT_FWI  all  --  anywhere             anywhere
LIBVIRT_FWO  all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
LIBVIRT_OUT  all  --  anywhere             anywhere

Chain LIBVIRT_FWI (1 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             192.168.122.0/24     ctstate RELATED,ESTABLISHED
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable

Chain LIBVIRT_FWO (1 references)
target     prot opt source               destination
ACCEPT     all  --  192.168.122.0/24     anywhere
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable

Chain LIBVIRT_FWX (1 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere

Chain LIBVIRT_INP (1 references)
target     prot opt source               destination
ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain
ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootps
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:67

Chain LIBVIRT_OUT (1 references)
target     prot opt source               destination
ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain
ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootpc
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:68

On the VMs:

root@eck-master:~# iptables -L
# Warning: iptables-legacy tables present, use iptables-legacy to see them
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  10.1.0.0/16          anywhere             /* generated for MicroK8s pods */
ACCEPT     all  --  anywhere             10.1.0.0/16          /* generated for MicroK8s pods */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Here are the nmap results, from the host and from eck-master:

root@host:~#  nmap -p1-65535 192.168.100.100
Starting Nmap 7.80 ( https://nmap.org ) at 2022-04-06 12:46 CEST
Nmap scan report for 192.168.100.100
Host is up (0.00052s latency).
All 65535 scanned ports on 192.168.100.100 are filtered
MAC Address: 52:54:00:A8:16:6A (QEMU virtual NIC)

Nmap done: 1 IP address (1 host up) scanned in 1313.70 seconds
root@eck-master:~# nmap -p1-65535 192.168.100.100
Starting Nmap 7.80 ( https://nmap.org ) at 2022-04-06 10:36 UTC
Host is up (0.00074s latency).
All 65535 scanned ports on elk.geoconcept.com (192.168.100.100) are filtered
MAC Address: 52:54:00:A8:16:6A (QEMU virtual NIC)

Nmap done: 1 IP address (1 host up) scanned in 1313.61 seconds

Using kubectl port-forward from the host works.
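
For completeness, the port-forward that works was something along these lines (the service name is the one shown in the kubectl output later in the thread, and the local port is arbitrary):

root@host:~# kubectl port-forward -n eck svc/ingress-nginx-controller 8443:443
root@host:~# curl -k https://127.0.0.1:8443/app/home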

Have I done something wrong with this setup? What would you suggest I dig into?

Best Regards,

inspection-report-20220406_145730.tar.gz

neoaggelos commented 2 years ago

Hi @sebastienle14, thank you for filing the issue.

What does the routing table on your host look like? Also, I assume you can ping/curl the addresses of the VMs directly? E.g. 192.168.100.71?

sebastienle14 commented 2 years ago

greetings @neoaggelos

Indeed, I can:

root@host:~# ip route show
default via 192.168.100.1 dev net100 proto static
192.168.100.0/24 dev net100 proto kernel scope link src 192.168.100.56
192.168.102.0/24 dev net102 proto kernel scope link src 192.168.102.4
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown

Here is the VM launch command: libvirt+ 15597 21.8 1.1 2784320 2209108 ? Sl 11:37 77:34 /usr/bin/qemu-system-x86_64 -name guest=eck-master,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-14-eck-master/master-key.aes"} -machine pc-i440fx-impish,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram -cpu EPYC-Rome,x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,spec-ctrl=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,cmp-legacy=on,ibrs=on,amd-ssbd=on,virt-ssbd=on,svme-addr-chk=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on -m 2048 -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":2147483648} -overcommit mem-lock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 139fe967-bab8-4063-b631-c8d8511d9b39 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=43,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -blockdev {"driver":"file","filename":"/srv/vms/eck-master.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":null} -device ide-hd,bus=ide.0,unit=0,drive=libvirt-2-format,id=ide0-0-0,bootindex=1 -device ide-cd,bus=ide.0,unit=1,id=ide0-0-1 -netdev tap,fd=48,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:9f:b0:4d,bus=pci.0,addr=0x3 -netdev tap,fd=49,id=hostnet1 -device e1000,netdev=hostnet1,id=net1,mac=52:54:00:ce:4e:bc,bus=pci.0,addr=0x4 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -audiodev id=audio1,driver=spice -spice port=5903,addr=127.0.0.1,disable-ticketing=on,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x5 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0,audiodev=audio1 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on

sebastienle14 commented 2 years ago

I believe this output will be necessary too; sorry, I forgot to add it in the OP:

root@eck-master:~# kubectl get svc -o wide -n eck
NAME                                 TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)                      AGE    SELECTOR
elasticsearch-es-transport           ClusterIP      None             <none>            9300/TCP                     26h    common.k8s.elastic.co/type=elasticsearch,elasticsearch.k8s.elastic.co/cluster-name=elasticsearch
elasticsearch-es-http                ClusterIP      10.152.183.14    <none>            9200/TCP                     26h    common.k8s.elastic.co/type=elasticsearch,elasticsearch.k8s.elastic.co/cluster-name=elasticsearch
elasticsearch-es-internal-http       ClusterIP      10.152.183.87    <none>            9200/TCP                     26h    common.k8s.elastic.co/type=elasticsearch,elasticsearch.k8s.elastic.co/cluster-name=elasticsearch
kibana-kb-http                       ClusterIP      10.152.183.108   <none>            5601/TCP                     26h    common.k8s.elastic.co/type=kibana,kibana.k8s.elastic.co/name=kibana
elasticsearch-es-default             ClusterIP      None             <none>            9200/TCP                     26h    common.k8s.elastic.co/type=elasticsearch,elasticsearch.k8s.elastic.co/cluster-name=elasticsearch,elasticsearch.k8s.elastic.co/statefulset-name=elasticsearch-es-default
ingress-nginx-controller-admission   ClusterIP      10.152.183.98    <none>            443/TCP                      26h    app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
cert-manager                         ClusterIP      10.152.183.196   <none>            9402/TCP                     26h    app.kubernetes.io/component=controller,app.kubernetes.io/instance=cert-manager,app.kubernetes.io/name=cert-manager
cert-manager-webhook                 ClusterIP      10.152.183.30    <none>            443/TCP                      26h    app.kubernetes.io/component=webhook,app.kubernetes.io/instance=cert-manager,app.kubernetes.io/name=webhook
nginx-elk                            ClusterIP      10.152.183.123   <none>            443/TCP                      26h    app=nginx-elk,release=nginx-elk
fleet-server-agent-http              ClusterIP      10.152.183.130   <none>            8220/TCP                     26h    agent.k8s.elastic.co/name=fleet-server,common.k8s.elastic.co/type=agent
logstash-prod-logstash-headless      ClusterIP      None             <none>            9600/TCP                     5h7m   app=logstash-prod-logstash
logstash-prod-logstash               LoadBalancer   10.152.183.15    192.168.100.101   5044:31489/TCP               5h7m   app=logstash-prod-logstash,chart=logstash,release=logstash-prod
ingress-nginx-controller             LoadBalancer   10.152.183.92    192.168.100.100   80:32092/TCP,443:30932/TCP   26h    app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
neoaggelos commented 2 years ago

Unfortunately, I am unable to reproduce this; using a similar setup (libvirt hosts, bridged networking), the LB IP is reachable as expected.

If you use a NodePort service, can you reach your service from the host?

What is the networking equipment? Is this in a cloud environment? What does the routing table look like in the VMs? Can you use tcpdump and see whether traffic reaches any of the VMs (though this seems unlikely)? Can you also double-check the firewall rules on the host?

Do you see any warning logs in the metallb pods?
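
For example, something along these lines should surface them (label selectors may differ slightly between MetalLB versions):

kubectl -n metallb-system logs -l component=controller --tail=100
kubectl -n metallb-system logs -l component=speaker --tail=100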

sebastienle14 commented 2 years ago

Our hosting team confirms that all the networks the host is plugged into are managed by a Cisco Nexus 3064-T (48 x 10GBase) appliance, which should be manageable, but they know nothing about it yet (they are waiting for training on the matter). At the moment, this appliance is managed by one of our partners, not by ourselves.

I will set up tcpdump on both sides (host & VMs) today and will get back to you.

We tried with a NodePort service, and the same issue occurred.

root@eck-master:~# kubectl -n eck get svc

NAME                                 TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
<snipped>
ingress-nginx-controller             NodePort       10.152.183.92    <none>        80:32092/TCP,443:30932/TCP   21h
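
(For reference, the temporary switch to NodePort was done by changing the service type; one way to do that, as a sketch, would be:

kubectl -n eck patch svc ingress-nginx-controller -p '{"spec":{"type":"NodePort"}}'
)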

root@host ~# nmap -p1-65535 192.168.100.70
Starting Nmap 7.80 ( https://nmap.org ) at 2022-04-06 10:31 UTC
Nmap scan report for eck-master (192.168.100.70)
Host is up (0.0000070s latency).
Not shown: 65523 closed ports
PORT      STATE    SERVICE
22/tcp    open     ssh
111/tcp   open     rpcbind
10250/tcp open     unknown
10255/tcp open     unknown
10257/tcp open     unknown
10259/tcp open     unknown
16443/tcp open     unknown
19001/tcp open     unknown
25000/tcp open     icl-twobase1
30932/tcp filtered unknown
32092/tcp filtered unknown
32381/tcp filtered unknown

Here are the routing tables for the VMs:

root@eck-master:~# ip route show
default via 192.168.100.1 dev ens3 proto static
10.1.97.128/26 via 10.1.97.128 dev vxlan.calico onlink
10.1.102.0/26 via 10.1.102.0 dev vxlan.calico onlink
10.1.118.208 dev cali4790718892a scope link
10.1.118.209 dev calie0d8cbdd008 scope link
10.1.194.128/26 via 10.1.194.128 dev vxlan.calico onlink
192.168.100.0/24 dev ens3 proto kernel scope link src 192.168.100.70
192.168.102.0/24 dev ens4 proto kernel scope link src 192.168.102.70

root@eck-worker01:~# ip route show
default via 192.168.100.1 dev ens3 proto static
10.1.97.128/26 via 10.1.97.128 dev vxlan.calico onlink
10.1.102.0/26 via 10.1.102.0 dev vxlan.calico onlink
10.1.118.192/26 via 10.1.118.192 dev vxlan.calico onlink
10.1.194.129 dev cali736072e4537 scope link
10.1.194.131 dev calia28b2c14776 scope link
10.1.194.132 dev cali34e6a0f39f4 scope link
10.1.194.134 dev cali0ebfa45a26d scope link
10.1.194.135 dev cali5153b943d31 scope link
10.1.194.136 dev calic4101f61c36 scope link
10.1.194.137 dev calic9ba359cc71 scope link
10.1.194.138 dev cali79a8d91cda9 scope link
10.1.194.142 dev calibb1f5dc25d1 scope link
192.168.100.0/24 dev ens3 proto kernel scope link src 192.168.100.71
192.168.102.0/24 dev ens4 proto kernel scope link src 192.168.102.71

root@eck-worker02:~# ip route show
default via 192.168.100.1 dev ens3 proto static
10.1.97.145 dev cali24e82f0ac32 scope link
10.1.97.147 dev cali042c4949e87 scope link
10.1.97.148 dev cali69c374ce9a6 scope link
10.1.97.154 dev cali8b67281a788 scope link
10.1.97.155 dev cali3ab98e7ef56 scope link
10.1.97.156 dev cali0d440ae4e03 scope link
10.1.97.157 dev cali9bbe8a269ad scope link
10.1.97.162 dev cali9382a4fd853 scope link
10.1.97.164 dev cali392ae983540 scope link
10.1.102.0/26 via 10.1.102.0 dev vxlan.calico onlink
10.1.118.192/26 via 10.1.118.192 dev vxlan.calico onlink
10.1.194.128/26 via 10.1.194.128 dev vxlan.calico onlink
192.168.100.0/24 dev ens3 proto kernel scope link src 192.168.100.72
192.168.102.0/24 dev ens4 proto kernel scope link src 192.168.102.72

root@eck-worker03:~# ip route show
default via 192.168.100.1 dev ens3 proto static
10.1.97.128/26 via 10.1.97.128 dev vxlan.calico onlink
10.1.102.3 dev cali88e0e4ee76f scope link
10.1.102.8 dev cali52f2043b50e scope link
10.1.102.9 dev caliac9d8debdd8 scope link
10.1.102.10 dev cali63b96077834 scope link
10.1.118.192/26 via 10.1.118.192 dev vxlan.calico onlink
10.1.194.128/26 via 10.1.194.128 dev vxlan.calico onlink
192.168.100.0/24 dev ens3 proto kernel scope link src 192.168.100.73
192.168.102.0/24 dev ens4 proto kernel scope link src 192.168.102.73

And this is the full iptables-save from host:

# Generated by iptables-save v1.8.7 on Thu Apr  7 09:30:08 2022
*mangle
:PREROUTING ACCEPT [433177972:332515709504]
:INPUT ACCEPT [433176194:332515609320]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [285353385:132902329149]
:POSTROUTING ACCEPT [285353385:132902329149]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
-A LIBVIRT_PRT -o virbr0 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill
COMMIT
# Completed on Thu Apr  7 09:30:08 2022
# Generated by iptables-save v1.8.7 on Thu Apr  7 09:30:08 2022
*raw
:PREROUTING ACCEPT [433177972:332515709504]
:OUTPUT ACCEPT [285353385:132902329149]
COMMIT
# Completed on Thu Apr  7 09:30:08 2022
# Generated by iptables-save v1.8.7 on Thu Apr  7 09:30:08 2022
*filter
:INPUT ACCEPT [433176194:332515609320]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [285353385:132902329149]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWX - [0:0]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
-A INPUT -j LIBVIRT_INP
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A OUTPUT -j LIBVIRT_OUT
-A LIBVIRT_FWI -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A LIBVIRT_FWI -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWO -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A LIBVIRT_FWO -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWX -i virbr0 -o virbr0 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p tcp -m tcp --dport 68 -j ACCEPT
COMMIT
# Completed on Thu Apr  7 09:30:08 2022
# Generated by iptables-save v1.8.7 on Thu Apr  7 09:30:08 2022
*nat
:PREROUTING ACCEPT [1270779:76548008]
:INPUT ACCEPT [1269001:76447824]
:OUTPUT ACCEPT [274161:12540602]
:POSTROUTING ACCEPT [274161:12540602]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
-A LIBVIRT_PRT -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN
-A LIBVIRT_PRT -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN
-A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
-A LIBVIRT_PRT -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN
-A LIBVIRT_PRT -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN
-A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
COMMIT
# Completed on Thu Apr  7 09:30:08 2022

The only recurrent log I see on the metallb-controller is the following line: W0406 21:22:16.792009 1 reflector.go:302] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: watch of *v1.ConfigMap ended with: too old resource version: 10848161 (10849179)

But there are no remarkable warnings about the issue we're discussing, it seems.

Best Regards,

neoaggelos commented 2 years ago

Yes, the routing tables look good. At this point, I think tcpdump is needed to point out what may be going wrong.

We tried with a NodePort service, and the same issue occurred.

I see the nmap, can you also try a curl?

sebastienle14 commented 2 years ago

I see the nmap, can you also try a curl?

Here it is:

root@host:~# curl https://192.168.100.100:30932/ --insecure
curl: (28) Failed to connect to 192.168.100.100 port 30932: Connection timed out
sebastienle14 commented 2 years ago

The captures are too big to attach here, so I created this WeTransfer link: https://we.tl/t-p8xP4Zkyfy

On the host, I used: tcpdump -i any -n not tcp port 22 -w /tmp/net100_lb_all_traffic2.pcap

On the VM, I used: tcpdump -i any -n 'not tcp port 22' -w /tmp/vm_incoming3.pcap

My Wireshark filter is as follows: not ip.dst_host==192.168.102.4 and not ip.src_host==192.168.102.4 and (ip.dst_host==192.168.100.100 or ip.src_host==192.168.100.100)

Host_Capture

VM_Capture

It seems I'm getting ICMP destination-unreachable responses. I also see some STP traffic from the Cisco appliance.

I have to admit my networking skills are a little rusty and I cannot read these traces as easily as my younger self could.
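
A narrower capture focused on ARP, ICMP and the LB address usually makes these traces easier to read; for example, on the host bridge shown earlier:

root@host:~# tcpdump -i net100 -n 'arp or icmp or host 192.168.100.100'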

neoaggelos commented 2 years ago

I see the nmap, can you also try a curl?

Here it is:

root@host:~# curl https://192.168.100.100:30932/ --insecure
curl: (28) Failed to connect to 192.168.100.100 port 30932: Connection timed out

How about curl https://192.168.100.70:30932 (on the IPs of the nodes directly)?

sebastienle14 commented 2 years ago

How about curl https://192.168.100.70:30932 (on the IPs of the nodes directly)?

From the node itself:

root@eck-master:~# curl https://192.168.100.70:30932 --insecure
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

From the host, it stalls indefinitely (so far; I'll let it time out):

root@host:~# curl https://192.168.100.70:30932 --insecure -vvv
*   Trying 192.168.100.70:30932...
* connect to 192.168.100.70 port 30932 failed: Connection timed out
* Failed to connect to 192.168.100.70 port 30932: Connection timed out
* Closing connection 0
curl: (28) Failed to connect to 192.168.100.70 port 30932: Connection timed out

(On another terminal, at the same time:)
root@host:~# nmap -p1-65535 192.168.100.70 -Pn
Starting Nmap 7.80 ( https://nmap.org ) at 2022-04-07 15:08 CEST
Nmap scan report for 192.168.100.70
Host is up (0.00052s latency).
Not shown: 65520 closed ports
PORT      STATE    SERVICE
22/tcp    open     ssh
111/tcp   open     rpcbind
7472/tcp  open     unknown
7946/tcp  open     unknown
10250/tcp open     unknown
10255/tcp open     unknown
10257/tcp open     unknown
10259/tcp open     unknown
16443/tcp open     unknown
19001/tcp open     unknown
25000/tcp open     icl-twobase1
30418/tcp open     unknown
30932/tcp filtered unknown
31489/tcp filtered unknown
32092/tcp filtered unknown
MAC Address: 52:54:00:9F:B0:4D (QEMU virtual NIC)

Nmap done: 1 IP address (1 host up) scanned in 3.24 seconds
root@host:~#

From the other nodes, it also stalls:

root@eck-worker01:~# curl https://192.168.100.70:30932 --insecure
curl: (28) Failed to connect to 192.168.100.70 port 30932: Connection timed out

root@eck-worker01:~# curl https://192.168.100.100:443 --insecure
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

EDIT: the tcpdump captures are in my comment above.

sebastienle14 commented 2 years ago

As a follow-up:

Cross-node, curl on port 30932 does not work, but it works when each node queries itself:

root@eck-master:~# kubectl -n eck get svc
NAME                                 TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)                      AGE
<snipped>
logstash-prod-logstash               LoadBalancer   10.152.183.15    192.168.100.101   5044:31489/TCP               26h
ingress-nginx-controller             LoadBalancer   10.152.183.92    192.168.100.100   80:32092/TCP,443:30932/TCP   2d
root@eck-master:~# curl https://192.168.100.70:30932 --insecure
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

root@eck-worker01:~#  curl https://192.168.100.71:30932 --insecure
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

root@eck-worker03:~#  curl https://192.168.100.73:30932 --insecure
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
neoaggelos commented 2 years ago

Cross-node, curl on port 30932 does not work, but it works when each node queries itself.

Okay, judging from that, I think the most likely scenario is that something is cutting off network traffic on high ports (30000-32xxx). The issues you are facing with the load balancer also stem from that.

Can you test the same scenario, but using the local bridge instead (that would be 192.168.122.0/24, virbr0)? If the issue is with MicroK8s, I would expect the error to persist.
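
As a side check that does not involve MicroK8s at all, you could also run a plain listener on one of the nodes on an unused high port and try to reach it from the host; if that also times out, whatever is filtering the 30000+ range sits outside Kubernetes. A minimal sketch, assuming netcat is installed (30933 is just an arbitrary free port):

root@eck-master:~# nc -l 30933
root@host:~# nc -vz 192.168.100.70 30933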

sebastienle14 commented 2 years ago

I will try to set up the cluster accordingly, but it might take some time; I do not know if the ticket can be put on hold until Monday or so.

Best regards,

sebastienle14 commented 2 years ago

Greetings,

So we tried to set up a POC environment, and so far we have failed to even deploy the LoadBalancer (it stays in Pending status).

poc-master on virbr0 with 192.168.122.193
poc-worker01 on virbr0 with 192.168.122.177
poc-worker02 on virbr0 with 192.168.122.131

We found out that one of the Calico components was failing. We are not sure whether it is linked to our issue. What is your advice?

We tried to recreate the POC cluster from scratch, without any deployment, and calico-kube-controllers is still in a CrashLoopBackOff state.

Here is what we did on poc-master:

root@poc-master:~# snap install microk8s --classic --channel=latest/stable
microk8s v1.23.5 from Canonical✓ installed
root@poc-master:~# usermod -a -G microk8s $USER
root@poc-master:~# chown -f -R $USER ~/.kube
root@poc-master:~# microk8s status --wait-ready
microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    ha-cluster           # Configure high availability on the current node
  disabled:
    ambassador           # Ambassador API Gateway and Ingress
    cilium               # SDN, fast with full network policy
    dashboard            # The Kubernetes dashboard
    dashboard-ingress    # Ingress definition for Kubernetes dashboard
    dns                  # CoreDNS
    fluentd              # Elasticsearch-Fluentd-Kibana logging and monitoring
    gpu                  # Automatic enablement of Nvidia CUDA
    helm                 # Helm 2 - the package manager for Kubernetes
    helm3                # Helm 3 - Kubernetes package manager
    host-access          # Allow Pods connecting to Host services smoothly
    inaccel              # Simplifying FPGA management in Kubernetes
    ingress              # Ingress controller for external access
    istio                # Core Istio service mesh services
    jaeger               # Kubernetes Jaeger operator with its simple config
    kata                 # Kata Containers is a secure runtime with lightweight VMS
    keda                 # Kubernetes-based Event Driven Autoscaling
    knative              # The Knative framework on Kubernetes.
    kubeflow             # Kubeflow for easy ML deployments
    linkerd              # Linkerd is a service mesh for Kubernetes and other frameworks
    metallb              # Loadbalancer for your Kubernetes cluster
    metrics-server       # K8s Metrics Server for API access to service metrics
    multus               # Multus CNI enables attaching multiple network interfaces to pods
    openebs              # OpenEBS is the open-source storage solution for Kubernetes
    openfaas             # OpenFaaS serverless framework
    portainer            # Portainer UI for your Kubernetes cluster
    prometheus           # Prometheus operator for monitoring and logging
    rbac                 # Role-Based Access Control for authorisation
    registry             # Private image registry exposed on localhost:32000
    storage              # Storage class; allocates storage from host directory
    traefik              # traefik Ingress controller for external access
root@poc-master:~# microk8s enable dns storage
Enabling DNS
Applying manifest
serviceaccount/coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
clusterrole.rbac.authorization.k8s.io/coredns created
clusterrolebinding.rbac.authorization.k8s.io/coredns created
Restarting kubelet
DNS is enabled
Enabling default storage class
deployment.apps/hostpath-provisioner created
storageclass.storage.k8s.io/microk8s-hostpath created
serviceaccount/microk8s-hostpath created
clusterrole.rbac.authorization.k8s.io/microk8s-hostpath created
clusterrolebinding.rbac.authorization.k8s.io/microk8s-hostpath created
Storage will be available soon
root@poc-master:~# vi /var/snap/microk8s/current/certs/csr.conf.template
root@poc-master:~# microk8s refresh-certs
Taking a backup of the current certificates under /var/snap/microk8s/3052/var/log/ca-backup/
Creating new certificates
Can't load /root/.rnd into RNG
140342700733888:error:2406F079:random number generator:RAND_load_file:Cannot open file:../crypto/rand/randfile.c:88:Filename=/root/.rnd
Can't load /root/.rnd into RNG
140211887805888:error:2406F079:random number generator:RAND_load_file:Cannot open file:../crypto/rand/randfile.c:88:Filename=/root/.rnd
Signature ok
subject=C = GB, ST = Canonical, L = Canonical, O = Canonical, OU = Canonical, CN = 127.0.0.1
Getting CA Private Key
Signature ok
subject=CN = front-proxy-client
Getting CA Private Key
1
Creating new kubeconfig file
Stopped.
Started.

The CA certificates have been replaced. Kubernetes will restart the pods of your workloads.
Any worker nodes you may have in your cluster need to be removed and re-joined to become aware of the new CA.

root@poc-master:~# microk8s enable metallb:192.168.122.200-192.168.122.209
Enabling MetalLB
Applying Metallb manifest
namespace/metallb-system created
secret/memberlist created
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/controller created
podsecuritypolicy.policy/speaker created
serviceaccount/controller created
serviceaccount/speaker created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller created
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created
role.rbac.authorization.k8s.io/config-watcher created
role.rbac.authorization.k8s.io/pod-lister created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created
rolebinding.rbac.authorization.k8s.io/config-watcher created
rolebinding.rbac.authorization.k8s.io/pod-lister created
Warning: spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
daemonset.apps/speaker created
deployment.apps/controller created
configmap/config created
MetalLB is enabled
root@poc-master:~# kubectl taint nodes poc-master node-role.kubernetes.io/master=:NoSchedule
node/poc-master tainted
root@poc-master:~# microk8s add-node

On the workers, here is what we did:

root@poc-worker01:~# snap install microk8s --classic --channel=latest/stable
microk8s v1.23.5 from Canonical✓ installed
root@poc-worker01:~# microk8s join 192.168.122.193:25000/fc169cbb2003ec5642dd050b9afb11d8/127e916e28e6 --worker
Contacting cluster at 192.168.122.193

The node has joined the cluster and will appear in the nodes list in a few seconds.

Currently this worker node is configured with the following kubernetes API server endpoints:
    - 192.168.122.193 and port 16443, this is the cluster node contacted during the join operation.

If the above endpoints are incorrect, incomplete or if the API servers are behind a loadbalancer please update
/var/snap/microk8s/current/args/traefik/provider.yaml

And here is the result of kubectl from the host:

root@host:~/# kubectl get all --all-namespaces
NAMESPACE        NAME                                           READY   STATUS             RESTARTS        AGE
kube-system      pod/coredns-64c6478b6c-p2qkx                   0/1     Running            1 (5m21s ago)   6m14s
metallb-system   pod/speaker-6vw7v                              1/1     Running            0               3m45s
kube-system      pod/calico-node-nmq6m                          1/1     Running            0               93s
metallb-system   pod/speaker-999mv                              1/1     Running            0               93s
kube-system      pod/hostpath-provisioner-7764447d7c-j7xdj      1/1     Running            0               3m45s
metallb-system   pod/controller-558b7b958-z4gnh                 1/1     Running            0               3m45s
kube-system      pod/calico-node-ps67f                          1/1     Running            0               93s
metallb-system   pod/speaker-p5pch                              1/1     Running            0               63s
kube-system      pod/calico-node-5j4mn                          1/1     Running            0               62s
kube-system      pod/calico-kube-controllers-55bcdcf5c6-ht8dz   0/1     CrashLoopBackOff   8 (23s ago)     6m51s

NAMESPACE     NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP                  6m57s
kube-system   service/kube-dns     ClusterIP   10.152.183.10   <none>        53/UDP,53/TCP,9153/TCP   6m14s

NAMESPACE        NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
metallb-system   daemonset.apps/speaker       3         3         3       3            3           beta.kubernetes.io/os=linux   4m49s
kube-system      daemonset.apps/calico-node   3         3         3       3            3           kubernetes.io/os=linux        6m56s

NAMESPACE        NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system      deployment.apps/calico-kube-controllers   0/1     1            0           6m56s
kube-system      deployment.apps/coredns                   0/1     1            0           6m14s
kube-system      deployment.apps/hostpath-provisioner      1/1     1            1           6m4s
metallb-system   deployment.apps/controller                1/1     1            1           4m49s

NAMESPACE        NAME                                                 DESIRED   CURRENT   READY   AGE
kube-system      replicaset.apps/calico-kube-controllers-55bcdcf5c6   1         1         0       6m52s
kube-system      replicaset.apps/coredns-64c6478b6c                   1         1         0       6m14s
kube-system      replicaset.apps/hostpath-provisioner-7764447d7c      1         1         1       3m45s
metallb-system   replicaset.apps/controller-558b7b958                 1         1         1       3m45s

And when we tried deploying nginx-ingress, here was the state of the cluster:

NAMESPACE        NAME                                            READY   STATUS    RESTARTS       AGE
<snipped>
kube-system      pod/calico-kube-controllers-7c6fcdff9f-5hzsf    0/1     Error     13 (32s ago)   25h
NAMESPACE     NAME                                         TYPE           CLUSTER-IP                                   EXTERNAL-IP   PORT(S)                     AGE
eck           service/cert-manager-webhook                 ClusterIP      10.152.183.238                               <none>        443/TCP                     11m
eck           service/cert-manager                         ClusterIP      10.152.183.82                                <none>        9402/TCP                    11m
eck           service/ingress-nginx-controller-admission   ClusterIP      10.152.183.14                                <none>        443/TCP                     11m
eck           service/ingress-nginx-controller             LoadBalancer   10.152.183.68                                <pending>    80:31764/TCP,443:31426/TCP   11m
~# kubectl logs calico-kube-controllers-7c6fcdff9f-5hzsf -n kube-system
2022-04-14 09:32:22.052 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0414 09:32:22.054172       1 client_config.go:543] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2022-04-14 09:32:22.055 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2022-04-14 09:32:32.055 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=Get "https://10.152.183.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded
2022-04-14 09:32:32.055 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=Get "https://10.152.183.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded
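
The "context deadline exceeded" against 10.152.183.1:443 suggests the pod cannot reach the kubernetes service ClusterIP. A couple of quick connectivity checks from the POC master, for what it is worth (the address is just the ClusterIP from the error; even a 401/403 reply would show the path itself works):

root@poc-master:~# curl -k https://10.152.183.1:443/version
root@poc-master:~# kubectl -n kube-system get pod -o wide | grep calico-kube-controllers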

On the original cluster from the OP, we do not have issues with the calico controller:

2022-04-14 09:43:49.608 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0414 09:43:49.612469       1 client_config.go:543] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2022-04-14 09:43:49.613 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2022-04-14 09:43:56.814 [INFO][1] main.go 149: Getting initial config snapshot from datastore
2022-04-14 09:43:56.853 [INFO][1] main.go 152: Got initial config snapshot
2022-04-14 09:43:56.854 [INFO][1] watchersyncer.go 89: Start called
2022-04-14 09:43:56.854 [INFO][1] main.go 169: Starting status report routine
2022-04-14 09:43:56.854 [INFO][1] main.go 402: Starting controller ControllerType="Node"
2022-04-14 09:43:56.854 [INFO][1] node_controller.go 138: Starting Node controller
2022-04-14 09:43:56.854 [INFO][1] watchersyncer.go 127: Sending status update Status=wait-for-ready
2022-04-14 09:43:56.854 [INFO][1] node_syncer.go 40: Node controller syncer status updated: wait-for-ready
2022-04-14 09:43:56.854 [INFO][1] watchersyncer.go 147: Starting main event processing loop
2022-04-14 09:43:56.854 [INFO][1] watchercache.go 174: Full resync is required ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2022-04-14 09:43:56.854 [INFO][1] resources.go 349: Main client watcher loop
2022-04-14 09:43:56.868 [INFO][1] watchercache.go 271: Sending synced update ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2022-04-14 09:43:56.868 [INFO][1] watchersyncer.go 127: Sending status update Status=resync
2022-04-14 09:43:56.868 [INFO][1] node_syncer.go 40: Node controller syncer status updated: resync
2022-04-14 09:43:56.868 [INFO][1] watchersyncer.go 209: Received InSync event from one of the watcher caches
2022-04-14 09:43:56.868 [INFO][1] watchersyncer.go 221: All watchers have sync'd data - sending data and final sync
2022-04-14 09:43:56.868 [INFO][1] watchersyncer.go 127: Sending status update Status=in-sync
2022-04-14 09:43:56.868 [INFO][1] node_syncer.go 40: Node controller syncer status updated: in-sync
2022-04-14 09:43:56.882 [INFO][1] hostendpoints.go 90: successfully synced all hostendpoints
2022-04-14 09:43:56.955 [INFO][1] node_controller.go 151: Node controller is now running
2022-04-14 09:43:56.955 [INFO][1] ipam.go 45: Synchronizing IPAM data
2022-04-14 09:43:56.999 [INFO][1] ipam.go 191: Node and IPAM data is in sync
sebastienle14 commented 2 years ago

@neoaggelos I'll also add our netplan config file from the host to the discussion:

~# cat /etc/netplan/00-installer-config.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: false
      dhcp6: false
      match:
          macaddress: b0:7b:25:be:ea:da
    enp2s0f0np0:
      dhcp4: false
      dhcp6: false
      match:
          macaddress: 2c:ea:7f:a7:49:3b
  bridges:
    net100:
      interfaces: [eno1]
      mtu: 1500
      addresses: [ 192.168.100.56/24 ]
      nameservers:
          addresses:
             - 109.205.64.35
             - 208.67.222.222
             - 208.67.220.220
          search: []
      routes:
          - to: default
            via: 192.168.100.1
    net102:
      interfaces: [enp2s0f0np0]
      mtu: 1500
      addresses: [ 192.168.102.4/24 ]
sebastienle14 commented 2 years ago

I tried to bypass the Cisco appliance by plugging an L2 switch into the NIC; the issue remains.

root@host:~# arping -I net100 192.168.100.100
ARPING 192.168.100.100 from 192.168.100.56 net100
Unicast reply from 192.168.100.100 [52:54:00:A8:16:6A]  0.937ms
Unicast reply from 192.168.100.100 [52:54:00:A8:16:6A]  1.071ms
Unicast reply from 192.168.100.100 [52:54:00:A8:16:6A]  0.983ms
Unicast reply from 192.168.100.100 [52:54:00:A8:16:6A]  1.088ms
Sent 4 probes (1 broadcast(s))

The issue remains: no route to host when trying to reach 192.168.100.100 on port 443.
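
One more datapoint that might help narrow this down: the MAC replying for the LB IP (52:54:00:A8:16:6A above) can be compared against the VMs' NIC MACs to see which node MetalLB elected to announce the address, and against what the host has cached for that IP, e.g.:

root@eck-worker01:~# ip -br link show
root@host:~# ip neigh show 192.168.100.100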

sebastienle14 commented 2 years ago

OK,

After further investigation, it seems the ha-cluster addon was to blame. After disabling it and setting up the cluster again, everything seems fine.

With ha-cluster disabled, there is no Calico layer deployed anymore, it seems.

We will run it for a week and will let you know how it goes.
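
(For anyone landing here: the addon in question can be toggled with the command below. As described above, disabling it also removes the Calico layer, so it is easiest to do on a freshly installed node before joining the others; depending on the MicroK8s version the command may ask for confirmation.)

microk8s disable ha-cluster
microk8s status --wait-ready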

Naegionn commented 1 year ago

Hello, we ran into a similar problem. Is there anything known about problems with the HA configuration and MetalLB?

stale[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.