Closed ThorbenJ closed 2 months ago
Seems a dupe of https://github.com/metallb/metallb/issues/253 , well described in https://github.com/metallb/metallb/issues/253#issuecomment-1098300839
Closing as not fixable in MetalLB itself, see also https://github.com/metallb/metallb/issues/535
EDIT: While this works, enabling hairpin appears to lead to network storms, so unfortunately this is not a complete solution.
I've been looking at this and here is my solution: on the physical (metal) NIC where you expect connections to arrive, enable hairpin mode (allows packets/frames to leave through the port they entered on) and enable proxy_arp (allows the port to answer ARP on behalf of others' MACs):
ip link set dev enP3p49s0 type bridge_slave proxy_arp on hairpin on
Before:
curl -k https://172.17.43.1/
curl: (7) Failed to connect to 172.17.43.1 port 443 after 3032 ms: Couldn't connect to server
After:
curl -k https://172.17.43.1/
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
Adding this comment for others to find, maybe the docs could be amended?
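For anyone trying the workaround above, here is a small sketch for confirming that the two port flags actually took effect. It assumes the detailed output of `ip -d link show` lists the bridge port options as `hairpin on|off` and `proxy_arp on|off` on the `bridge_slave` line; the helper name `port_flag` is made up for this example.

```shell
# Read `ip -d link show dev <port>` output on stdin and print the state
# ("on" or "off") of one bridge port flag.
# $1 = flag name as printed by iproute2, e.g. hairpin or proxy_arp
port_flag() {
    sed -nE "s/(^|.* )$1 (on|off)( .*)?\$/\2/p" | head -n 1
}

# Usage on the node (interface name taken from the comment above):
#   ip -d link show dev enP3p49s0 | port_flag hairpin
#   ip -d link show dev enP3p49s0 | port_flag proxy_arp
#
# To back the change out again, e.g. if hairpin causes storms:
#   ip link set dev enP3p49s0 type bridge_slave proxy_arp off hairpin off
```

The settings made with `ip link set` are not persistent, so they would need to be reapplied after a reboot or interface restart.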
MetalLB Version
0.14.5
Deployment method
Manifests
Main CNI
bridge
Kubernetes Version
v1.30.2+k3s1
Cluster Distribution
k3s
Describe the bug
I am using the CNI bridge plugin (https://www.cni.dev/plugins/current/main/bridge/) connected to a Linux (6.8.11) VLAN-aware bridge. In normal operation the MetalLB-managed service IP can be reached from the host OS of other k8s nodes, but not from outside the cluster, not even from the LAN router. However, when I put the bridge on the node currently assigned for L2 advertisements into promisc mode, it starts to work! I found this out by accident when I attached tcpdump to the bridge, and verified it with
ip link set bridge promisc on
afterwards. Interestingly, I see the L2 advertisements on the router, but no packets come back unless the bridge is in promisc mode. I don't think it's a good idea to put all bridges on all nodes into promisc mode. I tried turning various things off with ethtool, with no luck.
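Since tcpdump enables promiscuous mode by default, it can hide this kind of problem while you are debugging it. A rough sketch for checking whether the bridge is currently promiscuous, assuming `ip -d link show` prints a `promiscuity N` counter (the helper name `promiscuity_of` is made up for this example):

```shell
# Read `ip -d link show dev <iface>` output on stdin and print the
# kernel's promiscuity counter (0 means not promiscuous).
promiscuity_of() {
    sed -n 's/.*promiscuity \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on the node (the interface name "bridge" is from the report above):
#   ip -d link show dev bridge | promiscuity_of
#
# To capture without switching the interface into promiscuous mode,
# tcpdump has the -p flag:
#   tcpdump -p -n -i bridge arp
```

A counter above 0 while no one has explicitly set `promisc on` would suggest a capture tool is holding the interface in promiscuous mode.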
To Reproduce
From another node: curl -k https://172.17.43.1/api -> works
From the router (OpenWrt) and beyond: curl -k https://172.17.43.1/api -> only works when the bridge is in promisc mode
The service:
MetalLB config:
/etc/network/interfaces (snippet):
/etc/cni/net.d/10-front-bridge-cni.conf
The native VLAN is the host/node network, a /23: the first /24 is for hosts and the second /24 is for MetalLB. VLAN 43 connects pods across all nodes.
The private class E pod/svc network may not leave the host:
-A POSTROUTING -s 240.0.0.0/4 ! -d 240.0.0.0/4 -j MASQUERADE
Expected Behavior
It should always work, even when the bridge is not in promisc mode.
Additional Context
I tried searching the docs for "bridge" and looked at issues that mentioned "bridge", but found nothing that appeared to apply or help. I also tried explicitly setting the interface, once to the bridge and another time to the physical NIC.
(Honesty: I only tried the most recently tagged MetalLB, as noted above, not from master)
I've read and agree with the following