DHCP replies are sent from the primary IP of the interface

rgruyters commented 4 months ago

When configuring Smee and the stack to use a secondary IP address for load balancing, the reply traffic is sent from the primary IP, not from the secondary (load-balancing) IP.

Expected Behaviour

I would expect kube-vip to send the reply traffic from the secondary IP.

1719998545.533120 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 382: 10.128.112.161.67 > 10.128.161.133.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 340
1719998545.534029 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 406: 10.128.161.133.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 364
1719998545.534114 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 442: 10.128.161.133.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 400
1719998545.534205 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 424: 10.128.161.133.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 382

Current Behaviour

1719998545.533120 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 382: 10.128.112.161.67 > 10.128.161.133.67: BOOTP/DHCP, Request from xx:xx:xx:xx:xx:xx, length 340
1719998545.534029 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 406: 10.128.161.132.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 364
1719998545.534114 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 442: 10.128.161.132.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 400
1719998545.534205 xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, ethertype IPv4 (0x0800), length 424: 10.128.161.132.435 > 10.128.112.161.67: BOOTP/DHCP, Reply, length 382

Possible Solution

I have no idea

Steps to Reproduce (for bugs)

Deploy Tinkerbell

trusted_proxies=$(kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' | tr ' ' ',')
LB_IP=10.128.161.133
helm install tink-stack charts/tinkerbell/stack \
  --create-namespace --namespace tink-system --wait \
  --set "smee.trustedProxies={${trusted_proxies}}" \
  --set "hegel.trustedProxies={${trusted_proxies}}" \
  --set "stack.loadBalancerIP=$LB_IP" \
  --set "smee.publicIP=$LB_IP"
Request DHCP from a node
Watch the traffic
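
The traffic-watching step can be sketched as a packet capture on the Kubernetes host. This is only a minimal sketch: the interface name eth0 is an assumption that does not come from the report, and the flags are chosen to mirror the capture format shown above (-tt for epoch timestamps, -e for link-level headers, -n to skip name resolution).

```shell
# Capture DHCP/BOOTP traffic on one interface.
# Usage: capture_dhcp [interface]   (defaults to eth0, a hypothetical name)
capture_dhcp() {
  iface="${1:-eth0}"
  # -n: no name resolution, -e: print link-level header, -tt: epoch timestamps
  tcpdump -i "$iface" -n -e -tt 'udp port 67 or udp port 68'
}
```

Run capture_dhcp with the host's real interface name (usually as root), then trigger a DHCP request from a node and compare the source address of the replies.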

Context

Cannot use Tinkerbell service

Your Environment

Operating System and version (e.g. Linux, Windows, MacOS): Ubuntu 22.04.4 LTS with K3s version 1.30.0+k3s1
How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details: KVM
Link to your project or a code example to reproduce issue: K3s is deployed with default settings

rgruyters commented 4 months ago

It looks like this issue is only with UDP traffic, not TCP.

jacobweinstock commented 4 months ago

Hey @rgruyters, thanks for posting this. Mind clarifying a bit? What are the IPs involved? Also, I'm not understanding the effect this is having. Mind expanding on "Cannot use Tinkerbell service"?

IP	Description
10.128.112.161	dhcp client?
10.128.161.133	?
10.128.161.132	?

rgruyters commented 2 months ago

Sure!

IP	Description
10.128.112.161	DHCP client
10.128.161.133	Secondary IP for LoadBalancer to use with Tinkerbell
10.128.161.132	Host IP (where Kubernetes is running)

jacobweinstock commented 2 months ago

It is normal Kubernetes behavior for traffic originating from within a pod to be sent out via the Host's IP. As DHCP traffic is UDP and connectionless, all DHCP packets sent by Smee can be classified as originating from within the Smee pod. Furthermore, Kube-vip doesn't create routing rules. If you look at the interface that has the IP configured by kube-vip you'll see that it creates the IP with a /32. This means this IP will not be used for routing when the host's routing table is used.
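
To check this on the host, the following sketch lists the IPv4 addresses on an interface and asks the kernel which source address it would select for a given destination. The interface name eth0 is an assumption, and the default destination is the DHCP client/relay address from the capture above. It only covers source selection from the host routing table, not any NAT applied to pod traffic.

```shell
# Show IPv4 addresses on an interface and the source address the kernel
# would select for a destination, based on the host routing table.
# Usage: show_reply_source [interface] [destination]
show_reply_source() {
  iface="${1:-eth0}"            # hypothetical interface name
  dest="${2:-10.128.112.161}"   # relay address taken from the capture
  # The address added by kube-vip should be listed here with a /32 prefix.
  ip -4 addr show dev "$iface"
  # The "src" field in the output is the preferred source address for this route.
  ip route get "$dest"
}
```

If the "src" reported by ip route get is the host IP rather than the load balancer IP, that matches the reply pattern seen in the capture.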

Is this traffic pattern causing issues of some kind?

rgruyters commented 2 months ago

Yeah, when DHCP traffic is passed through a relay address, in this case 10.128.112.161, it won't work, because the reply traffic comes from a different IP, 10.128.161.132, rather than the expected 10.128.161.133.

jacobweinstock commented 2 months ago

yeah, when DHCP traffic is passed through a relay address, in this case 10.128.112.161, it won't work, because reply traffic comes from a different IP, 10.128.161.132, rather than the expected 10.128.161.133.

Hey @rgruyters, what do you mean by "it won't work"? What exactly isn't working? Is there a DHCP relay in use in your environment?

rgruyters commented 2 months ago

Yes, we use DHCP relays to pass DHCP requests to our Tinkerbell service.

It won't work in the sense that the reply traffic from 10.128.161.132 will not be accepted by the relay process, because the initial traffic was sent to 10.128.161.133.

jacobweinstock commented 2 months ago

Mind sharing more info about the DHCP relay you're using? I'm not familiar with this kind of IP filtering. Also, have you tried deploying the stack with stack.relay.presentGiaddrAction: forward?

rgruyters commented 1 month ago

Sorry for the late response. We use Cumulus switches with a DHCP relay running on them. I think they use the ISC DHCP service.

Also, have you tried deploying the stack with stack.relay.presentGiaddrAction: forward?

No, I haven't. I will look into it. Thanks!

jacobweinstock commented 1 month ago

Hey @rgruyters, thanks for sharing some details on your switches. I see you closed the issue. Was this on purpose? Maybe you were able to resolve the issue?

rgruyters commented 1 month ago

I have closed it because the option to set stack.relay.presentGiaddrAction: forward would work for us (for dhcrelay, this would be the -m forward option).
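
For anyone landing here with the same symptom, here is a minimal sketch of applying that setting to the release from the reproduction steps. The release name, chart path, and namespace are taken from the original helm install command. Rolling the change out with helm upgrade and --reuse-values is an assumption, not something confirmed in this thread.

```shell
# Set the relay option on the existing tink-stack release.
# Assumes the release was installed as shown in "Steps to Reproduce".
enable_giaddr_forward() {
  helm upgrade tink-stack charts/tinkerbell/stack \
    --namespace tink-system \
    --reuse-values \
    --wait \
    --set "stack.relay.presentGiaddrAction=forward"
}
```

Calling enable_giaddr_forward from the root of a checkout of the charts repository would apply the change. Re-running the capture afterwards is a reasonable way to check whether the relay now accepts the replies.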

jacobweinstock commented 1 month ago

Thanks for the update. Glad to hear that works.
