milosmns opened this issue 2 years ago
As stated in this comment:

> Klipper-lb does not change the source IP. It supports `externalTrafficPolicy: Local`.
I guess you're using k3s as your Kubernetes distribution and probably Traefik as your cluster router (the default router for k3s).
If you're using Traefik, here's how you get the original client IP address (for other routers, the settings may be a bit different, but the same logic still applies):
1. Set `externalTrafficPolicy` to `Local`. If you use the Traefik helm chart you can set the values to:

   ```yaml
   service:
     spec:
       externalTrafficPolicy: Local
   ```
2. If you have multiple nodes, make sure that your router is running on the node where you send the traffic to. Let's say you have the domain `example.com` and it points to your cluster node with the IP 123.4.5.67 (e.g. via a DNS A record). Then you only have to make sure that your router (the Traefik instance) is running on this node. In the Traefik helm chart you can achieve that with the `nodeAffinity` config, for example:

   ```yaml
   affinity:
     nodeAffinity:
       preferredDuringSchedulingIgnoredDuringExecution:
         - weight: 10
           preference:
             matchExpressions:
               - key: kubernetes.io/hostname
                 operator: In
                 values: [ "node2" ]
   ```
   You could even have multiple nodes in your `nodeAffinity` list with different weights (the higher the weight, the more likely the pod is scheduled on that node), e.g.:

   ```yaml
   affinity:
     nodeAffinity:
       preferredDuringSchedulingIgnoredDuringExecution:
         - weight: 20
           preference:
             matchExpressions:
               - key: kubernetes.io/hostname
                 operator: In
                 values: [ "node1" ]
         - weight: 10
           preference:
             matchExpressions:
               - key: kubernetes.io/hostname
                 operator: In
                 values: [ "node2" ]
   ```
   Just replace the node name in the `values` array and adjust the weight to your needs. To get the `kubernetes.io/hostname` label for each of your nodes you can run this command (in most cases the node name and `kubernetes.io/hostname` label are identical):

   ```shell
   kubectl get nodes -o custom-columns="NAME:.metadata.name,LABEL (kubernetes.io/hostname):{.metadata.labels.kubernetes\.io/hostname}"
   ```
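A quick way to check which node the Traefik pod actually landed on (a sketch; it assumes the `app.kubernetes.io/name=traefik` label set by the Helm chart and the `kube-system` namespace used by k3s):

```shell
# Show the Traefik pod together with the node it was scheduled on
kubectl get pods -n kube-system -l app.kubernetes.io/name=traefik -o wide
```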
For more details have a look at this article: K3S Thing: Make Traefik Forward Real Client IP. The only problem with that article is that it only offers a `DaemonSet` as the solution (instead of the default deployment of kind `Deployment`), which prevents you from using Traefik to generate SSL certificates (ACME certificate resolvers in Traefik are only available in `Deployment` mode).
As per the guide's 3rd step, I disabled Traefik and was using the Nginx Ingress Controller.
Ah sorry, I didn't read that guide. But you can still use my answer to fix your problem. Just make sure that `externalTrafficPolicy` is set to `Local` (as documented here for nginx) and that your `nodeAffinity` is set as described in my comment above (here's how you set the affinity in the nginx helm chart).
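For reference, a minimal sketch of what that could look like in the ingress-nginx Helm chart values (key names taken from the chart's `controller` section; replace the hostname with your own node):

```yaml
controller:
  service:
    externalTrafficPolicy: Local
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 10
          preference:
            matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values: [ "node2" ]
```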
I am running one node. I set it to `Local` but I'm still getting the `svclb` IP. I think it's a bug?
@Taymindis Where and how did you set the externalTrafficPolicy to Local? Because if you do it at runtime you have to restart Traefik. And how do you check it? With the whoami container from containous?
> @Taymindis Where and how did you set the externalTrafficPolicy to Local? Because if you do it at runtime you have to restart Traefik. And how do you check it? With the whoami container from containous?
Hi @mamiu, I am running bare-metal k3s without `traefik` on an Ubuntu VM.
Here are my steps. I set `externalTrafficPolicy` to `Local` in the ingress-nginx Helm values:
```yaml
service:
  enabled: true
  # -- If enabled is adding an appProtocol option for Kubernetes service. An appProtocol field replacing annotations that were
  # using for setting a backend protocol. Here is an example for AWS: service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
  # It allows choosing the protocol for each backend specified in the Kubernetes service.
  # See the following GitHub issue for more details about the purpose: https://github.com/kubernetes/kubernetes/issues/40244
  # Will be ignored for Kubernetes versions older than 1.20
  ##
  appProtocol: true
  annotations: {}
  labels: {}
  # clusterIP: ""
  # -- List of IP addresses at which the controller services are available
  ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
  ##
  externalIPs: []
  # loadBalancerIP: ""
  loadBalancerSourceRanges: []
  enableHttp: true
  enableHttps: true
  ## Set external traffic policy to: "Local" to preserve source IP on providers supporting it.
  ## Ref: https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-typeloadbalancer
  externalTrafficPolicy: "Local"
```
And I have an app pod which echoes the client IP back when we hit a specific URL.
Please note that I am not using `traefik`, I am using `kubernetes/ingress-nginx`.
> Klipper-lb does not change the source IP. It supports `externalTrafficPolicy: Local`.
Sorry, but that is simply not true. From the klipper-lb `entry` script:

```shell
iptables -t nat -I POSTROUTING -d ${dest_ip}/32 -p ${DEST_PROTO} -j MASQUERADE
```

This configures iptables to do exactly that: MASQUERADE rewrites the source address of forwarded packets.
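If you want to verify this on a running cluster, one option (a sketch; the pod and container names are illustrative, and the rules live where the `entry` script runs, i.e. inside the svclb container) is:

```shell
# List the NAT rules programmed by klipper-lb's entry script.
# Find the real pod name first: kubectl get pods -n kube-system | grep svclb
kubectl exec -n kube-system svclb-traefik-xxxxx -c lb-tcp-80 -- iptables -t nat -S POSTROUTING
```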
The whole thing about getting the router to run on the same node as the end service indeed resolves this, but it rather defeats the purpose of load balancing...
> The whole thing about getting the router to run on the same node as the end service indeed resolves this, but it rather defeats the purpose of load balancing...
@jeroenrnl I 100% agree with that! But I haven't found a good alternative solution yet.
> And I have an app pod which echoes the client IP back when we hit a specific URL.
@Taymindis Your router (in your case nginx) will get the correct client IP address, but it then has to translate it (SNAT) so that your app pod (which echoes the client IP address) sends its responses back through the router. The router can't forward the request with your client IP as the source address, because then your app would try to respond to the client IP directly, skipping the extra step back through the router, and that's not how this kind of networking works (for more details watch this YouTube video). To solve this issue, load balancers, routers, reverse proxies, etc. use a special HTTP header called `X-Forwarded-For`. Most routers (including ingress-nginx, see here) support this HTTP header, and many applications treat the value passed in that header as the client IP address (or at least have an option to enable that).
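To make that concrete (addresses are made up, and the exact header set depends on the router's configuration), the request your app receives from the router would then look something like:

```
GET / HTTP/1.1
Host: example.com
X-Forwarded-For: 203.0.113.7
X-Real-IP: 203.0.113.7
```

Your app should read the client address from `X-Forwarded-For` (or `X-Real-IP`) instead of the TCP source address.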
The PROXY protocol as defined on haproxy.org provides a solution to this issue. It makes it possible to keep proper load balancing while preserving access to the client IP, and it isn't specific to HTTP.
It requires that Klipper LB and the backend of the service be compatible with that protocol. Traefik and nginx already support it, amongst many others.
Maybe this could get implemented in Klipper LB behind a togglable flag? Although it would require a much more complex setup than the current iptables rules, which cannot add the required header.
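For context, the receiving side is the easy part. In Traefik, for example, accepting PROXY protocol on an entry point is a small piece of static configuration like the sketch below (the CIDR is illustrative); the missing piece is that klipper-lb would also have to *send* the PROXY header, which plain iptables NAT rules cannot do:

```yaml
# Traefik static configuration (sketch): accept PROXY protocol from trusted sources
entryPoints:
  web:
    address: ":80"
    proxyProtocol:
      trustedIPs:
        - "10.42.0.0/16"   # illustrative: the range the load balancer traffic arrives from
```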
@mamiu Is there a way to achieve your solution in an HA (high availability) k3s setup? I'm talking about this: https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
Would it mean that the masters have Traefik running on them with `externalTrafficPolicy: Local`? I'm unsure how to achieve this.
@sandys Yes, it's definitely possible. But it only works if the Traefik instance runs on the node where you send the traffic to, as explained in my comment up here.
@sandys @mamiu
I have been looking into this for 3 weeks, and I observed the behavior described by mamiu:
the client IP is preserved only if the traffic arrives at the klipper instance that is on the same node as the Traefik service.
This is a major problem for a load balancer, especially with the default Helm configuration: it creates one instance of the Traefik service and as many klipper services as there are nodes.
Perhaps a solution would be to have a Traefik instance on each node, with each klipper instance pointing to the Traefik pod on the same node.
However, this is beyond my capabilities at the moment; I am not comfortable with Helm.
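For anyone who wants to try that idea: in the Traefik Helm chart, running one Traefik pod per node is roughly a values change like the sketch below (key names may differ slightly between chart versions), with the caveat mentioned earlier that DaemonSet mode rules out Traefik's built-in ACME certificate resolvers:

```yaml
# Traefik Helm chart values (sketch): one Traefik pod on every node
deployment:
  kind: DaemonSet
# Optionally bind Traefik directly to the host ports so traffic doesn't go through klipper-lb
ports:
  web:
    hostPort: 80
  websecure:
    hostPort: 443
```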
Anyone still dealing with this?
I am, but I am at my wits' end. I'm running a fairly simple, well-documented setup (https://github.com/MikaelElkiaer/flux-twr). It is rootless. I cannot get the proper client IP in my Traefik requests.
> As stated in this comment: Klipper-lb does not change the source IP. It supports `externalTrafficPolicy: Local`.
>
> [... the rest of that answer is quoted in full above: set `externalTrafficPolicy: Local`, pin Traefik to the target node via `nodeAffinity`, and note the `DaemonSet` vs. `Deployment` caveat for ACME ...]
I am unable to replicate this behavior on my multi-node k3s cluster. I have set up Traefik's affinity to the correct node, can confirm that it is scheduled on the right node, and have `externalTrafficPolicy` set correctly, but the Traefik access logs don't show the real IP.
I also have Pi-hole DNS running on a `LoadBalancer` on port 53, and can confirm that it is also unable to see the real IP when it gets requests. Even if it did work, having to schedule all of my pods on a single node for the sake of proper logging feels like it defeats the purpose of having a load balancer in the first place.
Is there a way to get this working? It's entirely possible I am missing something here.
@dakota-marshall I'd recommend not using klipper (the default load balancer of k3s), but instead having a Traefik instance running on each node that listens for external traffic. I know this will prevent you from using Traefik's Let's Encrypt integration, but if you want that you can just use Switchboard.
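If you go that route, ServiceLB (klipper) can be switched off at the k3s level, for example via the k3s config file (a sketch; the equivalent `--disable servicelb` server flag works too):

```yaml
# /etc/rancher/k3s/config.yaml (sketch)
disable:
  - servicelb
```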
@mamiu Thanks for the info! In that case I'll look at switching over to that and using MetalLB for my other services that need a `LoadBalancer`.
I have the same issue. I'm using `k3s`, I have disabled `traefik` (but not `service-lb`) and I'm using the NGINX Ingress Controller with the default `service-lb` (`klipper-lb`). Here is my controller configuration:
"controller": {
"kind": "DaemonSet",
"allowSnippetAnnotations": True,
"service": {
"externalTrafficPolicy": "Local",
},
"config": {
"enable-real-ip": True,
"use-forwarded-headers": True,
"compute-full-forwarded-for": True,
"use-proxy-protocol": True,
"proxy-add-original-uri-header": True,
"forwarded-for-header": "proxy_protocol",
"real-ip-header": "proxy_protocol",
},
},
```
kubectl logs -n apps-dev ingress-nginx-controller
" while reading PROXY protocol, client: 10.42.0.10, server: 0.0.0.0:80
2024/01/09 19:28:10 [error] 98#98: *2465 broken header: "GET / HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/119.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-CA,en-US;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
" while reading PROXY protocol, client: 10.42.0.10, server: 0.0.0.0:80
```
kubectl describe services -n apps-dev ingress-nginx-dev-c34ab985-controlle
Name: ingress-nginx-dev-c34ab985-controller
Namespace: apps-dev
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx-dev-c34ab985
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.9.4
helm.sh/chart=ingress-nginx-4.8.3
Annotations: meta.helm.sh/release-name: ingress-nginx-dev-c34ab985
meta.helm.sh/release-namespace: apps-dev
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx-dev-c34ab985,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.43.172.22
IPs: 10.43.172.22
LoadBalancer Ingress: 100.100.100.100
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 30861/TCP
Endpoints: 10.42.0.11:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 31713/TCP
Endpoints: 10.42.0.11:443
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 30448
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 113s service-controller Ensuring load balancer
Normal AppliedDaemonSet 113s Applied LoadBalancer DaemonSet kube-system/svclb-ingress-nginx-dev-c34ab985-controller-8873439e
Normal UpdatedLoadBalancer 83s Updated LoadBalancer with new IPs: [] -> [100.100.100.100]
Name: ingress-nginx-dev-c34ab985-controller-admission
Namespace: apps-dev
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx-dev-c34ab985
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.9.4
helm.sh/chart=ingress-nginx-4.8.3
Annotations: meta.helm.sh/release-name: ingress-nginx-dev-c34ab985
meta.helm.sh/release-namespace: apps-dev
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx-dev-c34ab985,app.kubernetes.io/name=ingress-nginx
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.43.38.162
IPs: 10.43.38.162
Port: https-webhook 443/TCP
TargetPort: webhook/TCP
Endpoints: 10.42.0.11:8443
Session Affinity: None
Events: <none>
```
```
kubectl describe pods -n kube-system svclb-ingress-nginx-dev-c34ab985-controller-8873439e-xv9tx
Name: svclb-ingress-nginx-dev-c34ab985-controller-8873439e-xv9tx
Namespace: kube-system
Priority: 0
Service Account: svclb
Node: ip-10-10-1-110/10.10.1.110
Start Time: Tue, 09 Jan 2024 11:22:59 -0800
Labels: app=svclb-ingress-nginx-dev-c34ab985-controller-8873439e
controller-revision-hash=78c594c45
pod-template-generation=1
svccontroller.k3s.cattle.io/svcname=ingress-nginx-dev-c34ab985-controller
svccontroller.k3s.cattle.io/svcnamespace=apps-dev
Annotations: <none>
Status: Running
IP: 10.42.0.10
IPs:
IP: 10.42.0.10
Controlled By: DaemonSet/svclb-ingress-nginx-dev-c34ab985-controller-8873439e
Containers:
lb-tcp-80:
Container ID: containerd://a4aaa9c9e86a1bd738d3fda1615953d95d3c269b3059eab568b2a9a236dca0a3
Image: rancher/klipper-lb:v0.4.4
Image ID: docker.io/rancher/klipper-lb@sha256:d6780e97ac25454b56f88410b236d52572518040f11d0db5c6baaac0d2fcf860
Port: 80/TCP
Host Port: 80/TCP
State: Running
Started: Tue, 09 Jan 2024 11:23:03 -0800
Ready: True
Restart Count: 0
Environment:
SRC_PORT: 80
SRC_RANGES: 0.0.0.0/0
DEST_PROTO: TCP
DEST_PORT: 30861
DEST_IPS: (v1:status.hostIP)
Mounts: <none>
lb-tcp-443:
Container ID: containerd://0ff2179299dac6def1471b680ae8c37ed352c94c0c5c5afccf4aee69c1c89f0b
Image: rancher/klipper-lb:v0.4.4
Image ID: docker.io/rancher/klipper-lb@sha256:d6780e97ac25454b56f88410b236d52572518040f11d0db5c6baaac0d2fcf860
Port: 443/TCP
Host Port: 443/TCP
State: Running
Started: Tue, 09 Jan 2024 11:23:04 -0800
Ready: True
Restart Count: 0
Environment:
SRC_PORT: 443
SRC_RANGES: 0.0.0.0/0
DEST_PROTO: TCP
DEST_PORT: 31713
DEST_IPS: (v1:status.hostIP)
Mounts: <none>
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes: <none>
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m53s default-scheduler Successfully assigned kube-system/svclb-ingress-nginx-dev-c34ab985-controller-8873439e-xv9tx to ip-10-10-1-110
Normal Pulling 8m52s kubelet Pulling image "rancher/klipper-lb:v0.4.4"
Normal Pulled 8m49s kubelet Successfully pulled image "rancher/klipper-lb:v0.4.4" in 3.335s (3.335s including waiting)
Normal Created 8m49s kubelet Created container lb-tcp-80
Normal Started 8m49s kubelet Started container lb-tcp-80
Normal Pulled 8m49s kubelet Container image "rancher/klipper-lb:v0.4.4" already present on machine
Normal Created 8m48s kubelet Created container lb-tcp-443
Normal Started 8m48s kubelet Started container lb-tcp-443
```
```
kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
apps-dev ingress-nginx-dev-c34ab985-controller-zh48d 1/1 Running 0 36m 10.42.0.11 ip-10-10-1-110 <none> <none>
kube-system coredns-6799fbcd5-6r48p 1/1 Running 0 37m 10.42.0.2 ip-10-10-1-110 <none> <none>
kube-system metrics-server-67c658944b-mrdwm 1/1 Running 0 37m 10.42.0.3 ip-10-10-1-110 <none> <none>
kube-system svclb-ingress-nginx-dev-c34ab985-controller-8873439e-xv9tx 2/2 Running 0 36m 10.42.0.10 ip-10-10-1-110 <none> <none>
```
Useful notes.
I am also running into this. Sad to see that this is still unaddressed after this long time, given it is such an elemental feature of a LB.
> I am also running into this. Sad to see that this is still unaddressed after this long time, given it is such an elemental feature of a LB.
Would this help? https://kubernetes.io/blog/2023/12/18/kubernetes-1-29-feature-loadbalancer-ip-mode-alpha/
> I am also running into this. Sad to see that this is still unaddressed after this long time, given it is such an elemental feature of a LB.
>
> Would this help? https://kubernetes.io/blog/2023/12/18/kubernetes-1-29-feature-loadbalancer-ip-mode-alpha/
I have the same issue on my single-node k3s. On my machine, the svclb-traefik pod uses its own IP address before sending packets to traefik, thus `X-Forwarded-For` is always filled with the IP of the svclb-traefik pod. I found a possible reason in the k3s documentation:

> When the ServiceLB Pod runs on a node that has an external IP configured, the node's external IP is populated into the Service's status.loadBalancer.ingress address list with ipMode: VIP. Otherwise, the node's internal IP is used.

So it seems like it's impossible to manually set the `ipMode` to `Proxy`?
Hey, to avoid copy-pasting the same question, here's the StackOverflow link.
Basically I want my pods to get the original client IP address... or at least have the `X-Forwarded-For` header, in a worst-case scenario. I used this guide to set up my cluster. As I said there, I'm happy to share more details to get this sorted out.