MoJo2600 / pihole-kubernetes

PiHole on kubernetes
metallb not assigning the ip to serviceDns #248

Closed auxworker closed 1 year ago

auxworker commented 1 year ago

Cross-posting from this thread: https://github.com/metallb/metallb/issues/1431#issuecomment-1366914100

I'm able to reproduce this issue very consistently; reinstalling the entire stack always lands me with the same problem. I am able to expose serviceWeb without any issues, but serviceDns does not get the correct IP. Exposing some of the services is a no-go because metallb eventually tries to assign the IPs of the k3s worker nodes (10.0.0.4 & 10.0.0.5) to the service, even though that is not what I'm asking it to do. It's supposed to get 10.0.0.15. Here is the error I get from metallb:

{"caller":"service.go:74","event":"clearAssignment","level":"error","msg":"Failed to retrieve lbIPs family","reason":"nolbIPsIPFamily","ts":"2022-12-28T20:58:14Z"}
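
For anyone filtering the controller logs for this event, the line above is plain JSON and can be picked apart with the standard library. This is only a minimal sketch: the field names come from the log line itself, while the helper name `is_family_error` is made up for illustration.

```python
import json

# Log line copied from the MetalLB output quoted above.
LOG_LINE = (
    '{"caller":"service.go:74","event":"clearAssignment","level":"error",'
    '"msg":"Failed to retrieve lbIPs family","reason":"nolbIPsIPFamily",'
    '"ts":"2022-12-28T20:58:14Z"}'
)


def is_family_error(line: str) -> bool:
    """Return True if a log line is the nolbIPsIPFamily error (hypothetical helper)."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # not a JSON log line
    if not isinstance(entry, dict):
        return False  # valid JSON, but not a structured log entry
    return entry.get("level") == "error" and entry.get("reason") == "nolbIPsIPFamily"


entry = json.loads(LOG_LINE)
print(entry["event"], entry["reason"], is_family_error(LOG_LINE))
```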

k3s: v1.24.8+k3s1 (2 node cluster)
metallb: v0.13.7

kubectl describe svc --namespace argocd helm-chart-pihole-dns-udp 
Name:                     helm-chart-pihole-dns-udp
Namespace:                argocd
Labels:                   app=pihole
                          app.kubernetes.io/instance=helm-chart-pihole
                          chart=pihole-2.11.0
                          heritage=Helm
                          release=helm-chart-pihole
Annotations:              metallb.universe.tf/address-pool: mypool1
                          metallb.universe.tf/allow-shared-ip: pihole-svc
Selector:                 app=pihole,release=helm-chart-pihole
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.43.248.246
IPs:                      10.43.248.246
IP:                       10.0.0.15
LoadBalancer Ingress:     10.0.0.4, 10.0.0.5
Port:                     dns-udp  53/UDP
TargetPort:               dns-udp/UDP
NodePort:                 dns-udp  32541/UDP
Endpoints:                10.42.0.40:53
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     30860
Events:
  Type     Reason                Age                From                Message
  ----     ------                ----               ----                -------
  Normal   EnsuringLoadBalancer  21s                service-controller  Ensuring load balancer
  Normal   AppliedDaemonSet      21s                service-controller  Applied LoadBalancer DaemonSet kube-system/svclb-helm-chart-pihole-dns-udp-46dd889c
  Normal   UpdatedLoadBalancer   19s                service-controller  Updated LoadBalancer with new IPs: [10.0.0.15] -> [10.0.0.5]
  Normal   IPAllocated           19s (x3 over 21s)  metallb-controller  Assigned IP ["10.0.0.15"]
  Normal   nodeAssigned          19s (x2 over 21s)  metallb-speaker     announcing from node "master.infrastructure.router.local" with protocol "layer2"
  Warning  nolbIPsIPFamily       19s                metallb-controller  Failed to retrieve LBIPs IPFamily for ["10.0.0.4" "10.0.0.5"]: IPFamilyForAddresses: same address family ["10.0.0.4" "10.0.0.5"]
  Normal   UpdatedLoadBalancer   19s                service-controller  Updated LoadBalancer with new IPs: [10.0.0.15] -> [10.0.0.4 10.0.0.5]
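
The `nolbIPsIPFamily` warning reads as if the controller tried to derive a single IP family from the addresses under `LoadBalancer Ingress` and refused because both are IPv4. The sketch below only imitates that kind of check with Python's `ipaddress` module. It is a guess based on the message text, not MetalLB's actual code, and `family_for_addresses` is a made-up name.

```python
import ipaddress


def family_for_addresses(addrs):
    """Classify load balancer IPs as 'ipv4', 'ipv6' or 'dual' (illustrative only)."""
    versions = [ipaddress.ip_address(a).version for a in addrs]
    if len(versions) == 1:
        return "ipv4" if versions[0] == 4 else "ipv6"
    if len(versions) == 2:
        if versions[0] == versions[1]:
            # Two addresses are only accepted here as one IPv4 plus one IPv6.
            raise ValueError(f"same address family {addrs}")
        return "dual"
    raise ValueError(f"invalid number of addresses {addrs}")


print(family_for_addresses(["10.0.0.15"]))
try:
    family_for_addresses(["10.0.0.4", "10.0.0.5"])
except ValueError as err:
    print("rejected:", err)
```

Under this reading, a single requested address such as 10.0.0.15 passes, while the pair of node addresses written into the service status is rejected.
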
MoJo2600 commented 1 year ago

I think the issue is with the annotation metallb.universe.tf/allow-shared-ip: pihole-svc - it means that all services with this annotation should use the same LoadBalancer and the same IP address. Do all the other pihole services have the same IP address 10.0.0.15 assigned?
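
One quick way to answer that question is to group services by their `metallb.universe.tf/allow-shared-ip` value and check that each group requests a single IP. A rough sketch over hand-written data: the service names, annotation value and IPs are taken from the `kubectl describe` output in this thread, while `shared_ip_conflicts` and the dict layout are assumptions made for the example.

```python
from collections import defaultdict

SHARING_KEY = "metallb.universe.tf/allow-shared-ip"

# Hand-written summary of two services; values are taken from this thread.
services = [
    {
        "name": "helm-chart-pihole-dns-udp",
        "annotations": {SHARING_KEY: "pihole-svc"},
        "loadBalancerIP": "10.0.0.15",
    },
    {
        "name": "helm-chart-pihole-web",
        "annotations": {SHARING_KEY: "pihole-svc"},
        "loadBalancerIP": "10.0.0.19",
    },
]


def shared_ip_conflicts(svcs):
    """Return {sharing key: requested IPs} for keys whose services request different IPs."""
    requested = defaultdict(set)
    for svc in svcs:
        key = svc.get("annotations", {}).get(SHARING_KEY)
        ip = svc.get("loadBalancerIP")
        if key is not None and ip:
            requested[key].add(ip)
    return {key: sorted(ips) for key, ips in requested.items() if len(ips) > 1}


print(shared_ip_conflicts(services))
```

An empty result means every sharing key maps to one requested address. Here the `pihole-svc` key shows up with two different addresses.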

auxworker commented 1 year ago

> I think the issue is with the annotation metallb.universe.tf/allow-shared-ip: pihole-svc - it means that all services with this annotation should use the same LoadBalancer and the same IP address. Do all the other pihole services have the same IP address 10.0.0.15 assigned?

I assigned 10.0.0.15 only to serviceDns because I am trying to isolate this issue. I originally followed the example on the charts/pihole page, using the same IP for both serviceDns & serviceWeb, but it made no difference at all. I keep getting the same error from metallb.

        dnsmasq:
          customDnsEntries:
            - address=/argocd/10.0.0.4

          customCnameEntries:
            - cname=argocd.infrastructure.router.local,10.0.0.4

        persistentVolumeClaim:
          enabled: true

        serviceDns:
          loadBalancerIP: 10.0.0.15
          annotations:
            metallb.universe.tf/allow-shared-ip: pihole-svc
            metallb.universe.tf/address-pool: mypool1
          type: LoadBalancer

        serviceWeb:
          loadBalancerIP: 10.0.0.19
          annotations:
            metallb.universe.tf/address-pool: mypool1
          type: LoadBalancer

helm-chart-pihole-dns-udp and helm-chart-pihole-dns-tcp both have this issue, but serviceWeb is totally fine:

kubectl describe --namespace argocd svc helm-chart-pihole-web 
Name:                     helm-chart-pihole-web
Namespace:                argocd
Labels:                   app=pihole
                          app.kubernetes.io/instance=helm-chart-pihole
                          chart=pihole-2.11.0
                          heritage=Helm
                          release=helm-chart-pihole
Annotations:              metallb.universe.tf/address-pool: mypool1
                          metallb.universe.tf/allow-shared-ip: pihole-svc
Selector:                 app=pihole,release=helm-chart-pihole
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.43.243.196
IPs:                      10.43.243.196
IP:                       10.0.0.19
LoadBalancer Ingress:     10.0.0.19
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30107/TCP
Endpoints:                10.42.1.48:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  32063/TCP
Endpoints:                10.42.1.48:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     30929
Events:
  Type    Reason                Age    From                Message
  ----    ------                ----   ----                -------
  Normal  IPAllocated           4m44s  metallb-controller  Assigned IP ["10.0.0.19"]
  Normal  EnsuringLoadBalancer  4m44s  service-controller  Ensuring load balancer
  Normal  AppliedDaemonSet      4m44s  service-controller  Applied LoadBalancer DaemonSet kube-system/svclb-helm-chart-pihole-web-cb4e07a1
  Normal  nodeAssigned          4m6s   metallb-speaker     announcing from node "worker.infrastructure.router.local-5c260daf" with protocol "layer2"
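
The difference between the two services shows up in the requested address (the second `IP` line, which matches the `loadBalancerIP` from the values) and the `LoadBalancer Ingress` line. A small sketch that compares the two on objects shaped like `kubectl get svc -o json` output. The dicts are trimmed by hand to `spec.loadBalancerIP` and `status.loadBalancer.ingress`, and `ingress_matches_request` is a hypothetical helper.

```python
def ingress_matches_request(svc: dict) -> bool:
    """True if the only assigned ingress IP equals spec.loadBalancerIP (hypothetical helper)."""
    requested = svc.get("spec", {}).get("loadBalancerIP")
    ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress", [])
    assigned = [entry.get("ip") for entry in ingress]
    return requested is not None and assigned == [requested]


# Trimmed by hand from the describe output in this thread.
dns_udp = {
    "spec": {"loadBalancerIP": "10.0.0.15"},
    "status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.4"}, {"ip": "10.0.0.5"}]}},
}
web = {
    "spec": {"loadBalancerIP": "10.0.0.19"},
    "status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.19"}]}},
}

for name, svc in {"dns-udp": dns_udp, "web": web}.items():
    print(name, "ok" if ingress_matches_request(svc) else "mismatch")
```
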
auxworker commented 1 year ago

@MoJo2600 I'm looking at https://github.com/MoJo2600/pihole-kubernetes/blob/master/charts/pihole/templates/service-dns-udp.yaml

Around line 56 and after, can you explain whether this is only executed if the system has IPv6? What happens if I've disabled IPv6 on my host machines and only have IPv4?

auxworker commented 1 year ago

After applying this specific commit of metallb, https://raw.githubusercontent.com/metallb/metallb/fd4ca5ed91d184e3bdf74797552473fa8994bf1f/config/manifests/metallb-native.yaml, I was able to get it all working, with IPs assigned as expected.

Here are the details of the fix that addressed the actual issue: https://github.com/metallb/metallb/commit/ef822134600bf9ce2a0365cc9a0e7cc5752c8269

Hope this helps the next person who runs into this.

MoJo2600 commented 1 year ago

I'm glad that you figured this out! And that it is a metallb issue. Maybe it would make sense to add this issue to a "Known Issues" discussion or something similar.