loxilb-io / kube-loxilb

Implementation of kubernetes service load-balancer spec for loxilb
Apache License 2.0

Unable to update ep to external LoxiLB correctly #184

Closed 6547709 closed 2 months ago

6547709 commented 2 months ago

Problem description: When I reduce the number of replicas of a Deployment, kube-loxilb does not correctly clean up the endpoints (EPs) in the external LoxiLB. For example, after the replica count changed from 4 to 2, K8S had only 2 EPs; kube-loxilb did push an update to the external LoxiLB, but it did not delete the EPs that no longer exist, leaving 2 EPs in the "nok" state in the external LoxiLB.

K8S-Service:

$ kubectl get svc
NAME                         TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                                                           AGE
whoami-service               LoadBalancer   10.97.207.168    llb-10.60.4.1   80:32367/TCP                                                      11d

K8S-Service-EP:

$ kubectl get ep
NAME                         ENDPOINTS                                                            AGE
whoami-service               10.244.1.223:80,10.244.5.77:80                                       11d

K8S-kube-loxilb-logs:

I0927 01:15:29.511810       1 loadbalancer.go:1324] default/whoami-service: Endpoint update
I0927 01:15:29.511849       1 loadbalancer.go:869] default/whoami-service: Added(true) Update(true) needDelete(false)
I0927 01:15:29.511866       1 loadbalancer.go:870] Endpoint IP Pairs [10.244.5.77 10.244.1.223]
I0927 01:15:29.511885       1 loadbalancer.go:871] Secondary IP Pairs []
I0927 01:15:29.513367       1 loadbalancer.go:1098] loxilb-lb(10.40.45.8): add lb {{10.60.4.1  80 tcp 1 1 false true 1800 0 true  0   3 10 0 default_whoami-service 0 } [] [{10.244.5.77 80 1  } {10.244.1.223 80 1  }]}
I0927 01:15:29.513465       1 loadbalancer.go:1098] loxilb-lb(10.40.45.9): add lb {{10.60.4.1  80 tcp 1 1 false true 1800 0 true  0   3 10 0 default_whoami-service 0 } [] [{10.244.5.77 80 1  } {10.244.1.223 80 1  }]}

External-Loxilb-lb:

core@loxilb-01 ~ $ docker exec -it loxilb loxicmd get lb
|  EXT IP   | PORT | PROTO |                  NAME                  | MARK | SEL  |  MODE   | # OF ENDPOINTS | MONITOR |
|-----------|------|-------|----------------------------------------|------|------|---------|----------------|---------|
| 10.60.4.1 |   80 | tcp   | default_whoami-service                 |    0 | hash | onearm  |              2 | On      |

External-Loxilb-ep:

core@loxilb-01 ~ $ docker exec -it loxilb loxicmd get ep
|     HOST     |         NAME          | PTYPE | PORT | DURATION | RETRIES | MINDELAY  | AVGDELAY  | MAXDELAY  | STATE |
|--------------|-----------------------|-------|------|----------|---------|-----------|-----------|-----------|-------|
| 10.244.1.223 | 10.244.1.223_tcp_80   | tcp:  |   80 |       10 |       3 |           |           |           | ok    |
| 10.244.3.135 | 10.244.3.135_tcp_80   | tcp:  |   80 |       10 |       3 |           |           |           | nok   |
| 10.244.5.77  | 10.244.5.77_tcp_80    | tcp:  |   80 |       10 |       3 |           |           |           | ok    |
| 10.244.6.182 | 10.244.6.182_tcp_80   | tcp:  |   80 |       10 |       3 |           |           |           | nok   |
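
In other words, what I expect is a simple set-difference reconciliation: any EP that exists in the external LoxiLB but is missing from the K8S Endpoints object should be removed rather than left in the "nok" state. A minimal Go sketch (illustrative only, not kube-loxilb code; the function names are made up) using the addresses from the tables above:

package main

import "fmt"

// staleEndpoints returns the endpoints known to LoxiLB that are no longer
// present in the Kubernetes Endpoints object and should therefore be removed.
func staleEndpoints(k8sEPs, loxilbEPs []string) []string {
    live := make(map[string]bool, len(k8sEPs))
    for _, ip := range k8sEPs {
        live[ip] = true
    }
    var stale []string
    for _, ip := range loxilbEPs {
        if !live[ip] {
            stale = append(stale, ip)
        }
    }
    return stale
}

func main() {
    k8s := []string{"10.244.1.223", "10.244.5.77"}
    loxilb := []string{"10.244.1.223", "10.244.3.135", "10.244.5.77", "10.244.6.182"}
    // Prints the two "nok" entries from the table above: [10.244.3.135 10.244.6.182]
    fmt.Println(staleEndpoints(k8s, loxilb))
}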

Kube-Loxilb-args:

        args:
        - --loxiURL=http://10.40.45.8:11111,http://10.40.45.9:11111
        - --externalCIDR=10.60.4.0/24
        - --setLBMode=2
        - --setUniqueIP=true

Whoami-Service:

apiVersion: v1
kind: Service
metadata:
  name: whoami-service
  annotations:
    loxilb.io/lbmode: "onearm"
    loxilb.io/epselect: "hash"
    loxilb.io/usepodnetwork: "yes"
    loxilb.io/liveness: "yes"
    loxilb.io/probetimeout: "10"
    loxilb.io/proberetries: "3"
  labels:
    app: whoami
spec:
  externalTrafficPolicy: Local
  loadBalancerClass: loxilb.io/loxilb
  type: LoadBalancer
  selector:
    app: whoami
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  ports:
    - name: tcp-80
      protocol: TCP
      port: 80
      targetPort: 80

TrekkieCoder commented 2 months ago

Your observation is correct. Currently, the endpoints are kept in an inactive state for the life of the LB rule (in case they become alive again). Once the LB service is deleted, the endpoints should be cleaned up as well.

Having said that, the original intention was to keep the "non-existing" endpoints in an inactive state only and not perform any probe-related activity on them. I will double-check whether this is still the case.
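
Roughly, the intended behaviour can be sketched as below (illustrative Go only, not the actual loxilb code): endpoints that drop out of the desired set are marked inactive instead of being deleted, so that they can be revived later, and inactive endpoints are meant to be skipped by the prober rather than reported as "nok".

package main

import "fmt"

type endpoint struct {
    IP     string
    Active bool // inactive endpoints are not meant to be probed or shown as "nok"
}

type lbRule struct {
    Name      string
    Endpoints map[string]*endpoint
}

// syncEndpoints marks endpoints missing from the desired set as inactive
// instead of deleting them; they are purged only when the LB rule itself goes away.
func (r *lbRule) syncEndpoints(desired []string) {
    want := make(map[string]bool, len(desired))
    for _, ip := range desired {
        want[ip] = true
        if ep, ok := r.Endpoints[ip]; ok {
            ep.Active = true
        } else {
            r.Endpoints[ip] = &endpoint{IP: ip, Active: true}
        }
    }
    for ip, ep := range r.Endpoints {
        if !want[ip] {
            ep.Active = false
        }
    }
}

func main() {
    rule := &lbRule{Name: "default_whoami-service", Endpoints: map[string]*endpoint{}}
    rule.syncEndpoints([]string{"10.244.1.223", "10.244.3.135", "10.244.5.77", "10.244.6.182"})
    rule.syncEndpoints([]string{"10.244.1.223", "10.244.5.77"}) // replicas scaled 4 -> 2
    for ip, ep := range rule.Endpoints {
        fmt.Printf("%s active=%v\n", ip, ep.Active)
    }
}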

TrekkieCoder commented 2 months ago

Fixes were made to loxilb to address this. Please check with the latest loxilb docker image.

6547709 commented 2 months ago

Thank you for your reply. I understand the original design intention.

  1. After a Pod is recreated or deleted in K8S, its old EP will no longer appear. kube-loxilb only needs to ensure that the EPs in K8S stay consistent with the EPs in LoxiLB.
  2. When loxilb is used in other (non-K8S) environments, it makes sense to keep such an EP, because it may come back online.

6547709 commented 2 months ago

I just tested it and it works as expected; the stale EPs are now cleaned up automatically.