aledbf / kube-keepalived-vip

Kubernetes Virtual IP address/es using keepalived
Apache License 2.0

Able to deploy in kube-system and `sync()` properly #102

Closed panpan0000 closed 5 years ago

panpan0000 commented 5 years ago

When deploying keepalived-vip to "kube-system" (in most cases it's reasonable to protect "system" applications by hiding them in kube-system), `sync()` is triggered every second, which causes keepalived to keep reloading. This is because the endpoints of kube-controller-manager and kube-scheduler are renewed every second; you can observe this with `kubectl get ep -n kube-system -w`. I'm not sure of the reason. It happens on both single-master and 3-master clusters; my Kubernetes version is 1.10. The only relevant reference I found is https://github.com/kubernetes/kubernetes/issues/23812

So in our code (pkg/controller/main.go), the following `Enqueue()` is triggered every time:

if !reflect.DeepEqual(old, cur) {
   ipvsc.syncQueue.Enqueue(cur)
}

This is because the ResourceVersion and renewTime of the kube-controller-manager and kube-scheduler endpoints update every second, so `DeepEqual()` is always false.

How about adding a check to `UpdateFunc()` as below?

if old.(*apiv1.Endpoints).Namespace == "kube-system" {
    return
}

assuming there is no use case where keepalived manages workloads under kube-system.

or

if old.(*apiv1.Endpoints).Name == "kube-scheduler" || old.(*apiv1.Endpoints).Name == "kube-controller-manager" {
    return
}
aledbf commented 5 years ago

How about adding a check to UpdateFunc() as below ?

No. This is not a fix.

What you describe does not make sense. Please provide the steps and the configmap configuration needed to reproduce this issue.

panpan0000 commented 5 years ago

Thanks @aledbf

After digging, my observation was not right (hmm... partially right :-P ): it does not keep calling `Reload()`, only `sync()`.

Every second, when the endpoints of kube-controller-manager and kube-scheduler get updated, `ipvsControllerController.sync()` is triggered, but the final guard (the config MD5 check in `sync()`) saves the day, as below:

md5, err := checksum(keepalivedCfg)
if err == nil && md5 == ipvsc.ruMD5 {
    return nil
}
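The guard above works because a metadata-only endpoint update renders a byte-for-byte identical keepalived config, so its hash matches the one stored after the last reload. A minimal sketch of that idea, assuming a simplified `checksum` helper (the real controller hashes the config file it wrote to disk):

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// checksum returns the hex MD5 of a rendered keepalived config.
// (Simplified stand-in for the controller's file-based version.)
func checksum(cfg []byte) string {
	return fmt.Sprintf("%x", md5.Sum(cfg))
}

func main() {
	// Hash stored after the last successful keepalived reload.
	ruMD5 := checksum([]byte("vrrp_instance vips { ... }"))

	// A leader-election renewal re-triggers sync(), but the rendered
	// config is unchanged, so the hash matches and sync() can return
	// early without reloading keepalived.
	cur := checksum([]byte("vrrp_instance vips { ... }"))
	fmt.Println(cur == ruMD5) // true: no reload needed
}
```

So the reload is suppressed, but the per-second `sync()` calls (and the config re-render plus hashing) still happen.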

But I'm still curious: the two weird endpoint updates are quite annoying, and they trigger `sync()` every second, which is unnecessary. So would you consider blocking those two endpoint update events, as in my original proposal :-) ? If really not, please close this issue.

Attached is the main part of my YAML. I use a Deployment instead of a DaemonSet and run it in kube-system.

apiVersion: v1
kind: ConfigMap
metadata:
  name: vip-configmap
  namespace: kube-system
data:
  10.6.0.50: default/nginx
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-keepalived-vip
  namespace: kube-system
secrets:
- name: kube-keepalived-vip-token-4kzkh
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: kube-keepalived-vip
rules:
  - apiGroups: [""]
    resources:
    - pods
    - nodes
    - endpoints
    - services
    - configmaps
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kube-keepalived-vip
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-keepalived-vip
subjects:
  - kind: ServiceAccount
    name: kube-keepalived-vip
    namespace: kube-system
---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-keepalived-vip
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      name: kube-keepalived-vip
  template:
    metadata:
      labels:
        name: kube-keepalived-vip
    spec:
      hostNetwork: true
      serviceAccount: kube-keepalived-vip
      containers:
        - image: aledbf/kube-keepalived-vip:0.35
          name: kube-keepalived-vip
          livenessProbe:
            httpGet:
              path: /health
              port: 8081
            initialDelaySeconds: 15
            timeoutSeconds: 3
          resources:
             limits:
               cpu: 500m
               memory: 500Mi
             requests:
               cpu: 100m
               memory: 500Mi
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /lib/modules
              name: modules
              readOnly: true
            - mountPath: /dev
              name: dev
          # use downward API
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          # to use unicast
          args:
            - -v=5
            - --services-configmap=kube-system/vip-configmap
      volumes:
        - name: modules
          hostPath:
            path: /lib/modules
        - name: dev
          hostPath:
            path: /dev
aledbf commented 5 years ago

Every second, when the endpoints of kube-controller-manager and kube-scheduler get updated, `ipvsControllerController.sync()` is triggered.

This is expected. Watching endpoints is one of the more expensive things you can do with an informer. You can also see the same behavior if there is a pod in CrashLoopBackOff. That's why adding exceptions is not the solution.

but the final guard (the config MD5 check in `sync()`) saves the day, as below:

That's why it's there :)