[RRR] random round robin

kuritka commented 2 years ago

There is a difference between these three configurations:

    # RRR
      strategy:
        type: roundRobin
        splitBrainThresholdSeconds: 300
        dnsTtlSeconds: 30

    # WRR
      strategy:
        type: roundRobin
        splitBrainThresholdSeconds: 300
        dnsTtlSeconds: 30
        weight:
          eu: 1
          us: 1

    # SKIP
      strategy:
        type: roundRobin
        splitBrainThresholdSeconds: 300
        dnsTtlSeconds: 30
        weight:
          eu: 1

The first one provides the standard CoreDNS loadbalance round_robin behavior (RRR), while the second one enables WRR. The third one is incomplete (only eu has a weight), which leads to SKIP.

We cannot easily switch the balancing mode in the Corefile to enable RRR and disable WRR when the GSLB configuration changes.

This PR adds functionality that switches between WRR, RRR, and doing nothing (SKIP: returns the list of IPs without shuffling).

I'm reusing the CoreDNS loadbalance plugin to perform RRR.

k8gb-coredns-5454ddd5b7-r555w coredns [INFO] plugin/wrr: [random][172.18.0.3 172.18.0.4 172.18.0.6 172.18.0.5]
k8gb-coredns-5454ddd5b7-r555w coredns [INFO] plugin/wrr: [random][172.18.0.6 172.18.0.3 172.18.0.4 172.18.0.5]
k8gb-coredns-5454ddd5b7-r555w coredns [INFO] plugin/wrr: [random][172.18.0.4 172.18.0.3 172.18.0.5 172.18.0.6]
k8gb-coredns-5454ddd5b7-r555w coredns [INFO] plugin/wrr: [1 0][us1 eu1]: [172.18.0.5 172.18.0.6 172.18.0.3 172.18.0.4]
k8gb-coredns-5454ddd5b7-r555w coredns [INFO] plugin/wrr: [0 1][eu1 us1]: [172.18.0.3 172.18.0.4 172.18.0.5 172.18.0.6]
k8gb-coredns-5454ddd5b7-r555w coredns [INFO] plugin/wrr: [1 0][us1 eu1]: [172.18.0.5 172.18.0.6 172.18.0.4 172.18.0.3]
k8gb-coredns-79458796bb-mq855 coredns [INFO] plugin/wrr: [skipping][172.18.0.5 172.18.0.6 172.18.0.3 172.18.0.4]
k8gb-coredns-79458796bb-mq855 coredns [INFO] plugin/wrr: [skipping][172.18.0.5 172.18.0.6 172.18.0.3 172.18.0.4]
k8gb-coredns-79458796bb-mq855 coredns [INFO] plugin/wrr: [skipping][172.18.0.5 172.18.0.6 172.18.0.3 172.18.0.4]
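
As a rough illustration of the three behaviors visible in the log lines above, here is a minimal Python sketch. It is not the plugin's code (the plugin is written in Go); the names balance and order_groups are hypothetical, and the weighted branch is only one plausible reading of the logs: regions are ordered by weighted random selection and the IPs inside each region are shuffled. The eu/us grouping of the IPs is taken from the DNSEndpoint example later in this thread.

```python
import random


def order_groups(weights, rng):
    """Weighted sampling without replacement: returns region names in pick order."""
    remaining = dict(weights)
    order = []
    while remaining:
        names = list(remaining)
        if sum(remaining.values()) == 0:
            # Only zero-weight regions are left: fall back to a uniform shuffle.
            rng.shuffle(names)
            order.extend(names)
            break
        pick = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        order.append(pick)
        del remaining[pick]
    return order


def balance(mode, targets, groups=None, weights=None, rng=None):
    """Return the answer order for one DNS response (hypothetical helper)."""
    rng = rng or random.Random()
    if mode == "skip":
        # SKIP: return the IPs untouched, in their stored order.
        return list(targets)
    if mode == "random":
        # RRR: shuffle the whole list on every response.
        shuffled = list(targets)
        rng.shuffle(shuffled)
        return shuffled
    if mode == "weighted":
        # WRR (assumed): order regions by weight, shuffle inside each region.
        answer = []
        for region in order_groups(weights, rng):
            members = list(groups[region])
            rng.shuffle(members)
            answer.extend(members)
        return answer
    raise ValueError(f"unknown mode: {mode}")


targets = ["172.18.0.5", "172.18.0.6", "172.18.0.3", "172.18.0.4"]
groups = {"eu": ["172.18.0.3", "172.18.0.4"], "us": ["172.18.0.5", "172.18.0.6"]}
weights = {"eu": 1, "us": 1}
rng = random.Random(0)
for mode in ("random", "weighted", "skip"):
    print(mode, balance(mode, targets, groups, weights, rng))
```

In the weighted case the IPs of one region always stay adjacent, which matches the WRR log lines where the eu pair and the us pair never interleave.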

Signed-off-by: kuritka kuritka@gmail.com

kuritka commented 2 years ago

@ytsarev, good question:

Before WRR we had this Corefile:

    cloud.example.com:5353 {
        errors
        health
        ready
        loadbalance round_robin <---
        prometheus 0.0.0.0:9153
        forward . /etc/resolv.conf
        k8s_crd {
            filter k8gb.absa.oss/dnstype=local
            negttl 10
        }
    }

^^^ Originally it only did RRR balancing. Currently we deploy a modified Corefile:

    cloud.example.com:5353 {
        errors
        health
        ready
        prometheus 0.0.0.0:9153
        forward . /etc/resolv.conf
        k8s_crd {
            filter k8gb.absa.oss/dnstype=local
            negttl 10
            loadbalance weight  <---
        }
    }

loadbalance weight behaves as described in the PR description: WRR by default, RRR if the GSLB configuration has no weight section, and SKIP when the endpoint is corrupted. Maybe we can remove SKIP and use RRR instead, so that it always balances somehow, even when the endpoint is corrupted.

PS: about SKIP. SKIP applies when the targets differ from the labels. This usually happens during sync. It takes a TTL interval before a change can propagate.

    # SKIP situation
    apiVersion: externaldns.k8s.io/v1alpha1
    kind: DNSEndpoint
    metadata:
      name: weight-eu-5-us-5
      labels:
        k8gb.absa.oss/dnstype: local
      annotations:
        k8gb.absa.oss/dnstype: local
    spec:
      endpoints:
      - dnsName: weight-eu-5-us-5.example.org
        recordTTL: 30
        recordType: A
        labels:
          strategy: roundRobin
          weight-eu-0-5: 172.18.0.3
          weight-eu-1-5: 172.18.0.4
    #      weight-us-0-5: 172.18.0.5  <--- removing these two labels causes the inconsistency: the DNS answer will
    #      weight-us-1-5: 172.18.0.6  <--- have 4 records while the weight settings cover only two
        targets:
        - 172.18.0.5
        - 172.18.0.6
        - 172.18.0.3
        - 172.18.0.4
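
The decision described above (WRR by default, RRR without weights, SKIP when labels and targets disagree) can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's implementation: choose_mode and WEIGHT_LABEL are hypothetical names, and the label pattern weight-<region>-<index>-<weight> is inferred from the example endpoint rather than taken from a specification.

```python
import re

# Assumed label key shape, inferred from keys such as "weight-eu-0-5".
WEIGHT_LABEL = re.compile(r"^weight-[a-z0-9]+-\d+-\d+$")


def choose_mode(labels, targets):
    """Pick a balancing mode for one endpoint (hypothetical helper)."""
    weighted_ips = {ip for key, ip in labels.items() if WEIGHT_LABEL.match(key)}
    if not weighted_ips:
        # No weight labels at all: plain random round robin.
        return "RRR"
    if weighted_ips == set(targets):
        # Every target is covered by a weight label and vice versa.
        return "WRR"
    # Labels and targets disagree, e.g. while a sync is still in progress.
    return "SKIP"


targets = ["172.18.0.5", "172.18.0.6", "172.18.0.3", "172.18.0.4"]
labels = {
    "strategy": "roundRobin",
    "weight-eu-0-5": "172.18.0.3",
    "weight-eu-1-5": "172.18.0.4",
}
print(choose_mode(labels, targets))
```

Replacing the final return value with "RRR" would give the variant suggested earlier in this comment, where a corrupted endpoint is still balanced instead of skipped.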

kuritka commented 2 years ago

PS2:

Originally I was also thinking about adding a strategy:

    cloud.example.com:5353 {
        errors
        health
        ready
        prometheus 0.0.0.0:9153
        forward . /etc/resolv.conf
        k8s_crd {
            filter k8gb.absa.oss/dnstype=local
            negttl 10
            loadbalance random  <---
        }
    }

At the moment I don't see a use case for that, but if you see one, please let me know. I can add it in a separate PR.

kuritka commented 1 year ago

Hi, thanks for the review. It will be documented; I'm just not sure I understand: will the documentation be extended at the k8gb level?

ytsarev commented 1 year ago

@kuritka By k8gb level, I mean the main project documentation we have at k8gb.io, in contrast to the documentation located in this specific repository.
