submariner-io / lighthouse

DNS service discovery across connected Kubernetes clusters.
https://submariner-io.github.io/architecture/service-discovery/
Apache License 2.0

Submariner does not add LightHouse DNS entry in corefile section of configmap in case of RKE2 cluster. #1602

Open manojgop opened 2 months ago

manojgop commented 2 months ago

What happened:

Submariner does not add the Lighthouse DNS entry to the "Corefile" section of the CoreDNS ConfigMap on RKE2 clusters. On RKE2, I see "rke2-coredns" instead of "coredns". As a result, rke2-coredns is NOT configured to forward requests for the clusterset.local domain to the Lighthouse CoreDNS server in the cluster making the query. I had to edit the "Corefile" section of the ConfigMap manually in every cluster to make it work.

The output of kubectl -n kube-system describe configmap rke2-coredns-rke2-coredns is as follows. The forward rule is present in the lighthouse.server section, but it has no effect on RKE2.

Data
====
Corefile:
----
.:53 {
    errors
    health  {
        lameduck 5s
    }
    ready
    kubernetes   cluster.local  cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus   0.0.0.0:9153
    forward   . /etc/resolv.conf
    cache   30
    loop
    reload
    loadbalance
}
lighthouse.server:
----
clusterset.local:53 {
    forward . 10.43.180.127
}

With this configuration, nslookup nginx.default.svc.clusterset.local returned: server can't find nginx.default.svc.clusterset.local: NXDOMAIN
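
One way to confirm the symptom before editing anything is to check which ConfigMap key carries the clusterset.local server block. The sketch below is a minimal, hypothetical helper: the function name and the use of a plain dict for the ConfigMap data (for example, parsed from kubectl -o json output) are assumptions, and it only does string matching rather than real Corefile parsing.

```python
import re

# Matches the start of a server block for the clusterset.local zone,
# e.g. "clusterset.local:53 {" at the beginning of a line.
_ZONE_RE = re.compile(r"^\s*clusterset\.local(:\d+)?\s*\{", re.MULTILINE)


def keys_with_clusterset_zone(configmap_data):
    """Return the ConfigMap data keys whose text declares a clusterset.local block."""
    return sorted(
        key for key, text in configmap_data.items() if _ZONE_RE.search(text)
    )


# Data as reported in this issue (Corefile abbreviated for the example).
data = {
    "Corefile": ".:53 {\n    errors\n    forward   . /etc/resolv.conf\n}\n",
    "lighthouse.server": "clusterset.local:53 {\n    forward . 10.43.180.127\n}\n",
}
print(keys_with_clusterset_zone(data))  # prints ['lighthouse.server']
```

If "Corefile" is missing from the result, the rule sits only in a separate key, which matches the situation described here.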

I had to manually edit the ConfigMap and add the following block to its Corefile section:

clusterset.local:53 {
    forward . 10.43.180.127
}

After the edit, the Corefile section looks like this:

Data
====
Corefile:
----
#lighthouse-start
clusterset.local:53 {
    forward . 10.43.180.127
}
#lighthouse-end
.:53 {
    errors
    health  {
        lameduck 5s
    }
    ready
    kubernetes   cluster.local  cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus   0.0.0.0:9153
    forward   . /etc/resolv.conf
    cache   30
    loop
    reload
    loadbalance
}

Adding the rules in the lighthouse.server section seems to be the issue. It looks like RKE2 expects the rules in the Corefile section.
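
The manual edit above can be sketched as a small, repeatable transformation of the Corefile text. This is only an illustration under stated assumptions, not Submariner's implementation: the function name is hypothetical, and the #lighthouse-start / #lighthouse-end markers are reused from the edited Corefile shown in this report so that a rerun replaces the block instead of stacking a second copy.

```python
import re

START, END = "#lighthouse-start", "#lighthouse-end"

# Matches an existing marked section, including its trailing newline.
_SECTION_RE = re.compile(
    re.escape(START) + r".*?" + re.escape(END) + r"\n?", re.DOTALL
)


def with_lighthouse_block(corefile, dns_ip, domain="clusterset.local"):
    """Return corefile with a marked forward block for domain placed first."""
    block = (
        f"{START}\n"
        f"{domain}:53 {{\n"
        f"    forward . {dns_ip}\n"
        f"}}\n"
        f"{END}\n"
    )
    # Drop any earlier marked section so repeated calls do not stack blocks.
    stripped = _SECTION_RE.sub("", corefile)
    return block + stripped


original = ".:53 {\n    errors\n    forward   . /etc/resolv.conf\n}\n"
print(with_lighthouse_block(original, "10.43.180.127"))
```

The result would still have to be written back to the "Corefile" key of the rke2-coredns ConfigMap; that step is left out here.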

What you expected to happen:

Lighthouse DNS to work for exported services in RKE2 clusters

How to reproduce it (as minimally and precisely as possible):

Deploy Submariner on RKE2 clusters and export a service.

Anything else we need to know?: See the Slack comments for more details.

Environment:

dfarrell07 commented 2 months ago

ACK, thanks for the report @manojgop. This does seem to be an issue. @vthapar can provide some details.

vthapar commented 2 months ago

CustomDNSCONfig was added at the time for an issue with Azure/AKS clusters, which required the DNS configuration to be in a separate file in xyz.server format. That is why we use lighthouse.server. RKE2, however, expects it in the Corefile section itself. Fixing this will require some work, and potentially a new flag so that existing AKS deployments are not broken.
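
To make the trade-off concrete, here is a rough sketch of how such a flag could select between the two placements while leaving the current behavior as the default. Everything in it is hypothetical: the function name, the `inline` parameter, and the choice to drop the lighthouse.server key in inline mode are assumptions for illustration, not an existing Submariner option.

```python
def lighthouse_configmap_patch(data, dns_ip, inline=False):
    """Return a copy of ConfigMap data with the clusterset.local rule applied.

    inline=False: write the rule to a separate "lighthouse.server" key
                  (the existing behavior described above for AKS).
    inline=True:  prepend the rule to the "Corefile" key (what RKE2 needs
                  according to this issue). The flag name is hypothetical.
    """
    rule = f"clusterset.local:53 {{\n    forward . {dns_ip}\n}}\n"
    patched = dict(data)  # shallow copy; the caller's dict is not modified
    if inline:
        corefile = patched.get("Corefile", "")
        if rule not in corefile:  # avoid adding the same rule twice
            patched["Corefile"] = rule + corefile
        patched.pop("lighthouse.server", None)
    else:
        patched["lighthouse.server"] = rule
    return patched


base = {"Corefile": ".:53 {\n    forward   . /etc/resolv.conf\n}\n"}
aks_style = lighthouse_configmap_patch(base, "10.43.180.127")
rke2_style = lighthouse_configmap_patch(base, "10.43.180.127", inline=True)
```

Keeping the separate-key mode as the default means existing deployments would see no change unless they opt in.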