nokia / danm

TelCo grade network management in a Kubernetes cluster
BSD 3-Clause "New" or "Revised" License

DNS resolution for services managed by Svc Watcher #203

Closed navjotsingh83 closed 4 years ago

navjotsingh83 commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

bug feature

What happened: Unable to do a DNS lookup on the identity of a POD (deployed as a stateful set) when service discovery is via SvcWatcher

What you expected to happen: SvcWatcher claims parity with K8s service discovery and adds much more, so this use case should be supported

How to reproduce it: My application is deployed as a stateful set with two PODs, say pod-0 and pod-1 as the hostnames/pod names. The pods have two interfaces, eth0 as external and eth1 as internal. Since I wanted service discovery for eth1, I defined a headless service with danm annotations, something like:

apiVersion: v1
kind: Service
metadata:
  name: svcname
  annotations:
    danm.k8s.io/selector: '{"app.kubernetes.io/name":"xxxx"}'
    danm.k8s.io/network: default  # here "default" means the internal calico network
spec:
  clusterIP: None
  selector:
    app.kubernetes.io/name: xxxx
  ports:

Now, if I log into another pod and run nslookup on the service name, it works:

>>nslookup svcname
Server: 172.16.1.5
Address: 172.16.1.5#53

Name: svcname.default.svc.cluster.local
Address: 192.168.89.138
Name: svcname.default.svc.cluster.local
Address: 192.168.89.135

However, if I want to access the individual PODs (that's the use-case of my application), then it fails:

nslookup svcname.pod-0
Server: 172.16.1.5
Address: 172.16.1.5#53

** server can't find svcname.pod-0: NXDOMAIN

Anything else we need to know?: This use case works perfectly when using the default K8s service discovery (obviously on eth0), since for stateful sets we can do a DNS lookup on the individual PODs.

Environment:

Levovar commented 4 years ago

selector: app.kubernetes.io/name: xxxx

your Service is headless, but not selectorless as it should be.

that being said Pod-identity based service discovery might not be supported at the moment, I will look into that. but please retry first with a proper Service definition
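To make the "headless, but not selectorless" point concrete, here is a minimal Python sketch. It assumes the manifest from the report is loaded as a plain dict; the helper name `is_headless_and_selectorless` is hypothetical and not part of DANM or Kubernetes. The `ports` list is left out because it is cut off in the report.

```python
import copy


def is_headless_and_selectorless(service: dict) -> bool:
    """Return True if the Service has clusterIP "None" and no spec.selector."""
    spec = service.get("spec", {})
    headless = spec.get("clusterIP") == "None"
    selectorless = not spec.get("selector")
    return headless and selectorless


# The Service from the report, expressed as a dict (ports omitted, see above).
reported = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "svcname",
        "annotations": {
            "danm.k8s.io/selector": '{"app.kubernetes.io/name":"xxxx"}',
            "danm.k8s.io/network": "default",
        },
    },
    "spec": {
        "clusterIP": "None",
        "selector": {"app.kubernetes.io/name": "xxxx"},
    },
}

# Same Service with spec.selector removed; the DANM annotations stay in place.
corrected = copy.deepcopy(reported)
del corrected["spec"]["selector"]

print(is_headless_and_selectorless(reported))
print(is_headless_and_selectorless(corrected))
```

The only change between the two dicts is the removal of `spec.selector`; Pod selection is then left to the `danm.k8s.io/selector` annotation.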

Levovar commented 4 years ago

so, this is an Endpoint subset record the default in-built service controller creates for the SSET:

This is what we create:

as you can see, there are some differences, mainly hostName vs. nodeName, and a missing UID. It is true that Services worked differently around 1.12-1.13, when we created the svcwatcher. I assume the new CoreDNS depends on either the UID or the nodeName parameter when creating the Pod-related A records. I will update our object management to include these parameters, and let's hope CoreDNS does the rest

Levovar commented 4 years ago

@navjotsingh83 with the modification I think Pod name based subdomains work as expected. Only tested with Kube-DNS, but for a headless and selectorless DANM StatefulSet Service both types of queries give the correct result:

[cloudadmin@controller-1 ~]$ nslookup vnf-internal-processor-sset-danm.default.svc.nokia.net
../../../../lib/isc/unix/net.c:594: probing sendmsg() with IPV6_TCLASS=b8 failed: No route to host
Server: 172.31.3.154
Address: 172.31.3.154#53

Name: vnf-internal-processor-sset-danm.default.svc.nokia.net
Address: 10.240.1.100
Name: vnf-internal-processor-sset-danm.default.svc.nokia.net
Address: 10.240.1.101
Name: vnf-internal-processor-sset-danm.default.svc.nokia.net
Address: 10.240.1.102
Name: vnf-internal-processor-sset-danm.default.svc.nokia.net
Address: 10.240.1.103
Name: vnf-internal-processor-sset-danm.default.svc.nokia.net
Address: 10.240.1.104

[cloudadmin@controller-1 ~]$ nslookup internal-processor-set-0.vnf-internal-processor-sset-danm.default.svc.nokia.net
../../../../lib/isc/unix/net.c:594: probing sendmsg() with IPV6_TCLASS=b8 failed: No route to host
Server: 172.31.3.154
Address: 172.31.3.154#53

Name: internal-processor-set-0.vnf-internal-processor-sset-danm.default.svc.nokia.net
Address: 10.240.1.100
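
For anyone scripting the same check, the two query shapes used above can be sketched in Python. This is only an illustration, assuming the usual Kubernetes naming scheme in which the Pod hostname is the left-most label, followed by the Service name, the namespace, `svc`, and the cluster domain; the helper names `service_fqdn`, `pod_fqdn`, and `resolve` are hypothetical.

```python
import socket


def service_fqdn(service: str, namespace: str, cluster_domain: str) -> str:
    """Name that returns one A record per Pod behind a headless Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def pod_fqdn(pod_hostname: str, service: str, namespace: str,
             cluster_domain: str) -> str:
    """Name that addresses a single StatefulSet Pod behind the Service."""
    return f"{pod_hostname}.{service_fqdn(service, namespace, cluster_domain)}"


def resolve(name: str) -> list:
    """Return the sorted IPv4 addresses for name, or [] if the lookup fails."""
    try:
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        # Covers socket.gaierror and socket.herror (e.g. NXDOMAIN).
        return []
    return sorted(addresses)


# Values taken from the nslookup session above.
all_pods = service_fqdn("vnf-internal-processor-sset-danm", "default", "nokia.net")
first_pod = pod_fqdn("internal-processor-set-0",
                     "vnf-internal-processor-sset-danm", "default", "nokia.net")
print(all_pods)
print(first_pod)
```

Calling `resolve(first_pod)` from inside a Pod in the cluster would perform the same lookup as the second nslookup command; it is not called here because the result depends on the cluster's DNS.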