hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

DNS Support in Nomad Service Discovery #12588

Open mikenomitch opened 2 years ago

mikenomitch commented 2 years ago

Community Feedback Wanted!

Proposal

Currently, native Nomad service discovery can expose service information (address, port, tags, etc.) via the services API or the services CLI command, and tasks can consume service information via the nomadService / nomadServices functions in the template stanza.
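For example, a task can render service addresses into a config file with something like this today (a rough sketch using the nomadService template function; the service and file names are just illustrative):

template {
  data = <<EOF
{{- range nomadService "redis" }}
redis_addr = {{ .Address }}:{{ .Port }}
{{- end }}
EOF
  destination = "local/app.conf"
}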

It would be nice if, in addition to these methods, Nomad services could be fetched via a DNS server. This could be automatically added via the Nomad client (and configured in client config?), or perhaps we could run CoreDNS with a plugin as a system job and make this easy to deploy.

The Nomad team would love your feedback as we explore this technically!

Use-cases

Do you have use cases in mind for this? Things that can't easily be achieved with the template stanza?

Please let us know! Leaving a comment about how and why you would use this and what your ideal UX would be can help ensure we design this well!

mr-karan commented 2 years ago

I'm in favor of running CoreDNS as a system job. CoreDNS can do a lot of the things listed in the use cases: load balancing, health checks, exporting Prometheus metrics, caching, etc. Otherwise we may end up building all of these things into Nomad's core codebase.

In addition to running a system job, we can also look at building a CoreDNS plugin similar to the kubernetes one and using the service APIs to query records based on a schema like K8s's namespace.pod.cluster.local -> (something like namespace.job.group.task)? This opens up a ndots:5 can of worms (https://k8s.af/), but yeah, it probably needs more thought.

In K8s, the kubelet is responsible for placing the CoreDNS service's cluster IP in /etc/resolv.conf along with the other search entries, so I'm a bit confused how that'll work here. Would the Nomad client do it while placing the allocs? And what happens to tasks run with raw_exec, would they use the /etc/resolv.conf of the host?

> Leaving a comment about how and why you would use this and what your ideal UX would be can help ensure we design this well!

IMHO, a better UX (compared to templating a config file, which is also cumbersome if the app uses env vars) is just being able to specify db:5432 or redis:6379. Here db or redis would be the service names in that particular namespace, and any DNS server can respond with the allocation's IP.

brahima commented 1 year ago

Hello,

With the great new features/enhancements of Nomad 1.4 (in beta as I'm writing this), it would be great to have a simple DNS resolution feature.

That would allow this kind of thing (see code below). My example is adapted from an HAProxy configuration file that uses the Consul DNS resolution feature.

backend myservice
    balance roundrobin
    server-template myservice 3 _myservice._tcp.service.nomad resolvers nomad resolve-opts allow-dup-ip resolve-prefer ipv4 check

resolvers nomad
   nameserver nomad {{ env "NOMAD_DNS_ADDR" }}:8600
   accepted_payload_size 8192
   hold valid 5s

So here, having a Nomad DNS server at NOMAD_DNS_ADDR would allow the service _myservice._tcp.service.nomad to be resolved by HAProxy.

I posted a question about this on HashiCorp Discuss to see what my alternatives are before this great feature lands in Nomad.

Regards

mr-karan commented 1 year ago

Making another case for DNS-based discovery. In NGINX, the proxy_pass directive "caches" the value of the upstream and never reloads it, even after sending a SIGHUP.

> Though this method enables us to choose the load-balancing algorithm and configure health checks, it still has the same drawbacks with respect to start, reload, and TTL as the previous method.

https://www.nginx.com/blog/dns-service-discovery-nginx-plus/

For example, this is the updated config after the app was restarted and Nomad allocated a new port:

upstream app {
  server 10.0.5.82:31477;
}

location / {
    proxy_pass http://app;
}

However, in the logs, even after reloading I can still see it's trying to connect to 10.0.5.82:24774 instead of 10.0.5.82:31477:

2023/01/02 05:17:53 [error] 34#34: *45 connect() failed (111: Connection refused) while connecting to upstream, client: 10.0.5.228, server: app.local, request: "GET / HTTP/1.1", upstream: "http://10.0.5.82:24774/", host: "10.0.5.82:8080"

There is a way to specify a resolver and custom TTLs in NGINX (e.g. resolver 10.0.0.2 valid=10s;), but since Nomad services aren't exposed via a DNS interface, it's not possible to make use of this.
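For reference, if the services were resolvable over DNS, the usual NGINX pattern would be roughly this (the resolver address and the .nomad name are hypothetical); using a variable in proxy_pass forces NGINX to re-resolve on the resolver's TTL instead of caching the address once at startup:

resolver 10.0.0.2 valid=10s;

location / {
    # a variable makes NGINX resolve at request time via "resolver",
    # instead of once at startup like a literal proxy_pass hostname
    set $backend "http://app.default.nomad:8080";
    proxy_pass $backend;
}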

mikenomitch commented 1 year ago

@mr-karan et al,

We continue to think the CoreDNS plugin would be a great idea, but we don't have it prioritized imminently. If anybody from the community is interested in driving the development of a plugin, the Nomad team would love to provide some support and point people toward it once it is ready. Unfortunately, we aren't able to drive development on it for a while (we just have lots to do!).

I got started on some plugin code in my free time, but only got a very basic skeleton of a Nomad CoreDNS plugin written. If anybody wants to use this code as a starting point, feel free - https://github.com/mikenomitch/coredns-nomad - no attribution necessary.

mr-karan commented 1 year ago

@mikenomitch Whew, I'd love to pick this up. I filed an issue just yesterday (https://github.com/coredns/coredns/issues/5829) in CoreDNS, but yeah, it would make sense to have the plugin ready externally first.

I'll look into the code, thanks for a reference point :)

mikenomitch commented 1 year ago

No problem. Just a warning that it's messy and basically just the example plugin with a Nomad client imported for now. But it should hopefully save you or others a little bit of toil.

I found it helpful to look at the ServeDNS functions in other plugins' source code. That might be a good place to start.
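Roughly, the shape you'd be filling in is something like this (only a sketch; the lookup helper standing in for the Nomad API client is made up, and SRV/AAAA support and error handling are omitted):

package nomad

import (
	"context"
	"net"

	"github.com/coredns/coredns/plugin"
	"github.com/coredns/coredns/request"
	"github.com/miekg/dns"
)

type Nomad struct {
	Next   plugin.Handler
	lookup func(qname string) []string // hypothetical: queries the Nomad services API
}

// ServeDNS implements plugin.Handler.
func (n Nomad) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg) (int, error) {
	state := request.Request{W: w, Req: r}

	// Only answer A queries here; pass everything else down the plugin chain.
	if state.QType() != dns.TypeA {
		return plugin.NextOrFailure(n.Name(), n.Next, ctx, w, r)
	}

	ips := n.lookup(state.QName()) // e.g. "foo.default.nomad." -> allocation addresses
	if len(ips) == 0 {
		return plugin.NextOrFailure(n.Name(), n.Next, ctx, w, r)
	}

	m := new(dns.Msg)
	m.SetReply(r)
	m.Authoritative = true
	for _, ip := range ips {
		m.Answer = append(m.Answer, &dns.A{
			Hdr: dns.RR_Header{Name: state.QName(), Rrtype: dns.TypeA, Class: dns.ClassINET, Ttl: 30},
			A:   net.ParseIP(ip),
		})
	}
	w.WriteMsg(m)
	return dns.RcodeSuccess, nil
}

// Name implements plugin.Handler.
func (n Nomad) Name() string { return "nomad" }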

mr-karan commented 1 year ago

Update: I've released the plugin here: https://github.com/mr-karan/coredns-nomad/.

Also sent a PR upstream to see if it can be merged into CoreDNS :)

aofei commented 1 year ago

Hi @mr-karan, thanks for opening https://github.com/coredns/coredns/pull/5833.

I was thinking maybe the Nomad CoreDNS plugin could do something more powerful. When querying Consul services over DNS, I always enjoy its <datacenter> and <tag> filters. Maybe the query format of the Nomad CoreDNS plugin could be like this:

[<tag>.]<service>.service.<namespace>[.<datacenter>].nomad

And I'm sure both filters can be implemented using the existing Nomad API.
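For example (hypothetical names, following the format above):

dig A primary.postgres.service.default.nomad
dig SRV postgres.service.default.dc1.nomad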

(Comment here instead of https://github.com/coredns/coredns/pull/5833 since this issue has more engagement and details.)

m1keil commented 1 year ago

Until there's a solution for this, a bit of Go template foo can go a long way: https://gist.github.com/m1keil/d0ef68c4277712a5b0ce2cf74743f18e
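The general shape of that approach, for anyone who doesn't want to click through (only a sketch of the idea, not the gist itself; service and file names are illustrative):

template {
  data = <<EOF
upstream app {
{{- range nomadService "app" }}
  server {{ .Address }}:{{ .Port }};
{{- end }}
}
EOF
  destination   = "local/upstreams.conf"
  change_mode   = "signal"
  change_signal = "SIGHUP"
}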

ttys3 commented 7 months ago

So, currently we have:

https://github.com/mr-karan/coredns-nomad (24 stars)

https://github.com/ituoga/coredns-nomad (1 star)

https://github.com/mikenomitch/coredns-nomad (0 stars)

an issue: https://github.com/coredns/coredns/issues/5829 and a PR: https://github.com/coredns/coredns/pull/5833

a gist: https://gist.github.com/m1keil/d0ef68c4277712a5b0ce2cf74743f18e

I do not know which one is the best workaround.


Update: I just tried editing plugin.cfg and appending

nomad:github.com/mr-karan/coredns-nomad
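and then rebuilding CoreDNS from a checkout of the coredns repo (roughly the standard external-plugin procedure; adjust to your setup):

go get github.com/mr-karan/coredns-nomad
go generate   # regenerates the plugin registration code from plugin.cfg
go build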

and it works.

❯ ./coredns -plugins | grep -i nomad
  dns.nomad

Edit /etc/coredns/Corefile. Remember to replace SET-YOUR-REAL-LOCAL-DNS-HERE with your own local DNS server and YOUR-MACHINE'S-REAL-LOCAL-LAN-IP with your machine's LAN IP.

.:53 {
  #log
  errors
  cache
  whoami

  # https://coredns.io/plugins/bind/
  bind YOUR-MACHINE'S-REAL-LOCAL-LAN-IP
  bind 127.0.0.1
  #bind lo

  # https://coredns.io/plugins/forward/
  forward . SET-YOUR-REAL-LOCAL-DNS-HERE:53
  #forward . 127.0.0.1:5354

  # https://coredns.io/plugins/metrics/
  # The metrics path is fixed to /metrics
  prometheus localhost:9153
  health :8053
}

nomad:53 {
  log
  errors
  bind YOUR-MACHINE'S-REAL-LOCAL-LAN-IP
  bind 127.0.0.1

  nomad {
    address http://127.0.0.1:4646
  }
  cache 5
}

sudo systemctl restart coredns

Edit /etc/systemd/resolved.conf and set DNS=YOUR-MACHINE'S-REAL-LOCAL-LAN-IP:53.
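That is, something like this in the [Resolve] section (the :53 port suffix needs a fairly recent systemd; a plain IP also works):

[Resolve]
DNS=YOUR-MACHINE'S-REAL-LOCAL-LAN-IP:53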

sudo systemctl restart systemd-resolved

Edit the Nomad job file:

service {
  name         = "foo"
  provider     = "nomad"
  address_mode = "alloc"

  # ...
}

The most important settings are address_mode = "alloc" and provider = "nomad".

now let's test:

dig foo.default.nomad @YOUR-MACHINE'S-REAL-LOCAL-LAN-IP -p 53

and it works!

And testing again with the default resolver:

dig foo.default.nomad

it also works!

And we have to use port 53 for CoreDNS, because we currently can't set a non-53 DNS port for a container via the driver (I'm using the podman driver).

slonopotamus commented 7 months ago

I'm not sure how that helps given that you not only need to know the IP but also the port that Nomad normally assigns randomly. In Kubernetes, this is solved by the fact that each Service runs on a separate IP, so you can use fixed port.

Consul DNS provides SRV records that can tell you both IP and port. But almost no software knows how to use SRV records.
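For example, with Consul the port comes back in the SRV answer itself (names and output are illustrative):

dig +short SRV myservice.service.consul
# 1 1 27015 0a000552.addr.dc1.consul.   (priority weight port target)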

ttys3 commented 7 months ago

> I'm not sure how that helps given that you not only need to know the IP but also the port that Nomad normally assigns randomly. In Kubernetes, this is solved by the fact that each Service runs on a separate IP, so you can use fixed port.
>
> Consul DNS provides SRV records that can tell you both IP and port. But almost no software knows how to use SRV records.

Like I said, the most important config is address_mode = "alloc".

It tells Nomad to register the container's bridge IP for the service, not the machine (node) IP. So it doesn't matter which random port Nomad assigned, because we don't need to use it.

All services in the cluster communicate over the Nomad CNI network, not the machine's network.

So we do not use the random port at all.
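A minimal sketch of that setup (names, image, and port are just examples): the group joins a bridge/CNI network, the task listens on its usual fixed port inside the allocation, and the service advertises the alloc IP plus that port.

group "cache" {
  network {
    mode = "bridge"
  }

  service {
    name         = "redis"
    provider     = "nomad"
    address_mode = "alloc"
    port         = "6379"   # the fixed in-alloc port, no host port mapping needed
  }

  task "redis" {
    driver = "podman"

    config {
      image = "docker.io/library/redis:7"
    }
  }
}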

Before this, we were using Nomad + the Consul catalog + flannel (a layer 3 network fabric) + Consul DNS + CoreDNS + a hacked podman driver (see https://github.com/hashicorp/nomad-driver-podman/pull/304/files).

It is quite stable and has been in production for years.

The reason we have to use a hacked podman driver is that all the containers (pods) on every machine (node) must resolve service names through the same CoreDNS.

CoreDNS proxies *.consul name resolution to Consul DNS and forwards everything else to public DNS servers.

SamMousa commented 5 months ago

Regardless of the type of record (SRV / A), could/should this not be a separate service that you deploy on the cluster? All it has to do is essentially be a translation proxy that translates DNS requests into Nomad API calls (optionally with caching, etc.).

NiklasPor commented 3 months ago

> Regardless of the type of record (SRV / A), could/should this not be a separate service that you deploy on the cluster? All it has to do is essentially be a translation proxy that translates DNS requests into Nomad API calls (optionally with caching, etc.).

That's also what I'm currently experimenting with. We've got a modified version of the mentioned gist running with a CoreDNS container in a job/service. Then we grab the IP of the DNS service in a template, put it into an environment variable, and feed that into the consuming jobs, but I'm not sure whether this setup would survive a crash of the DNS host.
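Roughly, the consuming jobs get something like this (a sketch; "coredns" is whatever the DNS job's service happens to be named, and the env var name is ours):

template {
  data = <<EOF
{{- range nomadService "coredns" }}
DNS_SERVER={{ .Address }}
{{- end }}
EOF
  destination = "local/dns.env"
  env         = true
}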

I guess that's why it's better if every agent takes care of the DNS setup itself?

I also couldn't yet figure out how the address_mode = "alloc" would work without mentioning explicit ports.