hashicorp / terraform-aws-consul

A Terraform Module for how to run Consul on AWS using Terraform and Packer
Apache License 2.0

Module does not support well-behaved reverse lookup under systemd-resolved #155

Open tpdownes opened 4 years ago

tpdownes commented 4 years ago

The current state of consul support for systemd-resolved is a bit funky: forward resolution works well and simply, but reverse lookup of an IP will, a small percentage of the time, return a .consul domain, and otherwise return whatever the other DNS resolvers say. See e.g. https://github.com/hashicorp/consul/issues/6462.

https://github.com/hashicorp/consul/pull/6731 provides a solution that ensures every reverse lookup of an IP address known to consul results in a .consul domain.

It has the added behavior (perhaps undesirable) of reverse lookup failing on IP addresses not known to consul unless one also configures the recursors option. e.g.

{
...
    "recursors": ["10.0.0.2"],
...
}

I think it would be reasonable to consider adopting this configuration in the consul-cluster module and intend this issue to be a starting point for the conversation.

Thanks for the module as it has facilitated very rapid progress on my side!

brikis98 commented 4 years ago

Thanks for reporting!

> It has the added behavior (perhaps undesirable) of reverse lookup failing on IP addresses not known to consul

What's the default behavior?

> unless one also configures the recursors option. e.g.

What value would you plug into recursors?

tpdownes commented 4 years ago

The behavior of the documented systemd-resolved solution is described in https://github.com/hashicorp/consul/issues/6462. This is what's now implemented in the module.

TL;DR: most of the time reverse resolution of all IPs goes through whatever the system has been configured to use. A small percentage of the time reverse resolution goes through consul. That is to say, reverse lookup is not predictable in the documented systemd-resolved solution.

https://github.com/hashicorp/consul/pull/6731 is predictable in that, for a set of configured subnets, systemd-resolved will reliably send reverse lookups to the consul agent. For other subnets it will probably revert to the documented behavior (most of the time the system DNS, some of the time consul). The cost is that the consul agent will fail reverse lookups for IPs that fall within the configured subnets but are not known in the .consul domain.

It's perfectly reasonable not to put consul agents on every machine in one's infrastructure.

You can solve this by placing a DNS server with appropriate reverse lookup capability into recursors. In most situations, this probably means whatever DNS server you use for forward lookups outside of the .consul domain. i.e., whatever DNS servers you got from DHCP / static config.

In this case, systemd-resolved will return .consul domains when performing a reverse lookup on IPs in the consul network and will return whatever the upstream DNS server says for other IPs.
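To make the routing idea concrete, here is a minimal sketch of the decision being described: a reverse lookup goes to the consul agent only if the IP's PTR name falls under one of the configured reverse zones. The function name `routed_to_consul` and the zone list are hypothetical, and this is a simplified model of suffix matching, not systemd-resolved's actual implementation.

```python
import ipaddress
from typing import Iterable


def routed_to_consul(ip: str, routing_zones: Iterable[str]) -> bool:
    """Return True if the PTR name for `ip` falls under a configured zone.

    `routing_zones` holds reverse zones without the leading "~",
    e.g. "10.in-addr.arpa". This is an illustrative model only.
    """
    # reverse_pointer turns "10.1.2.3" into "3.2.1.10.in-addr.arpa".
    name = ipaddress.ip_address(ip).reverse_pointer
    # Match whole DNS labels only, so "110.in-addr.arpa" never
    # matches the zone "10.in-addr.arpa".
    return any(name == zone or name.endswith("." + zone) for zone in routing_zones)


if __name__ == "__main__":
    zones = ["10.in-addr.arpa"]
    print(routed_to_consul("10.1.2.3", zones))  # inside the consul network
    print(routed_to_consul("8.8.8.8", zones))   # left to the system resolver
```

With recursors set, an IP in the first category that consul does not know about would be forwarded upstream by the agent rather than failing.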

Make sense?

brikis98 commented 4 years ago

Got it, thanks for the context. So, to summarize, you are proposing that for systems using systemd-resolved we update the run-consul script to be able to add the following to resolved.conf:

DNS=127.0.0.1
Domains=~consul ~<CIDR>.in-addr.arpa

Where <CIDR> is passed in via a new param to the script... As well as add the following to the consul config:

    "recursors": ["<DNS_SERVER>"],

Where <DNS_SERVER> is also passed in via a new param to the script.

Is that right?
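For illustration, the two snippets summarized above could be rendered from passed-in parameters roughly like this. This is a sketch only: the function names are hypothetical, the real run-consul script is a shell script, and the `[Resolve]` section header is added here because resolved.conf settings live under it.

```python
import json
from typing import Iterable


def render_resolved_conf(reverse_zones: Iterable[str]) -> str:
    """Build resolved.conf content that routes ~consul plus reverse zones.

    `reverse_zones` are already in in-addr.arpa form, e.g. "10.in-addr.arpa".
    """
    domains = ["~consul"] + ["~" + zone for zone in reverse_zones]
    return "[Resolve]\nDNS=127.0.0.1\nDomains=" + " ".join(domains) + "\n"


def render_consul_recursors(dns_servers: Iterable[str]) -> str:
    """Build the JSON fragment to merge into the consul agent config."""
    return json.dumps({"recursors": list(dns_servers)}, indent=2)


if __name__ == "__main__":
    print(render_resolved_conf(["10.in-addr.arpa"]))
    print(render_consul_recursors(["10.0.0.2"]))
```

Both inputs stay optional in spirit: an empty zone list yields only `~consul`, which matches the currently documented setup.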

tpdownes commented 4 years ago

I'm definitely proposing the first thing, with the caveat that it's actually the "backwards truncated CIDR" (e.g. 10.in-addr.arpa for 10.0.0.0/8) and it should be opt-in. There could also be multiple CIDRs.
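As a sketch of what "backwards truncated CIDR" means in practice, the conversion from a CIDR to reverse zone names could look like the following. The function name `reverse_zones` is hypothetical, it handles IPv4 only, and splitting a prefix that is not octet-aligned into several zones is one possible choice rather than anything the module does.

```python
import ipaddress
import math
from typing import List


def reverse_zones(cidr: str) -> List[str]:
    """Convert an IPv4 CIDR into one or more in-addr.arpa zone names."""
    net = ipaddress.ip_network(cidr, strict=True)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4")
    # in-addr.arpa zones exist on octet boundaries, so round the prefix
    # up to the next multiple of 8 and enumerate the resulting subnets.
    aligned = math.ceil(net.prefixlen / 8) * 8
    if aligned == net.prefixlen:
        subnets = [net]
    else:
        subnets = list(net.subnets(new_prefix=aligned))
    zones = []
    for sub in subnets:
        # Keep only the octets covered by the prefix, then reverse them.
        octets = str(sub.network_address).split(".")[: aligned // 8]
        zones.append(".".join(list(reversed(octets)) + ["in-addr", "arpa"]))
    return zones


if __name__ == "__main__":
    print(reverse_zones("10.0.0.0/8"))     # ['10.in-addr.arpa']
    print(reverse_zones("172.31.0.0/16"))  # ['31.172.in-addr.arpa']
    print(reverse_zones("10.1.0.0/23"))    # two /24 zones
```

Each returned name would then get a `~` prefix in the Domains= line, and several CIDRs simply contribute several entries.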

For the second thing, I'm just outlining the pros/cons. You could imagine mimicking the dnsmasq behavior of using servers from /etc/resolv.conf by having run-consul automatically set recursors by parsing the output of systemd-resolve (or resolvectl on ultra-contemporary systems) or networkctl. I'm not really sure what is right these days.
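One way the automatic option might look, as a rough sketch: read a resolv.conf-format file and keep the non-loopback nameserver entries as recursors. The function name is hypothetical, and the choice of file is an assumption: on systemd-resolved hosts /etc/resolv.conf often points at the local stub (127.0.0.53), so the upstream list in /run/systemd/resolve/resolv.conf is probably the one to read. Parsing resolvectl or networkctl output is not attempted here.

```python
import ipaddress
from typing import List


def upstream_nameservers(resolv_conf_text: str) -> List[str]:
    """Extract non-loopback nameserver addresses from resolv.conf-format text."""
    servers: List[str] = []
    for raw in resolv_conf_text.splitlines():
        # Strip comments, which may start with "#" or ";".
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        try:
            addr = ipaddress.ip_address(fields[1])
        except ValueError:
            continue  # skip anything that is not a plain IP address
        # Loopback entries (the local stub, or consul itself) would
        # create a forwarding loop if used as recursors.
        if addr.is_loopback:
            continue
        if str(addr) not in servers:
            servers.append(str(addr))
    return servers


if __name__ == "__main__":
    sample = "nameserver 127.0.0.53\nnameserver 10.0.0.2\noptions edns0\n"
    print(upstream_nameservers(sample))  # ['10.0.0.2']
```

The result could be passed straight into recursors when the user opts in without naming a DNS server explicitly.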

brikis98 commented 4 years ago

(sorry for delay, we were all away for a company offsite)

Roger. I think if both options are opt-in based on passed-in params, this makes sense to add. A PR is very welcome!