hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Nomad Consul connect "dial tcp: lookup unix: Temporary failure in name resolution" #7977

Open sparkacus opened 4 years ago

sparkacus commented 4 years ago

Nomad v0.11.1 (b43457070037800fcc8442c8ff095ff4005dab33)

I've run into the following problem:

May 15 11:18:25 ip-10-1-89-245 nomad[39744]:     2020-05-15T11:18:25.141Z [ERROR] client.alloc_runner.runner_hook: error connecting to grpc: alloc_id=0de4add2-2b81-fed5-7cd6-a05a168e26d6 error="dial tcp: lookup unix: Temporary failure in name resolution" dest=unix:8502

Example Consul client config:

# Support for Consul Connect
ports {
  "grpc" = 8502
}

connect {
  enabled = true
}

addresses {
  http = "unix:///tmp/consul.sock"
  grpc = "unix:///tmp/consul-grpc.sock"
}

Example Nomad config:

consul {
  address = "unix:///tmp/consul.sock"
}

From my understanding, to enable GRPC in Consul I must define a port for it.

Does Nomad support connecting to Consul's gRPC endpoint over a Unix socket?

tgross commented 4 years ago

Nomad does support accessing Consul's HTTP interface over a Unix socket: https://www.nomadproject.io/docs/configuration/consul/#address

But for Connect's gRPC, I took a look at consulsock_hook.go#L146-L159 and it looks like we're expecting an address:port pair. I'm not sure whether this is an intentional limitation, so I'm going to tag in my colleagues @nickethier or @shoenig to see if they have more insight.
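The `dest=unix:8502` in the log above is what you would get if a Unix socket URL were run through ordinary host:port splitting. The Go sketch below illustrates that; it is not Nomad's actual code. `deriveGRPCAddr` is a hypothetical helper, and the assumption that the gRPC address is derived from the HTTP address by swapping in port 8502 is inferred from the log output, not confirmed from the source.

```go
package main

import (
	"fmt"
	"net"
)

// deriveGRPCAddr is a hypothetical helper (an assumption for illustration,
// not taken from Nomad). It keeps the host part of the HTTP address and
// swaps in the gRPC port 8502.
func deriveGRPCAddr(httpAddr string) (string, error) {
	// SplitHostPort splits on the last colon and does not check that the
	// port is numeric, so "unix:///tmp/consul.sock" yields host "unix".
	host, _, err := net.SplitHostPort(httpAddr)
	if err != nil {
		return "", err
	}
	return net.JoinHostPort(host, "8502"), nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:8500", "unix:///tmp/consul.sock"} {
		grpcAddr, err := deriveGRPCAddr(addr)
		fmt.Printf("%s -> %s (err=%v)\n", addr, grpcAddr, err)
	}
}
```

Dialing the second result as TCP would treat `unix` as a hostname, which matches the "lookup unix" name resolution failure in the report.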

sparkacus commented 4 years ago

@tgross Thanks.

In the end I had to revert to TCP; otherwise I can't use Consul Connect in Nomad.

Would it be an option to expand the Nomad Consul stanza to allow defining the http and grpc addresses in the same way you can in the Consul config?

shoenig commented 4 years ago

> From my understanding, to enable GRPC in Consul I must define a port for it.

I believe it should be possible to use gRPC with Consul over a Unix socket:

> http, https and grpc all support binding to a Unix domain socket.

https://www.consul.io/docs/agent/options#addresses

I think you may have the right idea, @sparkacus. Right now we just assume Consul is configured with all its listeners bound to the same address, which might not be the case.

sparkacus commented 4 years ago

If it is possible, that would be great.

I planned to use Traefik, which supports connecting to Consul over a Unix socket that I can then pass through as a volume mount. This means the Traefik tasks do not need to use the host network to communicate with Consul (I know there are workarounds).

jhitt25 commented 3 years ago

We are using Consul 1.10.1 and Nomad 1.1.3 and just hit this exact same issue with our cluster. We expose Consul's HTTP API via local socket only as well.

Our consul stanza read:

consul {
  address = "unix:///run/consul/consul.sock"
}

This generated errors in the Nomad log like:

{"@level":"error","@message":"error connecting to grpc","@module":"client.alloc_runner.runner_hook","@timestamp":"2021-08-24T14:36:29.123890-05:00","alloc_id":"b6466363-52ee-e3a2-9865-932fac5ccf81","dest":"unix:8502","error":"dial tcp: lookup unix on 10.60.10.140:53: no such host"}

Explicitly setting grpc_address as:

consul {
  address      = "unix:///run/consul/consul.sock"
  grpc_address = "127.0.0.1:8502"
}

let everything start working properly. I have yet to try moving gRPC to a Unix socket as well, but I doubt it will be a problem. This seems to be just an issue of an improper default when consul.address is not an ip:port pair and grpc_address is unspecified.
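One way to picture a safer default is sketched below in Go. It is only an illustration of the behaviour described above, not how Nomad handles or has fixed this. `splitTarget` and `resolveGRPC` are hypothetical names. The choice to return an error when the HTTP address is a Unix socket and no gRPC address is given is an assumption about what a reasonable default might be.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

const unixPrefix = "unix://"

// splitTarget (hypothetical) turns a configured address into the
// network/address pair that net.Dial expects.
func splitTarget(addr string) (string, string) {
	if strings.HasPrefix(addr, unixPrefix) {
		return "unix", strings.TrimPrefix(addr, unixPrefix)
	}
	return "tcp", addr
}

// resolveGRPC (hypothetical) prefers an explicit gRPC address and only
// derives host:8502 when the HTTP address really is a host:port pair.
func resolveGRPC(httpAddr, grpcAddr string) (string, string, error) {
	if grpcAddr != "" {
		network, address := splitTarget(grpcAddr)
		return network, address, nil
	}
	if strings.HasPrefix(httpAddr, unixPrefix) {
		// Assumption: a socket path gives no hint where gRPC listens,
		// so fail loudly instead of guessing "unix:8502".
		return "", "", errors.New("consul address is a Unix socket; set grpc_address explicitly")
	}
	host, _, err := net.SplitHostPort(httpAddr)
	if err != nil {
		return "", "", err
	}
	return "tcp", net.JoinHostPort(host, "8502"), nil
}

func main() {
	cases := [][2]string{
		{"127.0.0.1:8500", ""},
		{"unix:///run/consul/consul.sock", "127.0.0.1:8502"},
		{"unix:///run/consul/consul.sock", ""},
	}
	for _, c := range cases {
		network, address, err := resolveGRPC(c[0], c[1])
		fmt.Printf("address=%q grpc_address=%q -> %s %s (err=%v)\n", c[0], c[1], network, address, err)
	}
}
```

With this shape, the working configuration from this comment (socket for HTTP, explicit `grpc_address`) resolves to a TCP dial of 127.0.0.1:8502, while the failing configuration produces a clear configuration error instead of a DNS lookup for `unix`.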