Closed pratikbin closed 3 years ago
Hi @pratikbalar, sorry you're having trouble!
I suspect you still need to open port 8502 on Consul, which is a requirement for making Connect work.
https://www.nomadproject.io/docs/integrations/consul-connect#consul
(fwiw, imho Consul should automatically listen on 8502 if connect is enabled, but currently it does not)
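For reference, a minimal sketch of a Consul agent config with the gRPC port opened. The `ports` stanza is the relevant part; the file name, `data_dir`, and `bind_addr` are placeholder values for a local test, not a recommended production setup.

```shell
# Write a minimal Consul agent config with Connect enabled and the gRPC
# (xDS) port opened. File name, data_dir and bind_addr are example values.
cat > consul.hcl <<'EOF'
connect {
  enabled = true
}

ports {
  grpc = 8502
}

data_dir  = "/tmp/consul"
bind_addr = "127.0.0.1"
EOF
```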
Thanks for the quick reply @shoenig ,
I configured gRPC, but now I'm getting the errors below in Consul. FYI, I'm using Traefik with the Consul catalog; if you can convince me to shift to Envoy, I'm up for it :smile:
Apr 23 22:35:16 ctos consul[18346]: 2021-04-23T22:35:16.372+0530 [ERROR] agent.envoy: Error handling ADS stream: error="rpc error: code = InvalidArgument desc = Envoy 1.11.2 is too old and is not supported by Consul"
Apr 23 22:35:21 ctos consul[18346]: 2021-04-23T22:35:21.628+0530 [WARN] agent: Check socket connection failed: check=service:_nomad-task-69081a64-8417-07ef-f004-3ff2f972ad38-group-api-count-api-9001-sidecar-proxy:1 error="dial tcp 192.168.43.54:22377: connect: connection refused"
Apr 23 22:35:21 ctos consul[18346]: 2021-04-23T22:35:21.629+0530 [WARN] agent: Check is now critical: check=service:_nomad-task-69081a64-8417-07ef-f004-3ff2f972ad38-group-api-count-api-9001-sidecar-proxy:1
Apr 23 22:35:22 ctos consul[18346]: 2021-04-23T22:35:22.581+0530 [ERROR] agent.envoy: Error handling ADS stream: error="rpc error: code = InvalidArgument desc = Envoy 1.11.2 is too old and is not supported by Consul"
Apr 23 22:35:24 ctos consul[18346]: 2021-04-23T22:35:24.332+0530 [WARN] agent: Check socket connection failed: check=service:_nomad-task-81e3bc69-6cdb-7a01-aeae-8349109d6853-group-dashboard-count-dashboard-9002-sidecar-proxy:1 error="dial tcp 192.168.43.54:20646: connect: connection refused"
Apr 23 22:35:24 ctos consul[18346]: 2021-04-23T22:35:24.332+0530 [WARN] agent: Check is now critical: check=service:_nomad-task-81e3bc69-6cdb-7a01-aeae-8349109d6853-group-dashboard-count-dashboard-9002-sidecar-proxy:1
Apr 23 22:35:31 ctos consul[18346]: 2021-04-23T22:35:31.530+0530 [ERROR] agent.envoy: Error handling ADS stream: error="rpc error: code = InvalidArgument desc = Envoy 1.11.2 is too old and is not supported by Consul"
Apr 23 22:35:31 ctos consul[18346]: 2021-04-23T22:35:31.629+0530 [WARN] agent: Check socket connection failed: check=service:_nomad-task-69081a64-8417-07ef-f004-3ff2f972ad38-group-api-count-api-9001-sidecar-proxy:1 error="dial tcp 192.168.43.54:22377: connect: connection refused"
Apr 23 22:35:31 ctos consul[18346]: 2021-04-23T22:35:31.629+0530 [WARN] agent: Check is now critical: check=service:_nomad-task-69081a64-8417-07ef-f004-3ff2f972ad38-group-api-count-api-9001-sidecar-proxy:1
Apr 23 22:35:32 ctos consul[18346]: 2021-04-23T22:35:32.417+0530 [ERROR] agent.envoy: Error handling ADS stream: error="rpc error: code = InvalidArgument desc = Envoy 1.11.2 is too old and is not supported by Consul"
Apr 23 22:35:34 ctos consul[18346]: 2021-04-23T22:35:34.333+0530 [WARN] agent: Check socket connection failed: check=service:_nomad-task-81e3bc69-6cdb-7a01-aeae-8349109d6853-group-dashboard-count-dashboard-9002-sidecar-proxy:1 error="dial tcp 192.168.43.54:20646: connect: connection refused"
Apr 23 22:35:34 ctos consul[18346]: 2021-04-23T22:35:34.333+0530 [WARN] agent: Check is now critical: check=service:_nomad-task-81e3bc69-6cdb-7a01-aeae-8349109d6853-group-dashboard-count-dashboard-9002-sidecar-proxy:1
Apr 23 22:35:38 ctos consul[18346]: 2021-04-23T22:35:38.641+0530 [ERROR] agent.envoy: Error handling ADS stream: error="rpc error: code = InvalidArgument desc = Envoy 1.11.2 is too old and is not supported by Consul"
Apr 23 22:35:41 ctos consul[18346]: 2021-04-23T22:35:41.630+0530 [WARN] agent: Check socket connection failed: check=service:_nomad-task-69081a64-8417-07ef-f004-3ff2f972ad38-group-api-count-api-9001-sidecar-proxy:1 error="dial tcp 192.168.43.54:22377: connect: connection refused"
We're actually huge fans of traefik! They're about to add a native integration with Consul Connect, something I'm at least super excited for :slightly_smiling_face:
The "Envoy 1.11.2 is too old and is not supported by Consul" error is unexpected; Nomad v1.0+ should automatically use the newest version of Envoy supported by the Consul agent on each node. (Note that all clients, not just servers, need to be up to date.)
A few reasons that might not happen:
- Nomad clients are not v1.0 or higher
- meta.connect.sidecar_image is set to that version of envoy
- you're using a custom sidecar_task with that version of envoy
Do any of those conditions match your environment?
> We're actually huge fans of traefik! They're about to add a native integration with Consul Connect, something I'm at least super excited for :slightly_smiling_face:
That's cool, wop wop traefik
> Nomad clients are not v1.0 or higher
I'm using a single machine with the same version of the Nomad client and server, which is v1.0.4. Same with Consul: v1.9.4.
> meta.connect.sidecar_image is set to that version of envoy
What should I configure this to?
> you're using a custom sidecar_task with that version of envoy
I'm just following https://www.nomadproject.io/docs/integrations/consul-connect, where no explicit Envoy version is added in the job config.
Interesting, can you show the output when making this curl request to the Consul client? E.g.,
$ curl -s localhost:8500/v1/agent/self | jq -r .xDS
{
  "SupportedProxies": {
    "envoy": [
      "1.16.2",
      "1.15.3",
      "1.14.6",
      "1.13.7"
    ]
  }
}
And with trace-level logging enabled, what do you see in the Nomad client log line that starts with "setting task envoy image"?
You can enable trace logging with -log-level=TRACE or in agent config.
> what should I configure this to?
The meta.connect.sidecar_image can be explicitly set to any image that runs envoy, typically one of the official ones published to docker hub (it would need to be a version of Envoy supported by your version of Consul).
Doing so shouldn't be necessary though; Nomad v1.0 and later query Consul using that /agent/self endpoint above to determine which version of Envoy to use, falling back to that v1.11.2 version if Consul is too old to include the xDS blob in the response payload.
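The selection-with-fallback behaviour described here can be approximated from the command line. This is only a sketch, not Nomad's actual code: `pick_envoy_version` is a hypothetical helper, and it assumes the version list is ordered newest first, as in the sample output in this thread.

```shell
# Hypothetical helper (not Nomad's implementation): read the JSON body of
# /v1/agent/self on stdin and print the first Envoy version Consul lists,
# or the 1.11.2 fallback when the xDS blob is null or absent.
pick_envoy_version() {
  jq -r '.xDS.SupportedProxies.envoy[0] // "1.11.2"'
}
```

Usage against a local agent would be `curl -s localhost:8500/v1/agent/self | pick_envoy_version`.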
I'll try this today
Wait a minute!! Now it's running, thanks @shoenig I guess :smile:
I'll reopen or comment here if I find something or get stuck
Glad it works @pratikbalar !
Actually I think I finally realized what happened - when you first launched Consul without the grpc port set, Consul did not include the xDS blob in the /v1/agent/self response. When Nomad sees the lack of that blob, it defaults to that outdated version of Envoy as described above.
$ cat consul.hcl
connect {
  enabled = true
}
data_dir = "/tmp/consul"
bind_addr = "127.0.0.1"
$ curl -s localhost:8500/v1/agent/self | jq -r .xDS
null
After fixing the port problem, unless the job is recycled Nomad will relaunch the task in-place without making additional queries to Consul, since it just assumes it would have gotten the same response, thus the task just keeps failing.
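Recycling the job might look like the sketch below. `recycle_job` is a hypothetical helper, and the job name and job file are whatever you originally submitted; note that stopping the job takes its running allocations down.

```shell
# Hypothetical helper: stop and resubmit a job so its tasks are created
# fresh instead of being relaunched in place.
# $1 = job name, $2 = path to the job file (both supplied by the caller).
recycle_job() {
  job_name="$1"
  job_file="$2"
  nomad job stop -purge "$job_name" && nomad job run "$job_file"
}
```

For example, `recycle_job countdash connect.nomad`, where both names are placeholders for your own job.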
If someone stumbles upon this:
I had the same issue and realized that one of my Consul clients was missing the port configuration for gRPC.
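To spot a client in that state, one option is to check each agent's /v1/agent/self response for the xDS blob. The following is a sketch under assumptions: `check_xds` is a hypothetical helper, and it expects each agent's HTTP API to be reachable at the address you pass in.

```shell
# Hypothetical helper: report whether a Consul agent's /v1/agent/self
# response carries the xDS blob. $1 = agent HTTP address (host:port).
check_xds() {
  addr="$1"
  # jq -e exits non-zero when the expression yields false or no output.
  if curl -s "http://${addr}/v1/agent/self" | jq -e '.xDS != null' >/dev/null; then
    echo "${addr}: xDS present"
  else
    echo "${addr}: xDS missing (is ports.grpc set?)"
  fi
}
```

Run it once per client, e.g. `check_xds 127.0.0.1:8500`, substituting your own agent addresses.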
Nomad version
Nomad v1.0.4 (9294f35f9aa8dbb4acb6e85fa88e3e2534a3e41a)
Consul v1.9.4 Revision 10bb6cb3b
CNI v9.0.0
Operating system and Environment details
Arch Linux Manjaro 21 on Intel i510xxx 16GB RAM
Issue
https://www.nomadproject.io/docs/integrations/consul-connect When testing consul connect following above official guide on local getting
Reproduction steps
Testing Nomad and Consul with these configs
consul
Expected Result
Sidecar should run as per official docs https://www.nomadproject.io/docs/integrations/consul-connect
Actual Result
Job file (if appropriate)
Nomad Server logs (if appropriate)
Nomad Client logs (if appropriate)
Consul