Closed lbik closed 11 months ago
I'm getting the same error with my terminating gateway.
Hi @lbik 👋
Unfortunately I don't immediately see anything wrong with your setup 🤔
You mentioned TLS being enabled, do you mean that this worked without TLS?
Hi,
Hi @lgfa29
I'm really sorry for my late response.
When I tried to reproduce this issue in our unsecured cluster, I found that everything works as expected when the default Envoy image is pulled. I then checked which Envoy image our private Docker registry serves and found that it was envoy:distroless. So TLS had no effect.
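For anyone who hits the same symptom, here is a minimal sketch of the two ways out described above, based on the `testaccount1` service from the original post. The registry path is the redacted one from that job, and the `<non-distroless-tag>` tag is a hypothetical placeholder, not a real tag name; substitute whatever regular Envoy build your registry mirrors.

```hcl
# Sketch only: the registry path is the redacted one from the original job,
# and the tag below is a hypothetical placeholder for a regular
# (non-distroless) Envoy build.
service {
  name = "testaccount1"
  port = "http"

  connect {
    sidecar_service {
      proxy {
        upstreams {
          destination_name = "sso"
          local_bind_port  = 443
        }
      }
    }

    # Option 1: remove this sidecar_task block so the default Envoy image is pulled.
    # Option 2: keep the private registry but point at a non-distroless image.
    sidecar_task {
      config {
        image = "xxxxxxxxxx/library/envoy:<non-distroless-tag>"
      }
    }
  }
}
```

The same choice applies to the `sidecar_task` of the gateway service.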
No worries @lbik, I'm glad you were able to fix the problem.
Hello guys,
I'm facing an issue with a sidecar proxy in a cluster with TLS enabled. I'm trying to deploy a service that connects through a terminating gateway to a service outside the service mesh. I registered the external service, then deployed a job containing the terminating gateway service and my own service, which I want to run with a sidecar proxy.
Job.hcl
```hcl
job "testaccount1" {
  datacenters = ["dc1"]
  type        = "service"

  # Terminating gateway that fronts the external "sso" service
  group "gateway" {
    network {
      mode = "bridge"
    }

    service {
      name = "sso-gateway"

      connect {
        gateway {
          proxy {}

          terminating {
            service {
              name = "sso"
            }
          }
        }

        sidecar_task {
          config {
            image = "xxxxxxxxxxx/library/envoy"
          }
        }
      }
    }
  }

  # Application service with a sidecar proxy and an upstream to "sso"
  group "testaccount1" {
    count = 1

    network {
      mode = "bridge"

      port "http" {
        to     = 8080
        static = 8080
      }
    }

    service {
      name     = "testaccount1"
      port     = "http"
      provider = "consul"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "sso"
              local_bind_port  = 443
            }
          }
        }

        sidecar_task {
          config {
            image = "xxxxxxxxxx/library/envoy"
          }
        }
      }
    }

    task "testaccount1" {
      driver = "docker"

      env {
      }

      config {
        image = "xxxxxxxxx/account"
        ports = ["http"]

        auth {
          username = xxxxx
          password = xxxxx
        }
      }
    }
  }
}
```
This snippet deploys the terminating gateway and my service with its sidecar proxy. Consul's health check on that sidecar proxy fails with the error `dial tcp 10.4.5.26:25299: connect: connection refused`. In the Envoy sidecar logs I can see this:
envoy logs
```
[2023-10-03 13:32:50.415][1][info][admin] [source/server/admin/admin.cc:66] admin address: 127.0.0.2:19001
[2023-10-03 13:32:50.416][1][info][config] [source/server/configuration_impl.cc:131] loading tracing configuration
[2023-10-03 13:32:50.416][1][info][config] [source/server/configuration_impl.cc:91] loading 0 static secret(s)
[2023-10-03 13:32:50.416][1][info][config] [source/server/configuration_impl.cc:97] loading 1 cluster(s)
[2023-10-03 13:32:50.467][1][info][config] [source/server/configuration_impl.cc:101] loading 0 listener(s)
[2023-10-03 13:32:50.467][1][info][config] [source/server/configuration_impl.cc:113] loading stats configuration
[2023-10-03 13:32:50.468][1][info][runtime] [source/common/runtime/runtime_impl.cc:463] RTDS has finished initialization
[2023-10-03 13:32:50.468][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:221] cm init: initializing cds
[2023-10-03 13:32:50.468][1][warning][main] [source/server/server.cc:802] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
[2023-10-03 13:32:50.469][1][info][main] [source/server/server.cc:923] starting main dispatch loop
[2023-10-03 13:33:29.302][1][warning][config] [./source/common/config/grpc_stream.h:191] DeltaAggregatedResources gRPC config stream to local_agent closed since 38s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: immediate connect error: No such file or directory
[2023-10-03 13:33:45.667][1][warning][config] [./source/common/config/grpc_stream.h:191] DeltaAggregatedResources gRPC config stream to local_agent closed since 55s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: immediate connect error: No such file or directory
[2023-10-03 13:34:08.535][1][warning][config] [./source/common/config/grpc_stream.h:191] DeltaAggregatedResources gRPC config stream to local_agent closed since 78s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: immediate connect error: No such file or directory
[2023-10-03 13:34:16.799][1][warning][config] [./source/common/config/grpc_stream.h:191] DeltaAggregatedResources gRPC config stream to local_agent closed since 86s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: immediate connect error: No such file or directory
[2023-10-03 13:34:17.366][1][warning][config] [./source/common/config/grpc_stream.h:191] DeltaAggregatedResources gRPC config stream to local_agent closed since 86s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: immediate connect error: No such file or directory
```
Based on the last messages in the log above, I started to suspect that gRPC is not working as it should. I have TLS enabled in both Nomad and Consul.
nomad server config
```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
bind_addr  = "0.0.0.0"

server {
  enabled          = true
  bootstrap_expect = 3
  encrypt          = "xxxxxxxxxx"
}

tls {
  http                   = true
  rpc                    = true
  ca_file                = "/etc/pki/nomad/nomad-agent-ca.pem"
  cert_file              = "/etc/pki/nomad/global-server-nomad.pem"
  key_file               = "/etc/pki/nomad/global-server-nomad-key.pem"
  verify_server_hostname = true
  verify_https_client    = true
}

client {
  enabled = false
}

consul {
  address      = "127.0.0.1:8501"
  token        = "xxxxxxxxxxxxx"
  grpc_ca_file = "/etc/pki/consul/consul-agent-ca.pem"
  grpc_address = "127.0.0.1:8503"
  ca_file      = "/etc/pki/consul/consul-agent-ca.pem"
  cert_file    = "/etc/pki/consul/dc1-server-consul-1.pem"
  key_file     = "/etc/pki/consul/dc1-server-consul-1-key.pem"
  ssl          = true
}

acl {
  enabled = true
}
```
consul server config
```hcl
data_dir       = "/opt/consul"
node_name      = "server2"
client_addr    = "0.0.0.0"
bind_addr      = "10.4.5.22"
advertise_addr = "10.4.5.22"

encrypt                 = "xxxxxxxxxxxxxxxxx"
encrypt_verify_incoming = true
encrypt_verify_outgoing = true

ui_config {
  enabled = true
}

rejoin_after_leave = true

verify_incoming        = true
verify_outgoing        = true
verify_server_hostname = true
ca_file                = "/etc/pki/consul/consul-agent-ca.pem"
cert_file              = "/etc/pki/consul/dc1-server-consul-1.pem"
key_file               = "/etc/pki/consul/dc1-server-consul-1-key.pem"

ports = {
  https    = 8501
  http     = 8500
  grpc     = 8502
  grpc_tls = 8503
  dns      = -1
}

acl {
  enabled        = true
  default_policy = "deny"

  tokens {
    default = "xxxxxxxxxxxxx"
  }
}

server           = true
bootstrap_expect = 3

log_level            = "DEBUG"
log_file             = "/var/log/consul/"
log_rotate_max_files = 30
```
used versions
```
Nomad v1.6.2
BuildDate 2023-09-13T16:47:25Z
Revision 73e372ad94033db2ceaf53468b270a31544c23fd
```

```
Consul v1.16.2
Revision 68f81912
Build Date 2023-09-19T19:29:18Z
```
I'm not sure what could be wrong in my case.
Best Regards