hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Upstream connection seems unstable in Mesh network #15345

Closed ventaubain closed 1 year ago

ventaubain commented 1 year ago

Overview of the Issue

I have 2 services in a mesh network. Both run in a Nomad cluster. Service (1) connects to service (2) through an upstream bound to localhost:7000. Service (2) is a SOCKS5 proxy with 3 replicas (default load balancing policy, which behaves correctly), and the requests from (1) are SOCKS5-based. At first, the two services work correctly together, but after some time (the delay is not constant and I have not found a cause yet), the upstream connection seems to fail.

Once this happens, running curl (curl --socks5-hostname localhost:7000 <URL>) returns curl: (97) Unable to receive initial SOCKS5 response. If I restart the sidecar of service (1), the connection is re-established and the upstream works correctly again. With only one container behind the upstream instead of 3, the connection is also unstable, so load balancing alone does not seem to explain the problem.
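The curl check above can also be sketched as a small Python probe that performs only the first step of the SOCKS5 handshake (RFC 1928): it sends the client greeting and looks at the two-byte method-selection reply. This is a minimal sketch, not the method used in this issue; the function names and result strings are hypothetical, and localhost:7000 is simply the upstream address mentioned in the report.

```python
import socket

SOCKS5_VERSION = 0x05
NO_AUTH = 0x00            # "no authentication required" method
NO_ACCEPTABLE = 0xFF      # server rejects every offered method


def build_greeting() -> bytes:
    """Client greeting: VER=5, NMETHODS=1, METHODS=[no auth]."""
    return bytes([SOCKS5_VERSION, 1, NO_AUTH])


def classify_reply(reply: bytes) -> str:
    """Interpret the server's method-selection reply (VER, METHOD)."""
    if len(reply) == 0:
        # Peer closed the connection without answering.
        return "no-response"
    if len(reply) < 2 or reply[0] != SOCKS5_VERSION:
        return "malformed"
    if reply[1] == NO_AUTH:
        return "ok"
    if reply[1] == NO_ACCEPTABLE:
        return "no-acceptable-method"
    return "unexpected-method"


def probe_socks5(host: str = "localhost", port: int = 7000,
                 timeout: float = 3.0) -> str:
    """Open a TCP connection, send the greeting, classify the reply."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(build_greeting())
            reply = sock.recv(2)
    except socket.timeout:
        return "timeout"
    except OSError:
        # Connection refused, reset, unreachable, and similar failures.
        return "socket-error"
    return classify_reply(reply)
```

Calling `probe_socks5()` periodically from the container of service (1) would show when the upstream stops answering the greeting, which is the stage at which curl reports error 97. A result of "timeout" or "no-response" would point at the path between the sidecar and the backend rather than at the SOCKS5 negotiation itself.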

Service (1) also has another upstream to a MySQL container, which shows no (detected) problem.

When the number of active connections (envoy_cluster_upstream_cx_active) goes above 1.02k, the proxy stops working and never recovers unless I restart it. I use the default sidecar configuration, so I have:

    Limits = {
      MaxConnections = 512
      MaxPendingRequests = 512
      MaxConcurrentRequests = 512
    }

These limits could be the cause, but why does the failure occur at 1.02k connections and not at 512?
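To compare the gauge against the configured limit over time, the sidecar's Prometheus-format metrics can be parsed with a few lines of Python. This is a rough sketch under stated assumptions: the metric name envoy_cluster_upstream_cx_active and the limit of 512 come from this report, while the `envoy_cluster_name` label, the function names, and the way the metrics text is obtained are assumptions to adapt to the actual setup.

```python
import re
from collections import defaultdict

METRIC = "envoy_cluster_upstream_cx_active"

# One sample line of the Prometheus text format: name{labels} value
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def active_connections(metrics_text: str,
                       label: str = "envoy_cluster_name") -> dict:
    """Sum the active-connection gauge per cluster label value."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if not match or match.group("name") != METRIC:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        totals[labels.get(label, "<unlabeled>")] += float(match.group("value"))
    return dict(totals)


def over_limit(totals: dict, max_connections: int = 512) -> dict:
    """Return only the clusters whose gauge exceeds the limit."""
    return {name: n for name, n in totals.items() if n > max_connections}
```

Feeding this the scraped metrics text at regular intervals and logging the output of `over_limit` would show whether the gauge really passes 512 before the upstream stops responding, or whether it climbs steadily because connections are never released.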

Reproduction Steps

Steps to reproduce this issue, eg:

Consul info for both Client and Server

Operating system and Environment details

Log Fragments

I can't find any explicit error in the logs.

ventaubain commented 1 year ago

Screenshot from 2022-11-14 17-22-48

There are some "destroy" connections coming from the backend. That seems to be the problem, rather than a bug. How does the sidecar handle these "destroy" events?

ventaubain commented 1 year ago

The problem seems to be internal to the service, not the sidecar or Consul. I am closing the issue. Sorry for the mistake.