hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

sidecar health check fail #11008

Open FunnyYish opened 3 years ago

FunnyYish commented 3 years ago

Consul version: 1.10.1, Envoy version: 1.18.3

We use sidecar service registration:

{
    "Connect": {
        "SidecarService": {
            "Proxy": {
                "Upstreams": [
                    {
                        "DestinationName": "server",
                        "LocalBindPort": 1234
                    }
                ]
            }
        }
    },
    "ID": "client-82",
    "Name": "client",
    "Tags": [],
    "Address": "10.19.128.111",
    "Meta": {
        "secure": "false"
    },
    "Port": 82,
    "Check": {
        "Interval": "10s",
        "HTTP": "http://10.19.128.111:82/actuator/health",
        "Header": {}
    }
}

The sidecar health check fails with "dial tcp 127.0.0.1:21001: connect: connection refused".

Checking Envoy's port 21001, I found that it is bound to the machine IP (10.19.128.111) rather than 127.0.0.1.

How can I make Envoy bind to 127.0.0.1?

jkirschner-hashicorp commented 3 years ago

Hi @FunnyYish,

Can you share the command you used to launch the Envoy sidecar proxy? And is there any guide you were following to set this up?

It's possible you may want to pass a bind address for Envoy to use, such as with -bind-addr="lan:{{ GetPrivateIP }}:21000".

FunnyYish commented 3 years ago

I have solved it. I just removed the Address configuration item.

jkirschner-hashicorp commented 3 years ago

Glad to hear it! Before I close this, do you have any suggestions of what we could improve to make this easier?

FunnyYish commented 3 years ago

Glad to hear it! Before I close this, do you have any suggestions of what we could improve to make this easier?

I think sidecar health check should always dial tcp 127.0.0.1, no matter what the Address is.

blake commented 3 years ago

I think sidecar health check should always dial tcp 127.0.0.1…

This isn't always a valid configuration. For example, when Consul is deployed on Kubernetes, the Consul agent and the application's sidecar proxy run in different Linux network namespaces. The loopback address within the Consul agent's namespace is not the same loopback address as the one in the sidecar proxy's namespace.

In order for the health check to succeed, the Address for the sidecar's health check needs to be an IP address that is routable from the Consul agent. In Kubernetes, this is currently the Pod IP.

FunnyYish commented 3 years ago

I think sidecar health check should always dial tcp 127.0.0.1, no matter what the Address is.

Sorry, I misstated what I meant. As you said, this isn't always a valid configuration; I understand that now. What I really meant is that the sidecar health check should always dial the configured Address rather than 127.0.0.1. In my case, I set Address to "10.19.128.111" and Envoy bound to "10.19.128.111", but the sidecar health check dialed 127.0.0.1.
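For anyone reproducing this, the mismatch can be confirmed outside Consul with a plain TCP dial. The sketch below is not Consul's health check implementation; it only assumes that a TCP check passes when a connection can be opened. The helper name `tcp_check` is hypothetical, and the addresses and port in the usage comment are the ones reported in this issue.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it approximates a TCP health check by attempting
    a single connection and closing it immediately.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


# Usage with the values from this issue (21001 is the sidecar listener port
# seen in the error message; adjust to your own environment):
#   tcp_check("127.0.0.1", 21001)      # what the failing check dialed
#   tcp_check("10.19.128.111", 21001)  # where Envoy was actually listening
```

If the first call returns False and the second returns True, the listener is bound only to the machine IP, which matches the behavior described above.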

blake commented 3 years ago

@FunnyYish After looking at this again, I believe you can configure Consul to health check the non-loopback IP by specifying it under the proxy.local_service_address configuration option.

{
    "Connect": {
        "SidecarService": {
            "Proxy": {
                "local_service_address": "10.19.128.111",
                "Upstreams": [
                    {
                        "DestinationName": "server",
                        "LocalBindPort": 1234
                    }
                ]
            }
        }
    },
    "ID": "client-82",
    "Name": "client",
    "Tags": [],
    "Address": "10.19.128.111",
    "Meta": {
        "secure": "false"
    },
    "Port": 82,
    "Check": {
        "Interval": "10s",
        "HTTP": "http://10.19.128.111:82/actuator/health",
        "Header": {}
    }
}

The application will also need to bind to this IP so that Envoy can forward local connections to the application.
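Since the same IP now appears in three places (Address, local_service_address, and the check URL), one way to keep them consistent is to generate the registration body from a single value. This is only a sketch: `build_registration` is a hypothetical helper, it uses just the fields shown in the JSON above, and it omits Tags, Meta, and Header for brevity.

```python
import json


def build_registration(service_id, name, address, port, upstreams,
                       health_path="/actuator/health"):
    """Build a registration body shaped like the JSON in this thread.

    `upstreams` maps a destination service name to its local bind port.
    The same `address` is reused for Address, local_service_address,
    and the HTTP check URL so the three cannot drift apart.
    """
    return {
        "ID": service_id,
        "Name": name,
        "Address": address,
        "Port": port,
        "Connect": {
            "SidecarService": {
                "Proxy": {
                    "local_service_address": address,
                    "Upstreams": [
                        {"DestinationName": dest, "LocalBindPort": bind_port}
                        for dest, bind_port in upstreams.items()
                    ],
                }
            }
        },
        "Check": {
            "Interval": "10s",
            "HTTP": f"http://{address}:{port}{health_path}",
        },
    }


body = build_registration("client-82", "client", "10.19.128.111", 82,
                          {"server": 1234})
print(json.dumps(body, indent=4))
```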

Out of curiosity, is there a reason that your application binds to the 10.19.128.111 address rather than only binding to the loopback address?

In Consul's Connect security model, we recommend that operators prevent non-Connect traffic to Services by configuring their services to only bind to the loopback IP, and forcing all external traffic to ingress through the sidecar proxy.

FunnyYish commented 3 years ago

Out of curiosity, is there a reason that your application binds to the 10.19.128.111 address rather than only binding to the loopback address?

Thank you for helping me understand Consul Service Mesh better. I'm actually looking for a way to make Spring Cloud Discovery compatible with Consul Mesh. Although the Service Mesh architecture is advanced, systems built on the Spring Cloud framework have difficulty evolving to a Mesh architecture because of Spring Cloud Discovery. I asked Spring Cloud Consul's maintainers whether they plan to support Consul Mesh, but they did not respond. So I started working on my own improvements to Spring Cloud Consul. The main ideas are as follows:

  1. Add configuration items to indicate whether to enable mesh and add configuration items related to upstream.
  2. After mesh is enabled, sidecar configuration items will be added during Spring Cloud service registration. The previous configuration I sent is the modified example.
  3. After mesh is enabled, service discovery returns 127.0.0.1 and the ports configured in upstream instead of addresses from Consul.
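The third idea can be sketched in a language-neutral way as follows. This is not Spring Cloud Consul code; `MeshAwareDiscovery`, `Instance`, and `catalog_lookup` are hypothetical names, and the sketch assumes upstream listeners are bound on 127.0.0.1 as described in the list above.

```python
from dataclasses import dataclass

LOOPBACK = "127.0.0.1"  # assumed bind address of the upstream listeners


@dataclass(frozen=True)
class Instance:
    host: str
    port: int


class MeshAwareDiscovery:
    """Hypothetical resolver that prefers local upstream listeners.

    `upstreams` is a list of dicts shaped like the Proxy.Upstreams entries
    in the registration JSON. `catalog_lookup` is any callable that returns
    instances for a service name the ordinary way (for example, by querying
    Consul); it is used when mesh is disabled or no upstream is configured.
    """

    def __init__(self, upstreams, catalog_lookup, mesh_enabled=True):
        self._local_ports = {
            u["DestinationName"]: u["LocalBindPort"] for u in upstreams
        }
        self._catalog_lookup = catalog_lookup
        self._mesh_enabled = mesh_enabled

    def get_instances(self, service_name):
        port = self._local_ports.get(service_name)
        if self._mesh_enabled and port is not None:
            # Route through the sidecar's local upstream listener.
            return [Instance(LOOPBACK, port)]
        return self._catalog_lookup(service_name)
```

With the upstream from the registration above, a lookup for "server" would yield 127.0.0.1:1234, while any service without a configured upstream falls back to the regular lookup.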

Do you have any suggestions?