hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Connect must be enabled in order to use this endpoint #6890

Open celesteking opened 4 years ago

celesteking commented 4 years ago

Getting this continuously on the client after following the tutorial and trying to switch into production mode:

    2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.3:8300: rpc error making call: Connect must be enabled in order to use this endpoint
    2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.4:8300: rpc error making call: rpc error making call: Connect must be enabled in order to use this endpoint
    2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.3:8300: rpc error making call: Connect must be enabled in order to use this endpoint
    2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.4:8300: rpc error making call: rpc error making call: Connect must be enabled in order to use this endpoint
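
To see at a glance which servers reject which RPC, and whether the underlying cause is the same everywhere, the lines above can be tallied with a small script. This is only a sketch: the regular expression is inferred from the four lines quoted here rather than from any documented log format, and the names `summarize`, `LINE_RE`, and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the four log lines quoted above, not from a
# documented log format; adjust it if your agent logs look different.
LINE_RE = re.compile(
    r'\[ERR\] consul: "(?P<rpc>[\w.]+)" RPC failed to server '
    r'(?P<server>[\d.]+:\d+): (?P<reason>.*)$'
)
WRAPPER = "rpc error making call: "


def summarize(lines):
    """Count failures per (rpc, server) pair and collect distinct root causes."""
    counts = Counter()
    causes = set()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not one of the RPC failure lines
        counts[(match["rpc"], match["server"])] += 1
        # Strip the repeated wrapper prefix so only the final cause remains.
        causes.add(match["reason"].replace(WRAPPER, "").strip())
    return counts, causes


SAMPLE = [
    '2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.3:8300: rpc error making call: Connect must be enabled in order to use this endpoint',
    '2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.4:8300: rpc error making call: rpc error making call: Connect must be enabled in order to use this endpoint',
    '2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.3:8300: rpc error making call: Connect must be enabled in order to use this endpoint',
    '2019/12/05 20:32:34 [ERR] consul: "ConnectCA.Roots" RPC failed to server 192.168.112.4:8300: rpc error making call: rpc error making call: Connect must be enabled in order to use this endpoint',
]

if __name__ == "__main__":
    counts, causes = summarize(SAMPLE)
    for (rpc, server), n in sorted(counts.items()):
        print(f"{rpc} -> {server}: {n} failure(s)")
    for cause in sorted(causes):
        print("cause:", cause)
```

If every server shows up with the same single cause, as it does for this excerpt, the problem is more likely a server-side setting than stale state on the client.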

This endlessly repeating, entirely cryptic message doesn't help at all. I'm trying to start fresh with the client config; the client has connected to the servers and the servers are in sync. I've deleted /consul/data to get rid of stale crap.

consul leave on a client should really delete all stale data, if any was left from a previous run. This is a client, not a server; it should self-destruct and let me start anew.

Client info

```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 1
build:
    prerelease =
    revision = 1200f25e
    version = 1.6.2
consul:
    acl = disabled
    known_servers = 2
    server = false
runtime:
    arch = amd64
    cpu_count = 8
    goroutines = 95
    max_procs = 8
    os = linux
    version = go1.12.13
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 48
    members = 3
    query_queue = 0
    query_time = 1

$ consul members
Node              Address             Status  Type    Build  Protocol  DC    Segment
srv-1             192.168.112.3:8301  alive   server  1.6.2  2         chi2
srv-2             192.168.112.4:8301  alive   server  1.6.2  2         chi2
client1.hk.local  192.168.112.6:8301  alive   client  1.6.2  2         chi2
```

Server info

```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 1200f25e
    version = 1.6.2
consul:
    acl = disabled
    bootstrap = false
    known_datacenters = 1
    leader = true
    leader_addr = 192.168.112.3:8300
    server = true
raft:
    applied_index = 201
    commit_index = 201
    fsm_pending = 0
    last_contact = 0
    last_log_index = 201
    last_log_term = 2
    last_snapshot_index = 0
    last_snapshot_term = 0
    latest_configuration = [{Suffrage:Voter ID:5b9b3670-3d34-b352-f3bc-91dc904ac694 Address:192.168.112.3:8300} {Suffrage:Voter ID:cd2b23a8-db46-1a0d-07b9-9361475f8030 Address:192.168.112.4:8300}]
    latest_configuration_index = 1
    num_peers = 1
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 2
runtime:
    arch = amd64
    cpu_count = 8
    goroutines = 97
    max_procs = 8
    os = linux
    version = go1.12.13
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 48
    members = 3
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 18
    members = 2
    query_queue = 0
    query_time = 1
```

banks commented 4 years ago

@celesteking The error message refers to this setting, which needs to be enabled on servers for Consul Connect (our service mesh feature) to work: https://www.consul.io/docs/agent/options.html#connect_enabled

Can I ask which tutorial you were following?

I think the issue might be that the "getting started" guide is purposefully simplified and so uses an agent in -dev mode which preconfigures several things including enabling Connect.

If you can tell us a bit more about your process (which guides you followed, what you tried next), we can maybe debug this and make sure that path is clearer for others in the future.

More details about the Connect feature can be found here: https://www.consul.io/docs/connect/configuration.html

banks commented 4 years ago

Oh there is also this guide: https://learn.hashicorp.com/consul/developer-mesh/connect-production which walks through the steps/prerequisites that get you from a kick-the-tyres demo mode to a real production setup.

celesteking commented 4 years ago

I followed the dev tutorial; now I'm on to the prod tutorial: https://learn.hashicorp.com/consul/getting-started/join

I'm able to reproduce the issue by starting up the client with the following leftover config from the dev tutorial:

    $ cat /consul/config/web.json
    {"service":
      {"name": "socat",
       "port": 8181,
       "connect": { "sidecar_service": {} }
      }
    }
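
A leftover definition like the one above is easy to miss when several files sit in the config directory. The sketch below scans JSON config files for service definitions that carry a `connect.sidecar_service` stanza. It handles both the singular `service` key and the plural `services` list, ignores HCL files, and the function names are hypothetical helpers, not part of any Consul tooling.

```python
import json
from pathlib import Path


def services_in(doc):
    """Yield service definitions from a parsed config document, covering
    both the singular "service" key and the plural "services" list."""
    if isinstance(doc.get("service"), dict):
        yield doc["service"]
    for svc in doc.get("services") or []:
        if isinstance(svc, dict):
            yield svc


def sidecar_service_names(doc):
    """Return names of services in doc that request a sidecar proxy."""
    names = []
    for svc in services_in(doc):
        connect = svc.get("connect") or {}
        if isinstance(connect, dict) and "sidecar_service" in connect:
            names.append(svc.get("name"))
    return names


def scan_config_dir(config_dir):
    """Map each *.json file name in config_dir to the sidecar-requesting
    services it defines; files without any are left out."""
    hits = {}
    for path in sorted(Path(config_dir).glob("*.json")):
        try:
            doc = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or non-JSON files
        if isinstance(doc, dict):
            names = sidecar_service_names(doc)
            if names:
                hits[path.name] = names
    return hits
```

For example, `scan_config_dir("/consul/config")` would report `web.json` for the file shown above, assuming that is where the agent reads its configuration from.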

After I remove the service with consul services deregister -id=socat, it continues spitting out those messages.

What is Connect and what is a mesh? The only thing that's used above is the sidecar proxy feature. Just don't tell me you're using 3 different terms to describe 1 thing.

We don't need the sidecar proxy feature; it's irrelevant for our setup.

celesteking commented 4 years ago

Another thing I've noticed is that when you do things that are supposed to fail, they don't fail, like deregistering a nonexistent service or running consul kv del doesntexist.

You should do what redis does -- return an error (0) or ENOENT, but don't return SUCCESS.

celesteking commented 4 years ago

Also, the timing is wrong in all these docs pages. I can't be THAT stupid, but it takes me 10 minutes just to read the text and understand the pics on https://learn.hashicorp.com/consul/getting-started/services , not to mention the time needed to actually mess around with the commands and their output.

Take an average person from Poland or Finland or wherever you can find one nearby, make sure they know no redis or whatever, and ask them to follow the tutorial. Measure how long it really takes.

Another thing: the text "Because there is no web service running, you will pretend to be the web service by talking to its proxy on the port that we specified (9191)." is complete nonsense. For me to act as the web service, I would have to manually provide a listening service via socat or nc -vlp 9191. But I'm not doing that. Instead, I'm connecting to the service, which means I'm acting as a client, not a server.

Also, all this sidecar chitchat could've been explained much better in modern terms: a VPN, a TUNNEL. Every kid round the block knows what a VPN or secure tunnel is. You're essentially establishing a secure tunnel in order to tunnel the data through. What's up with that "sidecar" terminology?

Getting back to the 9191 service above: why would you even propose that the user use it? Can't that tunnel be made unidirectional, not bidirectional, so that the client log wouldn't spit out that it can't dial 9191 or 12001? (I don't remember exactly what the message was, but there WAS such a repeating message and it confused me a lot.) Just don't tell me all this sidecar thing expects services to be running on both ends and can't be unidirectional...

Anyway, just my thoughts, and I've only started... Still, all this is definitely better than the shitshow most open-source (AKA student dorm) projects provide in regards to documentation. I'd say you've gone above and beyond.

celesteking commented 4 years ago

I'm still getting the original error even after following the docs, this time when running consul connect ca get-config. In the client log:

    2019/12/06 10:45:14 [ERR] consul: "ConnectCA.ConfigurationGet" RPC failed to server 192.168.112.4:8300: rpc error making call: Connect must be enabled in order to use this endpoint

The servers were fed:

    $ cat /consul/config/server.hcl
    connect {
      enabled = true
    }
    $ consul reload

Still, no dice.

Also, it seems like there's no way to view the live server config; there's no such option. I'm not talking about cat-ing the config files, but about how the server sees its config internally, live, with defaults, etc.

Folcky commented 4 years ago

consul connect ca get-config

Hi there! I just tried enabling connect on the leader node as well, and it worked. My guess is that you have to enable connect on every node.

Sorry if this is off, it's only my second evening with Consul :)

vishaltelangre commented 4 years ago

If you are running server agents in non -dev mode using Docker (https://github.com/docker-library/docs/tree/master/consul), you can feed in an initial server configuration through the CONSUL_LOCAL_CONFIG environment variable when starting the agents.

For example:

    $ docker run \
          --name my-server1 \
          -e CONSUL_BIND_INTERFACE=eth0 \
          -e 'CONSUL_LOCAL_CONFIG={"connect": {"enabled": true}}' \
          consul agent \
              -server \
              -node my-server1 \
              -data-dir /tmp/consul \
              -join 172.17.0.7 \
              -config-dir /consul/config

The above command starts a Consul server agent in a Docker container and joins it to an existing cluster by specifying the IP (172.17.0.7) of a node in that cluster.

The contents of the CONSUL_LOCAL_CONFIG environment variable are written to /consul/config/local.json.
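
Hand-quoting JSON inside a shell argument gets error-prone once the configuration grows beyond one key. As a rough sketch, the `-e` argument can be generated from a Python dict instead. The helper name `local_config_env` is made up for illustration, and only the `connect.enabled` key comes from this thread.

```python
import json
import shlex


def local_config_env(config):
    """Build a shell-safe '-e CONSUL_LOCAL_CONFIG=...' argument for
    docker run, carrying config serialized as compact JSON."""
    payload = json.dumps(config, separators=(",", ":"))
    return "-e " + shlex.quote("CONSUL_LOCAL_CONFIG=" + payload)


if __name__ == "__main__":
    # Same setting as in the docker run example above.
    print(local_config_env({"connect": {"enabled": True}}))
```

The printed argument can be pasted into the docker run command in place of the hand-written `-e` line.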

All server agents can be restarted this way, one by one. Any server that boots with this configuration and becomes the cluster leader will enable connect and bootstrap the built-in Certificate Authority (CA).

You can verify that connect is enabled by running the following command:

    $ consul connect ca get-config

It should return the CA configuration as JSON.
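
Another way to check, which also shows how an agent sees its own configuration at runtime, is the agent's HTTP API at `/v1/agent/self`. The sketch below assumes the default HTTP address `127.0.0.1:8500` and that the response carries a `DebugConfig` object with a `ConnectEnabled` key. Both key names are written from memory and should be treated as assumptions, since the debug output is not a stable interface and can differ between versions.

```python
import json
import urllib.request


def connect_enabled(agent_self):
    """Read the Connect flag from a parsed /v1/agent/self response.
    Returns None when the assumed keys are absent."""
    debug = agent_self.get("DebugConfig") or {}
    return debug.get("ConnectEnabled")


def fetch_agent_self(base_url="http://127.0.0.1:8500"):
    """Fetch and parse /v1/agent/self from the agent at base_url
    (the default address here is an assumption about your setup)."""
    with urllib.request.urlopen(base_url + "/v1/agent/self", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print("connect enabled:", connect_enabled(fetch_agent_self()))
```

The endpoint answers for the agent you query, so each server has to be asked separately, for example by passing its own HTTP address as `base_url`.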

Peter2121 commented 2 years ago

After upgrading to v1.11.3, I hit this problem too: the list of hosts for a service is not shown correctly in the Consul UI. Is there any workaround other than enabling 'connect'?