hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

ACL is partially bypassed: Clients without anonymous token can join the cluster and read members in the cluster #21929

Open sharkzeeh opened 2 weeks ago

sharkzeeh commented 2 weeks ago

Overview of the Issue

Hello, sirs! I have been trying to configure a Consul Cluster with ACL deny policy, but I noticed some weird behaviour: anonymous clients with anonymous token (which is set by default) were able to join and view members of the cluster.

P.S. I have also read this similar archived issue and tried the workaround: I enabled the gossip encryption and indeed I was not able to join without setting encrypt: XXXXX in the client's config (see error below)

$ consul join 10.26.29.28
Error joining address '10.26.29.28': Unexpected response code: 500 (1 error occurred:
        * Failed to join 10.26.29.28:8301: Remote state is encrypted and encryption is not configured)
Failed to join any nodes.

After having set encrypt: XXXXX on the client side, I was able to join and check cluster members ~with anonymous token~ again.

NOTE: the following reproduction steps did not include gossip encryption


Reproduction Steps

  1. Create 1 server on a VM (10.26.29.28)
  2. Run consul acl bootstrap and set the bootstrap token as agent token
  3. Create 1 client on a different VM (10.26.86.30)
  4. Join the server with consul join
  5. Check members on the server and client side with anonymous token (Disclaimer: commands join and members on the client side will succeed)
# server
$ echo $CONSUL_HTTP_TOKEN

$ consul members
$ consul catalog nodes
$ consul info
Error querying agent: Unexpected response code: 403 (Permission denied: anonymous token lacks permission 'agent:read' on "consul-test-02". The anonymous token is used implicitly when a request does not specify a token.)

## set CONSUL TOKEN
$ export CONSUL_HTTP_TOKEN=<bootstrap-token>
$ consul catalog services
consul
$ consul catalog nodes
Node            ID        Address      DC
consul-test-02  4d108e18  10.26.29.28  m6
web-server-01   57a24a79  10.26.86.30  m6
$ consul members
Node            Address           Status  Type    Build   Protocol  DC  Partition  Segment
consul-test-02  10.26.29.28:8301  alive   server  1.19.2  2         m6  default    <all>
web-server-01   10.26.86.30:8301  alive   client  1.19.2  2         m6  default    <default>

# client
$ echo $CONSUL_HTTP_TOKEN

$ consul join 10.26.29.28
Successfully joined cluster by contacting 1 nodes.
$ consul members
Node            Address           Status  Type    Build   Protocol  DC  Partition  Segment
consul-test-02  10.26.29.28:8301  alive   server  1.19.2  2         m6  default    <all>
web-server-01   10.26.86.30:8301  alive   client  1.19.2  2         m6  default    <default>

NOTE: the client log constantly reports "token lacks permission" errors, which is expected since the client uses the anonymous token

2024-11-06T21:28:07.413+0300 [WARN]  agent: Coordinate update blocked by ACLs: accessorID=""
2024-11-06T21:28:29.555+0300 [ERROR] agent.client: RPC failed to server: method=Coordinate.Update server=10.26.29.28:8300 error="rpc error making call: Permission denied: anonymous token lacks permission 'node:write' on \"web-server-01\". The anonymous token is used implicitly when a request does not specify a token."

I tried to register a service from the client agent (with anonymous token) and I got the error token lacks permission service:write (which is what I expected). See service.json

{
    "service": {
        "id": "web-server-01",
        "name": "frontend",
        "tags": ["v0.1", "dev"],
        "port": 80
   }
}
# client
$ consul catalog register service.json

2024-11-06T21:32:25.918+0300 [WARN]  agent: Coordinate update blocked by ACLs: accessorID=""
2024-11-06T21:32:41.184+0300 [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.26.29.28:8300 error="rpc error making call: Permission denied: anonymous token lacks permission 'service:write' on \"frontend\". The anonymous token is used implicitly when a request does not specify a token."

$ consul catalog services
consul

$ export CONSUL_HTTP_TOKEN=<bootstrap-token>
$ consul catalog register service.json
Registered service frontend
$ consul catalog services
consul
frontend

So, as far as I see, there are only two commands consul join and consul members that work on the client agent with anonymous token - other commands would require the agent token for the client. It is strange since I cannot list members with anonymous token on the server side, for example.

Consul info for both Client and Server

Client info

```text
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 048f1936
    version = 1.19.2
    version_metadata =
consul:
    acl = disabled
    known_servers = 1
    server = false
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 49
    max_procs = 4
    os = linux
    version = go1.22.5
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1674
    members = 2
    query_queue = 0
    query_time = 1
```

Client agent HCL config

```json
{
    "server": false,
    "node_name": "web-server-01",
    "datacenter": "m6",
    "data_dir": "/home/user/consul/client_data"
}
```

Server info

```sh
$ consul info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 048f1936
    version = 1.19.2
    version_metadata =
consul:
    acl = enabled
    bootstrap = true
    known_datacenters = 1
    leader = true
    leader_addr = 10.26.29.28:8300
    server = true
raft:
    applied_index = 3995
    commit_index = 3995
    fsm_pending = 0
    last_contact = 0
    last_log_index = 3995
    last_log_term = 3
    last_snapshot_index = 0
    last_snapshot_term = 0
    latest_configuration = [{Suffrage:Voter ID:4d108e18-1f60-1121-0466-bccb13e2dcc5 Address:10.26.29.28:8300}]
    latest_configuration_index = 0
    num_peers = 0
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 3
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 198
    max_procs = 4
    os = linux
    version = go1.22.5
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 1674
    members = 2
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 25
    members = 1
    query_queue = 0
    query_time = 1
```

Server agent HCL config

```json
{
    "log_level": "INFO",
    "server": true,
    "ui_config": {
        "enabled": true
    },
    "node_name": "consul-test-02",
    "bootstrap_expect": 1,
    "leave_on_terminate": true,
    "datacenter": "m6",
    "data_dir": "/home/user/consul/data",
    "client_addr": "0.0.0.0",
    "bind_addr": "10.26.29.28 ",
    "advertise_addr": "10.26.29.28 ",
    "enable_syslog": true,
    "acl": {
        "enabled": true,
        "default_policy": "deny",
        "down_policy": "extend-cache",
        "enable_token_persistence": true,
        "tokens": {
            "initial_management": "",
            "agent": "",
        }
    },
    "performance": {
        "raft_multiplier": 1
    }
}
```
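One detail stands out in the output above: the client's consul info reports acl = disabled and the client config has no acl block, while the server reports acl = enabled. A minimal sketch of a client config that mirrors the server's acl stanza follows, assuming the goal is for the client agent to enforce ACLs on its own HTTP API as well. The token value is a placeholder, and whether this accounts for the join and members behaviour described in this issue is an assumption, not something confirmed in the thread.

```json
{
    "server": false,
    "node_name": "web-server-01",
    "datacenter": "m6",
    "data_dir": "/home/user/consul/client_data",
    "acl": {
        "enabled": true,
        "default_policy": "deny",
        "enable_token_persistence": true,
        "tokens": {
            "agent": "<client-agent-token>"
        }
    }
}
```

After restarting the client with a config like this, rerunning consul info on the client and checking the acl line would show whether the setting took effect.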

Operating system and Environment details

Consul version: 1.19.2

Ubuntu 22.04 LTS both for the client and server

Log Fragments

Client side logs after joining the server

2024-11-06T21:28:07.413+0300 [WARN]  agent: Coordinate update blocked by ACLs: accessorID=""
2024-11-06T21:28:29.555+0300 [ERROR] agent.client: RPC failed to server: method=Coordinate.Update server=10.26.29.28:8300 error="rpc error making call: Permission denied: anonymous token lacks permission 'node:write' on \"web-server-01\". The anonymous token is used implicitly when a request does not specify a token."

Client side logs after trying to register a service from the client

2024-11-06T21:32:25.918+0300 [WARN]  agent: Coordinate update blocked by ACLs: accessorID=""
2024-11-06T21:32:41.184+0300 [ERROR] agent.client: RPC failed to server: method=Catalog.Register server=10.26.29.28:8300 error="rpc error making call: Permission denied: anonymous token lacks permission 'service:write' on \"frontend\". The anonymous token is used implicitly when a request does not specify a token."
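When comparing which calls are denied and which are not, it can help to pull the permission and resource out of each denial message. The following is a minimal sketch that assumes the message wording shown in the log fragments above; the function name extract_denials is hypothetical and not part of Consul.

```python
import re

# Assumed message shape, taken from the log fragments in this issue:
#   ... lacks permission 'node:write' on \"web-server-01\" ...
# The backslash before each quote is optional, because the CLI output
# (as opposed to the agent log) does not escape the quotes.
DENIAL_RE = re.compile(r"lacks permission '([^']+)' on \\?\"([^\"\\]+)\\?\"")


def extract_denials(log_text):
    """Return (permission, resource) pairs found in ACL denial messages."""
    return DENIAL_RE.findall(log_text)


if __name__ == "__main__":
    # Sample line modelled on the Coordinate.Update error above.
    sample = (
        "error=\"rpc error making call: Permission denied: anonymous token "
        "lacks permission 'node:write' on \\\"web-server-01\\\".\""
    )
    for permission, resource in extract_denials(sample):
        print(permission, resource)
```

Running this over the client log lists node:write and service:write denials. Both are raised by RPCs to the server (Coordinate.Update and Catalog.Register), which fits the observation that only join and members succeed on the client.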
ruatag commented 2 weeks ago

> After having set encrypt: XXXXX on the client side, I was able to join and check cluster members with anonymous token again.

Once the encrypt key is provided, the clients are not anonymous anymore. mTLS is another level of protection, as was mentioned in the similar issue.

sharkzeeh commented 2 weeks ago

@ruatag, hello! Yes, you're totally right. However, my question is whether it is expected that one can join and view members of the cluster (with ACL deny policy) from the client agent (with or without gossip encryption enabled) without passing an agent token.

Another note: in both cases (with or without gossip encryption) the command consul members requires agent token on the server side

ruatag commented 2 weeks ago

I can't reproduce your problem with 1.20.1, and I also tried 1.19.1:

$ consul join -http-addr=172.17.0.1:8500 consul2
Error joining address 'consul2': Unexpected response code: 403 (Permission denied: anonymous token lacks permission 'agent:write' on "consul1". The anonymous token is used implicitly when a request does not specify a token.)

and consul members -http-addr=172.17.0.1:8500 produces an empty result with exit code 2.

Probably a policy/role is associated with the anonymous token.

sharkzeeh commented 2 weeks ago

Thank you for the feedback! Just to be sure: you did not enable gossip encryption or anything extra, correct?

The policies that I have

$ consul acl policy list
builtin/global-read-only:
   ID:           00000000-0000-0000-0000-000000000002
   Description:  A built-in policy that grants read-only access to all Consul features
   Datacenters:
global-management:
   ID:           00000000-0000-0000-0000-000000000001
   Description:  A built-in policy that grants read and write access to all Consul features
   Datacenters:

and here are the rules for the anonymous policy

$ consul acl policy read -id 00000000-0000-0000-0000-000000000002
ID:           00000000-0000-0000-0000-000000000002
Name:         builtin/global-read-only
Description:  A built-in policy that grants read-only access to all Consul features
Datacenters:
Rules:

acl = "read"
agent_prefix "" {
        policy = "read"
}
event_prefix "" {
        policy = "read"
}
identity_prefix "" {
        policy = "read"
        intentions = "read"
}
key_prefix "" {
        policy = "read"
}
keyring = "read"
node_prefix "" {
        policy = "read"
}
operator = "read"
mesh = "read"
peering = "read"
query_prefix "" {
        policy = "read"
}
service_prefix "" {
        policy = "read"
        intentions = "read"
}
session_prefix "" {
        policy = "read"
}
ruatag commented 2 weeks ago

A gossip encryption key gives a different error. It is better to check the policies/roles associated with the token:

$ consul acl token read -id=00000000-0000-0000-0000-000000000002

sharkzeeh commented 2 weeks ago

This is the output of the command

$ consul acl token read -id=00000000-0000-0000-0000-000000000002
Use the -accessor-id parameter to specify token by Accessor ID
AccessorID:       00000000-0000-0000-0000-000000000002
SecretID:         anonymous
Description:      Anonymous Token
Local:            false
Create Time:      2024-11-06 11:34:55.264928289 +0300 MSK

No policies associated with the token?

sharkzeeh commented 1 week ago

@ruatag, good morning, sir! Could you please check my previous comment?