hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Parameter enforce_hostnames in leaf-cert role of Intermediate CA from Vault PKI Secret Engine has been overwritten on consul upgrade from 1.12.4 to 1.12.5 #15149

Closed manojrkrish closed 1 year ago

manojrkrish commented 1 year ago

Overview of the issue

When Consul is upgraded from 1.12.4 to 1.12.5, the enforce_hostnames setting in the leaf-cert role of the Vault PKI Secret Engine is always overwritten to yes. This causes our ingress gateway to fail, because it relies on enforce_hostnames being set to no; see the earlier ingress gateway issue at https://github.com/hashicorp/consul/issues/11092.

The overwrite appears to happen whenever the Consul leader node changes, either during the upgrade or after it.

Reproduction Steps

  1. Create a Vault (1.9.8) cluster and a Consul (1.12.4) cluster
  2. Enable the PKI secret engine
  3. Set up Vault as the provider for the Consul Connect CA
  4. Set up the Root CA and generate the Intermediate CA certificates
  5. Set enforce_hostnames to no in the leaf-cert role of the Intermediate CA backend for Consul Connect
  6. Upgrade Consul to version 1.12.5
  7. Restart the Consul leader agent so that a new leader gets elected
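
One way to confirm the overwrite is to read the leaf-cert role from Vault before step 6 and again after step 7 (for example through Vault's HTTP API at `/v1/<intermediate mount>/roles/leaf-cert`, where the mount path depends on your CA configuration) and compare the `enforce_hostnames` field. The sketch below only parses two response bodies. The sample JSON is hand-written and trimmed to the one field of interest, and the function names are hypothetical.

```python
import json


def role_enforces_hostnames(response_body: str) -> bool:
    """Return the enforce_hostnames value from a Vault PKI role read response."""
    payload = json.loads(response_body)
    # Vault wraps the role's fields in a top-level "data" object.
    return bool(payload["data"]["enforce_hostnames"])


def was_overwritten(before_body: str, after_body: str) -> bool:
    """True if enforce_hostnames was false before and is true after."""
    return (not role_enforces_hostnames(before_body)
            and role_enforces_hostnames(after_body))


# Hand-written sample bodies (assumption: real responses carry many more fields).
before = '{"data": {"enforce_hostnames": false}}'
after = '{"data": {"enforce_hostnames": true}}'
print(was_overwritten(before, after))
```

Fetching the two bodies is left out on purpose, since the Vault address, token, and mount path are specific to each environment.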

Consul info for both Client and Server

Consul info
```
agent:
	check_monitors = 1
	check_ttls = 1
	checks = 6
	services = 6
build:
	prerelease =
	revision = 778b5eaa
	version = 1.12.5
	version_metadata =
consul:
	acl = enabled
	bootstrap = false
	known_datacenters = 1
	leader = false
	leader_addr = 10.197.99.16:8300
	server = true
raft:
	applied_index = 2596
	commit_index = 2596
	fsm_pending = 0
	last_contact = 78.224803ms
	last_log_index = 2596
	last_log_term = 9
	last_snapshot_index = 0
	last_snapshot_term = 0
	latest_configuration = [{Suffrage:Voter ID:9f023c02-bb4b-b478-67ec-9637505bb4c4 Address:10.197.99.15:8300} {Suffrage:Voter ID:0c9ebdd8-d101-fde8-d795-ad5c80f80a98 Address:10.197.99.16:8300} {Suffrage:Voter ID:89167090-46f2-b333-a16c-9c7316894035 Address:10.197.99.17:8300}]
	latest_configuration_index = 0
	num_peers = 2
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Follower
	term = 9
runtime:
	arch = amd64
	cpu_count = 8
	goroutines = 135
	max_procs = 8
	os = linux
	version = go1.18.1
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 9
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 15
	members = 4
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 6
	members = 3
	query_queue = 0
	query_time = 1
```
Vault info
```
Key                     Value
---                     -----
Seal Type               shamir
Initialized             true
Sealed                  false
Total Shares            5
Threshold               3
Version                 1.9.8
Storage Type            raft
Cluster Name            vault-cluster-1902f7c6
Cluster ID              d745742a-3bc3-90a7-3de2-19124d442241
HA Enabled              true
HA Cluster              https://10.197.99.15:8201
HA Mode                 active
Active Since            2022-10-25T09:43:35.547295353Z
Raft Committed Index    253
Raft Applied Index      253
```

Operating system and Environment details

uname -a
```
Linux cp-test-cluster-csl3 4.19.232-2.ph3 #1-photon SMP Sat Mar 12 02:19:30 UTC 2022 x86_64 GNU/Linux
```
consul members
```
Node                  Address            Status  Type    Build   Protocol  DC               Partition  Segment
cp-test-cluster-csl1  10.197.99.17:8301  alive   server  1.12.5  2         cp-test-cluster  default
cp-test-cluster-csl2  10.197.99.16:8301  alive   server  1.12.5  2         cp-test-cluster  default
cp-test-cluster-csl3  10.197.99.15:8301  alive   server  1.12.5  2         cp-test-cluster  default
cp-test-cluster-gcl1  10.197.99.1:8301   alive   client  1.12.5  2         cp-test-cluster  default
```
consul operator raft list-peers
```
Node                  ID                                    Address            State     Voter  RaftProtocol
cp-test-cluster-csl2  0c9ebdd8-d101-fde8-d795-ad5c80f80a98  10.197.99.16:8300  leader    true   3
cp-test-cluster-csl3  9f023c02-bb4b-b478-67ec-9637505bb4c4  10.197.99.15:8300  follower  true   3
cp-test-cluster-csl1  89167090-46f2-b333-a16c-9c7316894035  10.197.99.17:8300  follower  true   3
```
consul version
```
Consul v1.12.5
Revision 778b5eaa
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)
```

Cause

This issue seems to have been introduced by this change: https://github.com/hashicorp/consul/pull/14516/files#diff-7bf30611a760296e2e6ffcd7fe955b0eb8317fe0a43c470b8b52d311af4a2515. With this change, the leaf-cert role is now overwritten every time setupIntermediatePKIPath() is called. Because enforce_hostnames defaults to true, the value we had set is overwritten.
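
The effect can be modelled with a small sketch, assuming a role write replaces the whole role and any field the writer does not send falls back to Vault's default. This is not Consul's code: `write_role` is a hypothetical stand-in for the Vault role write, and the two "managed" parameters are illustrative, not the actual set Consul sends.

```python
# Vault documents true as the default for enforce_hostnames on PKI roles.
ROLE_DEFAULTS = {"enforce_hostnames": True}

# Illustrative only: not the actual parameter set Consul writes.
MANAGED_BY_CONSUL = {"allow_any_name": True, "require_cn": False}


def write_role(params: dict) -> dict:
    """Model a full role write: fields missing from params get defaults."""
    role = dict(ROLE_DEFAULTS)
    role.update(params)
    return role


# The operator sets enforce_hostnames to no (step 5 of the reproduction).
role = write_role({**MANAGED_BY_CONSUL, "enforce_hostnames": False})

# A later write that sends only the managed fields resets the value.
role_after_leader_change = write_role(MANAGED_BY_CONSUL)

print(role["enforce_hostnames"], role_after_leader_change["enforce_hostnames"])
```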

Log Fragments

Log Gist

jkirschner-hashicorp commented 1 year ago

Hi @manojrkrish,

We recently merged a fix to #11092 that will be included in the next set of Consul patch releases (for 1.12, 1.13, and 1.14).

I understand that modifying enforce_hostnames on the leaf cert role was a workaround you were using because of #11092.

Consul needs control over the configuration of some options on the leaf cert role, though enforce_hostnames is not one of them. My understanding is that Consul currently assumes it has full control over the configuration of the leaf cert role and can update it at will.

I'm interested in your perspective on whether this issue (#15149) should be closed due to the closure of #11092, or whether you are separately requesting that we leave it open to consider enabling a Vault operator to modify some leaf role configuration despite Consul's need to control (parts of) it.
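
To make the second option concrete, here is a rough sketch of what "control parts of it" could look like: read the existing role, apply only the fields Consul needs, and keep everything else as the operator left it. This illustrates the idea only and is not a description of how Consul behaves or will behave. `merge_role` and the sample field values are hypothetical.

```python
def merge_role(existing: dict, managed: dict) -> dict:
    """Keep operator-set fields; let the managed fields win on conflict."""
    merged = dict(existing)   # start from what is already on the role
    merged.update(managed)    # then apply only the fields Consul must control
    return merged


# Hypothetical current role: the operator has set enforce_hostnames to no.
existing_role = {"enforce_hostnames": False, "allow_any_name": False}

# Illustrative managed field, not the actual set Consul writes.
managed_fields = {"allow_any_name": True}

merged_role = merge_role(existing_role, managed_fields)
print(merged_role)
```

With this approach the operator's enforce_hostnames value survives, while any field listed in `managed_fields` is still forced to the value Consul requires.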

manojrkrish commented 1 year ago

Hi @jkirschner-hashicorp ,

Thanks for addressing #11092. This issue can be closed, as changing enforce_hostnames was only a workaround for the earlier issue #11092.