Open bnycohoe opened 1 year ago
I can confirm this issue is reproducible. I reproduced it on consul v1.13.2+ent | consul-k8s v0.49.0 -> v1.13.9+ent | v0.49.8 -> v1.14.7 | v1.0.5.
The server-acl-init logs show the policy being updated:
# server-acl-init logs
2023-09-25T20:38:04.308Z [INFO] Policy "anonymous-token-policy" already exists, updating
2023-09-25T20:38:04.401Z [INFO] Success: creating anonymous token policy - PUT /v1/acl/policy
2023-09-25T20:38:04.403Z [INFO] Success: updating anonymous token with policy
Overview of the Issue
An upgrade of consul-k8s from 0.26.0 to 1.1.6 in my primary datacenter caused the anonymous token to lose the custom policy we had linked to it (in our case called Anonymous). After the upgrade, the only policy linked to the anonymous token was the anonymous-token-policy created by the server-acl-init process. This caused an outage for certain customers of ours because our tooling relies on anonymous privileges for KV reads that we had granted to the anonymous token via our policy.
Upgrades to a deployment that already contains an anonymous-token-policy will skip altering the token policies as of consul-k8s 1.1.4, thanks to an existence check added in https://github.com/hashicorp/consul-k8s/pull/2790. Based on the code in https://github.com/hashicorp/consul-k8s/blob/2feff9f2cb36f4ee818874c4d657ede2acbc074a/control-plane/subcommand/server-acl-init/anonymous_token.go#L49, any policies linked to the anonymous token will not be persisted through an upgrade: if the managed token policy does not already exist, they are replaced with only the managed policy.
I believe this is undesirable behavior because user configuration data is thrown away (the linked policies they had configured prior to upgrade). Note that the policies themselves will still exist, and re-linking them is trivial to accomplish, but it requires manual intervention.
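The behavior described above can be modeled roughly as follows. This is a simplified Python sketch based only on the description in this issue, not the actual Go code in server-acl-init; the function name and the representation of policies as plain name lists are illustrative assumptions.

```python
MANAGED_POLICY = "anonymous-token-policy"


def upgrade_anonymous_token(existing_policies, linked_policies):
    """Model the upgrade behavior described in this issue.

    existing_policies: names of all ACL policies present in the datacenter.
    linked_policies: names of policies currently linked to the anonymous token.
    Returns the policy names linked to the anonymous token after the upgrade.
    """
    if MANAGED_POLICY in existing_policies:
        # Existence check: the managed policy is already present, so the
        # token's links are left untouched.
        return list(linked_policies)
    # The managed policy is created and becomes the token's only link;
    # previously linked policies are dropped from the token.
    return [MANAGED_POLICY]


# Upgrade from a version that never created the managed policy:
# the custom link is lost.
print(upgrade_anonymous_token({"Anonymous"}, ["Anonymous"]))
# -> ['anonymous-token-policy']
```

In this model the custom policy itself still exists afterwards; only its link to the anonymous token is gone, which matches what we saw.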
Reproduction Steps
Upgrade consul-k8s from a very old version (something <~0.49.0) to a new version (such as >=1.1.6) in a primary datacenter.
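To see the effect, it helps to record the anonymous token's linked policies before and after the upgrade. Below is a minimal Python sketch that reads the token through Consul's ACL HTTP API (GET /v1/acl/token/<accessor-id>, using the anonymous token's well-known accessor ID). The helper names, the Consul address, and the ACL token passed in are assumptions for illustration.

```python
import json
import urllib.request

# Well-known accessor ID of Consul's anonymous token.
ANONYMOUS_ACCESSOR_ID = "00000000-0000-0000-0000-000000000002"


def linked_policy_names(token):
    """Extract sorted policy names from a decoded ACL token read response."""
    # "Policies" may be absent or null when nothing is linked.
    return sorted(p["Name"] for p in (token.get("Policies") or []))


def read_anonymous_token(consul_addr, acl_token):
    """Read the anonymous token via GET /v1/acl/token/<accessor-id>."""
    req = urllib.request.Request(
        f"{consul_addr}/v1/acl/token/{ANONYMOUS_ACCESSOR_ID}",
        headers={"X-Consul-Token": acl_token},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def lost_links(before, after):
    """Policy names that were linked before the upgrade but not after it."""
    return sorted(set(before) - set(after))
```

For example, calling linked_policy_names(read_anonymous_token("http://127.0.0.1:8500", my_token)) once before and once after the upgrade, then passing both results to lost_links, would list the policies that need to be re-linked by hand. The address and my_token are placeholders, and the token needs permission to read ACLs.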
Logs
Logs from my server-acl-init run are not available.
Expected behavior
Any user-defined token policies that are linked to well-known tokens (specifically the anonymous token) should remain linked through an upgrade.
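One way to express this expectation is as a merge rather than a replacement: the managed policy is added to whatever is already linked. A minimal Python sketch, where link_managed_policy is a hypothetical helper and not part of consul-k8s:

```python
def link_managed_policy(linked_policies, managed_policy="anonymous-token-policy"):
    """Return the token's policy links with the managed policy added.

    Existing links keep their order; the managed policy is appended only
    if it is not already linked, so repeated upgrades are idempotent.
    """
    merged = list(linked_policies)
    if managed_policy not in merged:
        merged.append(managed_policy)
    return merged


# A user-defined link survives the upgrade alongside the managed policy.
print(link_managed_policy(["Anonymous"]))
# -> ['Anonymous', 'anonymous-token-policy']
```

With this approach, user-defined links are never removed, and running the step again on an already-upgraded token changes nothing.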
Environment details
Old Consul-K8s version: 0.26.0
New Consul-K8s version: 1.1.6
Kubernetes version: 1.27.3
Consul Server version: 1.15.2-ent
Relevant values:
Additional Context
In the 1.2.1 breaking changes there is mention that all policies managed by consul-k8s will now be updated on upgrade. This is not true after the implementation of https://github.com/hashicorp/consul-k8s/pull/2790: an existing anonymous-token-policy will not be updated on upgrade. Notes in the documentation should reflect this.
Also, the GH-2790 improvement notes in the changelog do not appear for any 1.1.x release. They exist only for 1.0.10 and 1.2.1, neither of which is in a minor version line that would have applied to me. 1.1.4 appears to be the first version with the backported change. Knowing there was a change in behavior with 1.1.4+ would likely have led to quicker resolution of my original problem.