linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

Handling of ServerSpec collision errors #10447

Closed: ripta closed this issue 1 year ago

ripta commented 1 year ago

What is the issue?

This relates to a feature added in #8076, which among other things started validating spec.podSelector to ensure that the selector on a new Server does not overlap with the pod selectors of existing Server resources.

Unfortunately, this check trips when users rename a Server resource. Because a rename looks to Kubernetes like a create-and-delete pair of operations, the resource with the old name (which still exists in the cluster) and the resource with the new name (being submitted for creation) end up with overlapping selectors.

Assuming I haven't missed any special flags, the problem is compounded by the fact that some tooling does not prune until after creation succeeds, while creation fails precisely because the old resource has not yet been pruned.
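For illustration, the ordering problem with a kubectl-based pruning flow looks roughly like this (a sketch under my assumptions about tool behavior; the manifests/ directory and the label selector are illustrative):

# Server "http" already exists in the cluster; the manifests now only contain
# the renamed Server with the same podSelector.
kubectl apply --prune -l app.kubernetes.io/name=web -f manifests/
# 1. the apply first tries to create the renamed Server
#    -> the policy validator rejects it, because Server "http" has an identical selector
# 2. the prune that would delete Server "http" never happens, because
#    (per the behavior described above) pruning only runs after creation succeeds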

How can it be reproduced?

Create a new Server resource, e.g.:

apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: http
  labels:
    app.kubernetes.io/name: web
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: web
  port: http
  proxyProtocol: HTTP/1

Apply the resource. Then change the resource name (the metadata.name field) to something new, e.g., web, apply it again, and you'll get an error.
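Spelled out as commands, the reproduction looks roughly like this (assuming the manifest above is saved as server.yaml; the file name is illustrative):

# create the Server named "http"
kubectl apply -f server.yaml

# edit server.yaml so that metadata.name is "web", leave spec unchanged, re-apply
kubectl apply -f server.yaml
# -> rejected by the linkerd-policy-validator webhook, because Server "http"
#    still exists in the cluster with an identical podSelector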

Logs, error output, etc

The error is:

could not create object: admission webhook "linkerd-policy-validator.linkerd.io" denied the request: identical server spec already exists

output of linkerd check -o short

N/A

Environment

linkerd 2.12.3

Possible solution

We currently have to manually delete the resource with the old name before doing any resource creation, update, or apply. It was also not always obvious to users which resources were colliding, but #10187 seems to have alleviated that.

With GitOps-based controls, where users might not have write access to production systems, this sometimes requires intervention from cluster operators. With dozens of clusters and hundreds of microservices, this doesn't scale well. We could try to automate the deletion, but deleting before creating means there could be downtime, because for a window no Server resource targets that set of pods. A sketch of the manual workaround follows below.
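For reference, the manual workaround as we run it today looks roughly like this (assuming the renamed manifest is saved as server.yaml; note the window in which no Server selects the pods):

# delete the Server with the old name first
kubectl delete servers.policy.linkerd.io http
# ...during this window no Server targets the pods, which is where downtime can occur...
kubectl apply -f server.yaml
# the webhook no longer sees an overlapping selector, so the renamed Server is created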

Another possible solution is to add a flag that allows the cluster administrator to disable pod selector overlap validation. A big question I haven't considered here (and don't know the answer to) is what the repercussions of overlapping selectors are, e.g., whether selector overlap causes undefined behavior in the proxy.

Additional context

No response

Would you like to work on fixing this bug?

None

jeremychase commented 1 year ago

@ripta Thank you for the detailed report!

The admission webhook blocking the resource renaming described above will be removed or relaxed as we continue to improve the Status fields on Linkerd policy resources. We have work in flight to address updating Statuses, and we are keeping this issue open to track resolving the bug you have identified.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.