linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

Reindex outbound policy backends when a service changes #12635

Closed adleong closed 1 month ago

adleong commented 1 month ago

If an HTTPRoute references a backend service that does not exist, the policy controller synthesizes a FailureInjector in the outbound policy so that requests to that backend fail with a 500 status code. However, we do not update the policy when backend services are created or deleted. This can leave an outbound policy synthesizing 500s for a backend that now exists, or routing to a backend that no longer exists.

This is often papered over: when a backend service is created or deleted, the HTTPRoute's ResolvedRefs status condition changes, which causes a reindex of the HTTPRoute and a recomputation of its backends. However, depending on the order in which these updates are processed, the outbound policy can still be left with incorrect backend state.

To update the backends of an outbound policy when backend services are created or deleted, we change the way these backends are represented in the index. Previously, a backend that referenced a nonexistent service was represented as Backend::Invalid. However, this discards the backend information needed to recreate the backend if the service is later created. Instead, we now represent these backends as Backend::Service with a new field, exists, set to false. This allows us to update the field as backend services are created or deleted.
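
As a rough illustration of the representation change described above, the following is a minimal, self-contained sketch and not the policy controller's actual code. The `Index` struct, its method names, and the shape of the `Backend` variants are all hypothetical simplifications; only the idea of a `Backend::Service` carrying an `exists` flag comes from the description.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for the index's backend type.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Backend {
    /// A Service backend. The reference is kept even when the Service is
    /// missing, so the backend can be restored once the Service appears.
    Service { name: String, port: u16, exists: bool },
    /// A backend that cannot be resolved at all (assumed variant shape).
    Invalid { message: String },
}

/// Hypothetical index holding known Services and the backends of each route.
#[derive(Default)]
pub struct Index {
    services: HashSet<String>,
    routes: HashMap<String, Vec<Backend>>,
}

impl Index {
    /// Index a route, recording whether each referenced Service exists now.
    pub fn apply_route(&mut self, route: &str, refs: &[(&str, u16)]) {
        let backends: Vec<Backend> = refs
            .iter()
            .map(|(name, port)| Backend::Service {
                name: name.to_string(),
                port: *port,
                exists: self.services.contains(*name),
            })
            .collect();
        self.routes.insert(route.to_string(), backends);
    }

    /// A Service was created: flip `exists` on every backend that names it.
    pub fn apply_service(&mut self, name: &str) {
        self.services.insert(name.to_string());
        self.reindex_backends(name, true);
    }

    /// A Service was deleted: mark every backend that names it as missing.
    pub fn delete_service(&mut self, name: &str) {
        self.services.remove(name);
        self.reindex_backends(name, false);
    }

    fn reindex_backends(&mut self, service: &str, now_exists: bool) {
        for backends in self.routes.values_mut() {
            for backend in backends.iter_mut() {
                if let Backend::Service { name, exists, .. } = backend {
                    if name.as_str() == service {
                        *exists = now_exists;
                    }
                }
            }
        }
    }

    /// Names of backends on `route` whose Service is currently missing,
    /// i.e. the ones for which failures would be synthesized.
    pub fn missing_backends(&self, route: &str) -> Vec<String> {
        self.routes
            .get(route)
            .into_iter()
            .flatten()
            .filter_map(|b| match b {
                Backend::Service { name, exists: false, .. } => Some(name.clone()),
                _ => None,
            })
            .collect()
    }
}
```

In this sketch, a route indexed before its Service exists reports that backend as missing, and a later `apply_service` call clears it without the route itself being reindexed. Because the Service name and port are retained, the result does not depend on whether the route update or the Service update is processed first.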