weaveworks / pipeline-controller

This repository contains the Pipeline CRD and associated controller for Weave GitOps Enterprise.

Design for polling-first architecture #191

Closed squaremo closed 1 year ago

squaremo commented 1 year ago

The initiative https://www.notion.so/weaveworks/Simplify-Pipelines-to-improve-the-user-experience-and-enable-scalability-28eef20db2ea4e9bb72a596b8a99a899#c9a35a2c55114d0e802a9fb49e8215cd describes several problems (problems 1 and 5) that relying on webhook notifications gives rise to:

All this adds up to: we need to implement polling, and treat webhook invocations as a trigger to poll the resource in question immediately. I think this is complicated enough that it's worth writing out a design.

There's previous work in this direction: https://github.com/weaveworks/pipeline-controller/issues/179 and PR https://github.com/weaveworks/pipeline-controller/pull/180. To recap here, this implementation:

I think this approach is flawed on these counts:

- it relies on labels on the downstream objects for dispatch, and these could be changed
- (it looks to me like) it unconditionally watches every HelmRelease in every cluster, which seems like it could be a lot of unnecessary work

Instead, I suggest we should start with the pseudo-algorithm:

for each Pipeline
  for each Environment
    for each Target
      get a downstream cluster client if necessary
      retrieve the app status
  for each Environment[1:]
    calculate whether a promotion is indicated, and if so, invoke it

... then consider optimisations from there. For example, cluster-api has a client cache which could be used to make HelmRelease (and other "app" object) lookups less costly.

Part of the design should be a mechanism for webhooks to trigger polls, so that it's still possible to make the system more responsive with notifications.

It may also be possible to address problem 7 from the initiative, since polling will have more scope to calculate success from the whole status, and not just what it's told in a notification.

luizbafilho commented 1 year ago
it relies on labels on the downstream objects for dispatch, and these could be changed

The reconciler will reset the labels if they change. Relying on labels is just common practice; you can break a Service by changing its labels, for instance. So I'm not sure that's a real flaw in the k8s world.

(it looks to me like) it unconditionally watches every HelmRelease in every cluster, which seems it could be a lot of unnecessary work

To avoid this, it only watches HelmReleases with a label indicating the pipeline is enabled.

cluster-api has a client cache 

Could you link that? I'd like to take a look.

squaremo commented 1 year ago

To avoid [watching every HelmRelease], it only watches for HelmReleases with a label indicating pipeline enable.

This looks like "watch every HelmRelease" to me: https://github.com/weaveworks/pipeline-controller/pull/180/files#diff-29ab67ad2861533ba7cfc10ed9d527f12a57b27cae23c566d373d82027c8f4c6R84

It's kind of moot, though, because using a cache for each cluster as I suggest would result in at least namespace-scoped watches too. I think it's a bigger problem that it unconditionally starts watches for every cluster it can see, rather than just those involved in Pipelines.

The reconcile will set the labels if those change.

The pipeline reconciler sets them, and (it looks to me like) it'll only do so every interval. If you update the labels on the HelmRelease yourself, on the other hand, I suspect it'll fire the watch and the informer will act on it straight away (with bogus labels). Or if not that, there's a window of however many minutes during which an update could happen, with consequences. (Side note -- I'm not sure the controller removes labels when a pipeline changes.)

More generally, if we can avoid dispatching on information from a downstream cluster that can be changed by users, we should -- because it represents a possible exploit.

Relying on labels is just a common practice. You can break a service by changing labels, for instance. So, I'm not sure that's a real flaw in k8s world.

To this wider point: labels are certainly used to link dependent objects in a decoupled way, under the control of the user (Service falls into this category). Here and there they are used by controllers when there's no other way; e.g., when an ownerRef would be inappropriate. But here they are an unreliable mechanism, and I don't think we need them -- we can use indexing.

Could you link [to cluster client cache]? I'd like to take a look.

Here it is: https://github.com/kubernetes-sigs/cluster-api/blob/main/controllers/remote/cluster_cache_tracker.go

It is a public API, but I don't think it's a reliably public API, since it's deep down in a controller package. Also: it doesn't export a method for removing a downstream cluster cache. So it's not ideal; on the other hand, it'd be pretty simple to adapt.

luizbafilho commented 1 year ago

This looks like "watch every HelmRelease" to me: https://github.com/weaveworks/pipeline-controller/pull/180/files#diff-29ab67ad2861533ba7cfc10ed9d527f12a57b27cae23c566d373d82027c8f4c6R84

Indeed, but my intention with that NewFilteredDynamicSharedInformerFactory was to couple it with https://github.com/weaveworks/pipeline-controller/pull/180/files#diff-9061f5964b74be27c0df6d91b248ea7f8ce6d42039b1b3fc18f7041273efc5cbR30

I think it's a bigger problem that it unconditionally starts watches for every cluster it can see, rather than just those involved in Pipelines.

I think we can build a set of all the targets used by the pipelines and only watch those.

But, here they are an unreliable mechanism, and I don't think we need them -- we can use indexing.

The goal of the labels is to help the controller identify the related pipeline. Not sure how that index would work, but it would indeed be more reliable.

squaremo commented 1 year ago

What I suggest we do is write out two designs: the design as embodied by #180, and the design starting from the naive approach I outlined at the top. In both cases, we should address problems we've discussed above, and can otherwise anticipate, and incorporate those into the design. Considering viable alternatives will give us more confidence in whatever we choose to do.

This can be an RFC in weaveworks/weave-gitops-private, since that seems to be where they go. Would you be willing to start us off with a draft @luizbafilho?

squaremo commented 1 year ago

Not sure how that index would work,

As a sketch: when examining a Pipeline object, make an entry in the index mapping {cluster name, app name} -> pipeline name for each target (using a sentinel for the mgmt cluster -- a zero value would do, since that's not a valid cluster name). Whenever an update comes in for an app {cluster name, app name}, look it up in the index to find the pipeline name, and queue that pipeline to be evaluated.

This differs a little from the standard pattern of indexing dependent objects in controller-runtime, because the objects are in general in another cluster, and each of those has its own watcher. So it needs a bit of careful arranging so that the individual cluster watchers can read from the index.

luizbafilho commented 1 year ago

doc link: https://github.com/weaveworks/weave-gitops-private/pull/128