ImageUpdateAutomation in different namespace than ImagePolicy

TrueBrain commented 3 years ago

I have the following setup:

Flux2 is running in namespace flux-system. Prometheus is running in namespace monitoring.

I am trying to get image automation to work with as little duplication as possible.

So, I defined ImagePolicy (and ImageRepository) close to Prometheus. This means, because I use kustomize, that it is forced in the monitoring namespace. Next, I defined ImageUpdateAutomation in the flux-system namespace, as it wants access to the GitRepository which is already defined there.

Now my problem: this setup doesn't work when I try # {"$imagepolicy": "monitoring:prometheus:tag"}.

From what I understand, and what does work:

either ImagePolicy needs to be in the flux-system namespace (which is far from ideal from a Infrastructure-as-Code point of view)
or I need to move ImageUpdateAutomation into monitoring namespace. I don't really have an issue with that, except for the minor code duplication, but the gitRepositoryRef doesn't accept namespace. That would mean I also have to duplicate GitRepository, which would be very prune to errors.

Now I am not sure that what I am trying here is how this is meant to be used, or that I am overlooking some setting somewhere. So if you have any pointers for me how I should be doing this, or if there is an option I can use to make this work, that would be very welcome :)

Thank you!

stefanprodan commented 3 years ago

I do think we should use make use of sourceRef in the same way we do in kustomize-controller and helm-controller APIs e.g. CrossNamespaceSourceReference, where you can refer to a GitRepository from a different namespace.

PS. This is my personal view, I think @squaremo would disagree.

squaremo commented 3 years ago

either ImagePolicy needs to be in the flux-system namespace (which is far from ideal from a Infrastructure-as-Code point of view)

Can you explain why this is less than ideal?

Are your sync definitions (Kustomization etc.) in flux-system with the GitRepository objects? I think it's reasonable -- preferable, even -- to run automation in the same place as syncs. This leads me to the alternative quoted above: the ImageRepository and ImagePolicy objects are for automation, so get segregated into a system namespace, i.e., flux-system.

You may end up duplicating image repository credentials, if you're using imagePullSecrets for your deployments. On the other hand, this could be a convenience, if you want to define one set of credentials that can scan all image repos, but keep the credentials in "user" namespaces specific to those namespaces.

I do think we should use make use of sourceRef in the same way we do in kustomize-controller and helm-controller APIs

I am reconciled (ha) to cross-namespace refs, in light of the model proposed in https://github.com/fluxcd/flux2/pull/582 which at least puts some restriction on them by default. But I don't think they are the right way to arrange things, if the automation can just go alongside the git repository object.

TrueBrain commented 3 years ago

Can you explain why this is less than ideal?

Are your sync definitions (Kustomization etc.) in flux-system with the GitRepository objects? I think it's reasonable -- preferable, even -- to run automation in the same place as syncs. This leads me to the alternative quoted above: the ImageRepository and ImagePolicy objects are for automation, so get segregated into a system namespace, i.e., flux-system.

Yes, I could have used a bit more words to explain my point, sorry about that :)

From a functional perspective, I fully get what you mean. The problem for me is when you look at this from a Infrastructure as Code perspective. To give an example:

In kustomization.yaml, I would need to define an image like this:

images:
- name: prom/prometheus
  newTag: v2.21.0  # {"$imagepolicy": "flux-system:prometheus:tag"}

To extend on the above example, this will be done in the monitoring namespace. Now if I would like to place ImagePolicy in the flux-system namespace, I cannot do this next to any of these files due to the way how Kustomize works (you cannot change the namespace for a specific item if you set the namespace in kustomization.yaml, which is the suggested thing to do). So I will have to make a file "far away" (from a coding perspective) from the definition of the image, to set the ImagePolicy, for example in another kustomization.yaml to uses the flux-system namespace.

This distance between defining two things that are related to Prometheus, is what I consider not ideal. Distance in code mostly results in problems, sooner or later :) I rather have things defined closely together, especially as they both have to reference the same image-name. I know how developers are .. they change 1 reference, forget to grep for another, and before you know it you have an ImagePolicy on an image that no longer is used :) By having the code close to each other, you often avoid such mistakes.

But to put this in slightly different words: basically, I am pivoting the code around Prometheus, where I put Prometheus central and want everything related to it as closely as I can to it. You pivot around flux, where you put Flux central and want everything related to it as closely as you can :) I am not sure one is better than the other .. mostly just different :) You consider ImagePolicy part of the automation, I consider ImagePolicy part of the definition of the application :) Same thing, different view :D (at least, that is what I think .. I am completely open to be convinced otherwise :D).

squaremo commented 3 years ago

I see -- thank you for taking the time to follow up here, it's really helpful to get people's thought-through perspectives.

I'll attempt to paraphrase your position:

a. you would like to define ImagePolicy for a workload alongside that workload b. to include them in the same kustomization ("alongside"), they must (or near enough to must) be in the same namespace c. in general, if the policy (file) relates to a marker in a (workload) file, you want those files to be close together, to lower the risk of accidentally un-relating them

I think that last is a good principle. I sometimes call this situation a "trapeze act" -- when two strings far apart have to meet, for the system to work correctly. In this case, it's the namespace of the policy and the namespace mentioned in the marker in the workload file. It is necessary to fully qualify the markers so that updates are unambiguous, so there will always be some element of trapeze act.

Having the code close doesn't mean that the objects have to be co-located at runtime. It's b.) (and a reluctance to add another layer of kustomization), that entails the latter. For example, you could group together the workloads and give them a namespace all at once with one kustomization, then include the automation objects as well in a kustomization that doesn't patch the namespace.

There are other arguments that don't depend on the quirks of kustomize though: perhaps you want automation to be accessible to users, so they can define their own policies, but not the git repository (or the secret that the git repository refers to). The easiest way to arrange that is to put the git repository and its secret in another namespace -- but then how do you use it from the automation? Adding cross-namespace references is something of a drastic remedy though, since it means anyone can use anyone else's git repository. This can be mitigated by insisting that automations participate in RBAC, so you can at least limit which users can use which git repositories, ideally from a starting point of least privilege. This is the essence of the model proposed in fluxcd/flux2#582, which would have to be extended to cover automations too. I'm not totally against cross-namespace references, but I am against them without any restriction by RBAC.

There's alternatives to the current design that are less drastic. What if automations could give the namespace in which they operated, for example. Then you would house the policies with the workloads, and the automation with the git repository. That might bring up other dangers of course, so would need plenty of thinking through.

TrueBrain commented 3 years ago

Having the code close doesn't mean that the objects have to be co-located at runtime. It's b.) (and a reluctance to add another layer of kustomization), that entails the latter. For example, you could group together the workloads and give them a namespace all at once with one kustomization, then include the automation objects as well in a kustomization that doesn't patch the namespace.

I did consider this solution; the problem is, we use namespace to have different identical copies of the same thing running in the same cluster (for example, we have Traefik running twice in the cluster, ones for traffic on 1 VLAN, once for traffic of another VLAN; they are both created from exactly the same base Kustomize template). So this approach means you need two overlays to make that happen, which turns out to works very clumsy. But yeah, it is a possibility for sure :) Honestly, it feels like we are more battling Kustomize here than image-automation-controller, but okay.

There's alternatives to the current design that are less drastic. What if automations could give the namespace in which they operated, for example. Then you would house the policies with the workloads, and the automation with the git repository. That might bring up other dangers of course, so would need plenty of thinking through.

Honestly, I initially assumed it worked like this :) I assumed it worked something like a Prometheus controller, that uses the annotation tags over all (defined) namespaces to find what to scrape. Initially I assumed I did something wrong and I wasn't passing along the --all-namespaces parameter. But that does something else ;)

After some more thinking about this, and something that might help you out understanding where I am coming from: Personally, I am totally fine if ImageRepository and ImageUpdateAutomation has to be defined in flux-system (basically defining what images are allowed to be automated). It is the ImagePolicy that is the main issue for me, as it defined a version range an image can have. I see this more as an annotation on a container (which I guess cannot be a real annotation because of the many places you can define an image), than the execution of the automation. This is why the first thing I tried was adding namespace to the imageRepositoryRef in ImagePolicy :) So yeah, if ImageUpdateAutomation could define which namespaces it is allowed to check, that would already be more than sufficient for my use-case.

Anyway, thank you for thinking along with me, and being interested in my use-case; that is really much appreciated :) And I fully understand (and support) your stance on not allowing cross-namespace references without good (security) restrictions in place :)

davidkarlsen commented 3 years ago

I'm coming from a somewhat similar situation. I'd like to keep update-stuff (i.e. ImageRepository+ImagePolicy in same ns as workload to keep them logically. Flux/sourcecontroller would maybe be capable of knowing how/from where the workload was deployed, and hence backtrack it to gitrepo, in order to know what files to update and where?

ViBiOh commented 3 years ago

I encounter the same need here.

We have a repository describing all our infrastructure, call it infra.

In infra, there is one folder per namespace, and one team responsible for each namespace. Ownership is clearly defined by namespace (I own the Continuous Delivery with flux-system, another team own the namespace api, another web, etc.).

I provide the tool for GitOps in my namespace, flux-system. With flux, I'm able to apply manifests in teams' namespace, but teams can't request an ImageUpdateAutomation from their namespace to flux-system, it's unequal I think.

The ownership is defined by namespace, and in Git, by folder. We don't want to create objects in a git folder into another namespace in Kubernetes. We have a tool that check that, we have CODEOWNERS of GitHub that requests reviews for appropriate people, etc.

Also, I don't want to replicate a GitRepository (with ssh-credentials) in each namespace. With SealedSecrets, I have to re-cipher credentials for each namespace during rotation and also when new namespace are added.

In the # {"$imagepolicy": "flux-system:podinfo:tag"} syntax, the reference to the namespace makes me think that the "ImageUpdateAutomation" will parse every file, and when it encounters such tag, it will check the namespace/podPolicy mentioned.

ilya-git commented 3 years ago

I have encountered the same problem and wondering if it would perhaps be a good idea to have a scanning namespace option for ImageUpdateAutomation? That way all of them can live in flux-system, but each of them will point to a namespace with ImagePolicies that it will scan?

That way all individual payloads will continue to live in their own namespaces with everything required for it, but one ImageUpdateAutomation per namespace can be kept in flux-system together with GitRepository

squaremo commented 3 years ago

I've been preparing a recommendation for image API v1beta1, and as part of thinking that through, I took the solutions proposed above and tried to match them with the requirements people have expressed:

Let the gitRepositoryRef (/sourceRef) refer to other namespaces

This has been the solution for similar inconveniences in other GOTK APIs. Its main downside is that it breaks namespacing. In effect anyone would be able to run automation on any git repository defined in the cluster (provided you know about it). There is less danger than it might seem, since policy markers are namespaced. However, anything that sidesteps namespacing should be very carefully considered, and while automation does not account for RBAC, doubly so.

This is mitigated by the proposal in https://github.com/fluxcd/flux2/pull/582, which at least makes cross-namespace references subject to RBAC control (assuming impersonation were extended to automation objects).

Requirement	Solution with cross-namespace refs
Keep policies separate to git repository	Put policies and automation object in `app`, and refer to git repository in `flux-system`
Have several sets of policies	Put automation objects and policies in each namespace, and refer to git repository in `flux-system`
Give people control over the automation and the policies	As for "Have several sets of policies"
Give people control over the policies but not the automation	Put the automation object in the app namespace with the policies, and use RBAC to grant access only to policies

Give a target namespace in the image automation object

This is a tempting proposal because it makes it possible to put the automation object alongside the git repository (and secret), while keeping policy objects next to the workloads.

The justification is this: if you can create an image automation object in the same namespace as the git repository, you have demonstrated that you have permission to access that git repository. This still triggers the "cross-namespace" concern as above, though slightly differently: since you are targeting a namespace, it is like granting a capability that is yours to give, rather than claiming a capability.

Requirement	Solution with target namespaces
Keep policies separate to git repository	Put the automation object and git repo in `flux-system`, and policies in `app`; target `app`
Have several sets of policies	Put the git repository in `flux-system` and have several automations there targeting different namespaces with policies
Give people control over the automation and the policies	Might have to duplicate git repository into namespaces
Give people control over the policies but not the automation	As for "Have several sets of policies"

Legion2 commented 3 years ago

I think some clarification on the second solution of @squaremo "Give a target namespace in the image automation object" is required. I have explained my current problem in https://github.com/fluxcd/flux2/discussions/1017#discussioncomment-541543.

I think for the ImageUpdateAutomation it makes sense to include it in the namespace of the GitRepository because they are closely related. But the ImagePolicy and the ImageRepository should not be restricted to the namespace of the ImageUpdateAutomation, because they are closely related to the manifests using the version as explained by @TrueBrain and are also directly referenced by the automation markers in the manifests with namespace as noted by @ViBiOh.

With the path feature of the image automation it should be easy to restrict image automation based on repository structures instead of namespaces. So the Image automation controller would lookup ImagePolicy in the target namespace defined by the ImageUpdateAutomation, according to the proposal of @squaremo. But here we need a clarification, if a ImageUpdateAutomation could target multiple or all namespaces (if there is only one team in the cluster), or if one ImageUpdateAutomation can only target one namespace? In a multi tenant setup, the target namespaces are additionally be restricted by RBAC.

This way also the error handling can be improved if the ImageUpdateAutomation encounters an automation marker which points to an ImagePolicy in a target namespace but can't find the respective ImagePolicy, this should be reported as an error.

I maintain a fairly simple cluster with many namespaces and it would be unnecessary work to create multiple ImageUpdateAutomations if one which target all namespaces would do the same job.

matrix-root commented 2 years ago

@stefanprodan excuse me, do you have any best practices how to do it all? Thanks!

espizo commented 2 years ago

We stumbled upon this issue today as well.

We have the following layout:

apps/
├── app1.yaml
├── app2.yaml
├── kustomization.yaml

Each app*.yaml contains a bunch of different resources and we use namespace: apps in kustomization.yaml to put all of them into the same namespace..

When introducing ImageRepository and ImagePolicy, we needed to find a way to add these to the flux-system namespace (where ImageUpdateAutomation and GitRepository lives) without resorting to hardcoding the namespace for all resources in the app manifests or introducing another overlay.

After some experimentation, we found out that the following in kustomization.yaml does what we want:

Remove namespace: apps

Add patchesJson6902 to put everything into the apps namespace, except ImageRepository and ImagePolicywhich goes to the flux-system namespace.

patchesJson6902:
- target:
kind: .*
name: .*
patch: |-
- op: replace
  path: /metadata/namespace
  value: apps
- target:
kind: ImageRepository
name: .*
patch: |-
- op: replace
  path: /metadata/namespace
  value: flux-system
- target:
kind: ImagePolicy
name: .*
patch: |-
- op: replace
  path: /metadata/namespace
  value: flux-system

We tried to do the same thing in patches as well, but this only worked with patchesJson6902

matrix-root commented 2 years ago

@espizo looks like it's solution, but feels like a crutch 😄

sagikazarmark commented 2 years ago

Came across this issue today as well.

For me, it was natural to put the image repository and policy objects next to the workloads, because:

a) image pull secrets are namespace scoped b) we are planning to move to a multi-tenant model at some point (in which case we won't really have a choice)

We are still thinking about what tenancy model we should choose (repo per tenant, repo per app), but whatever we end up with there will be multiple policies (eg. different policies for different app environments or different applications), potentially in separate namespaces, but applying to the same repository.

This leads to either having to add app/env specific automations along with the policies OR having to move policies closer to the automation (but that crosses namespace and repo boundaries). None of those options seem ideal to me.

As far as I can tell, automations are closer to repositories, while policies are closer to workloads.

stefanprodan commented 2 years ago

We do support cross-namespace refs from ImagePolicies to ImageRepositories, so you can place the ImageRepository where the workloads are to reuse the image pull secret. The ImagePolicies and ImageUpdateAutomation can be placed where the GitRepository is, to reuse the SSH secret. Docs here: https://fluxcd.io/docs/components/image/imagerepositories/#allow-cross-namespace-references

sagikazarmark commented 2 years ago

@stefanprodan That's what I figured and did eventually, but apparently intuition tells people to pair policies with image repositories, not automations (maybe because there is an actual reference in policies to image repositories or because cross-namespace references are relatively rare in the Kubernetes world).

Maybe this is just a documentation issue, not an architectural one.

Frizlab commented 2 years ago

@sagikazarmark I got the same intuition as you and I think a lot of people do, yes.

sagikazarmark commented 2 years ago

Further thinking about this:

The ImagePolicies and ImageUpdateAutomation can be placed where the GitRepository is, to reuse the SSH secret.

This isn't necessarily enough or practical. Let's say I have a main git repository (managed by an ops team) for flux-system and centrally managed stuff and separate repos for each team.

Team A creates an image repository in their git repo and do what? Send a PR to the main git repo for the policy?

The above solution works great as long as there is a single repo or everything is managed by the same team. As soon as there are multiple teams involved, resource placement matters even more.

stefanprodan commented 2 years ago

The above solution works great as long as there is a single repo or everything is managed by the same team. As soon as there are multiple teams involved, resource placement matters even more.

Flux multi-tenancy implies that the tenant’s GitRepository is defined in their namespace, so all image update objects will be in the tenant’s namespace. Also for multi-tenancy, cross-refs must be disable to ensure isolation, docs here: https://github.com/fluxcd/flux2-multi-tenancy

josmare commented 1 year ago

What about if the tenant has several namespaces (test, staging, production)? Sets of GitRepository-ImageUpdateAutomation-ImageRepository-ImagePolicy shall be defined for each single one?

makkes commented 1 year ago

What about if the tenant has several namespaces (test, staging, production)? Sets of GitRepository-ImageUpdateAutomation-ImageRepository-ImagePolicy shall be defined for each single one?

Yes, that's the premise here.

dsanders1234 commented 2 weeks ago

We have a similar situation where we isolate tenants to their own namespace and do not allow cross namespace refs. If we give them access to their GitRepository (which we currently have in flux-system), then the tenant would be able to basically reach across to other folders (i.e. namespaces).

makkes commented 2 weeks ago

We have a similar situation where we isolate tenants to their own namespace and do not allow cross namespace refs. If we give them access to their GitRepository (which we currently have in flux-system), then the tenant would be able to basically reach across to other folders (i.e. namespaces).

Each tenant should have their own Git repository. Case closed.

fluxcd / image-automation-controller

ImageUpdateAutomation in different namespace than ImagePolicy #85