akuity / kargo

Application lifecycle orchestration
https://kargo.akuity.io/
Apache License 2.0

Self-service mgmt of Kargo projects in multi-tenant cluster #2058

Closed jkleinlercher closed 1 month ago

jkleinlercher commented 4 months ago

Checklist

Proposed Feature

I would like Kargo Projects to be namespace-scoped resources, like Argo CD Applications.

Motivation

In a multi-tenant cluster, cluster-scoped resources are always hard to manage outside of the platform team. Since Kargo Projects are something a dev team should create, we want the highest possible degree of self-service for them. And since Argo CD apps and app namespaces are already in the hands of the devs without granting them access to cluster-scoped resources, it should also be possible for dev teams to manage Kargo Projects.

Suggested Implementation

krancour commented 4 months ago

Hi @jkleinlercher. Thanks for the interest in the project.

Could you possibly elaborate on why granting permissions to create Project resources is problematic? Namespaces are cluster-scoped resources as well and it seems you've already got developers responsible for / able to create those.

jkleinlercher commented 4 months ago

Hi @krancour !

I can try to explain how we arrived at a safe procedure that lets dev teams create namespaces and Argo CD apps on a multi-tenant cluster as a self-service, without affecting other teams. This can be set up with apps-in-any-namespace:

A platform team onboards a dev team by

  1. extending the value application.namespaces with an "app-definition namespace" for this team (e.g. "team1-app-definitions")
  2. creating this "app-definition namespace" for this team (e.g. "team1-app-definitions")
  3. creating an Argo CD app-project for this team (e.g. team1-project) that references the "app-definition namespace" in the project's sourceNamespaces attribute, sets the destinations in the project to a valid "workload namespace" pattern such as "team1-*", and sets clusterResourceWhitelist to "kind: Namespace".
  4. optionally (but not necessary for the self-service aspect) creating some multi-tenancy Kyverno generate policies so that limit ranges, quotas, deny-all network policies etc. are automatically created for new namespaces.
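As a rough illustration of step 3, the sketch below builds such an app-project manifest as a Python dict and prints it as JSON (which kubectl also accepts). The field names come from the Argo CD AppProject spec; the `argocd` namespace, the in-cluster server URL, and the helper name `team_app_project` are assumptions made for the example, and a real project would also need fields such as sourceRepos.

```python
import json


def team_app_project(team: str, server: str = "https://kubernetes.default.svc") -> dict:
    """Build an illustrative Argo CD AppProject manifest for one team.

    The "argocd" namespace and the default server URL are assumptions.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "AppProject",
        "metadata": {"name": f"{team}-project", "namespace": "argocd"},
        "spec": {
            # Applications for this project may live in the team's own namespace.
            "sourceNamespaces": [f"{team}-app-definitions"],
            # Workloads may only be deployed to namespaces matching the pattern.
            "destinations": [{"server": server, "namespace": f"{team}-*"}],
            # The only cluster-scoped kind the team may manage.
            "clusterResourceWhitelist": [{"group": "", "kind": "Namespace"}],
        },
    }


manifest = team_app_project("team1")
print(json.dumps(manifest, indent=2))
```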

Then the dev team can create new apps on their own in their "app-definition namespace" and set the sync option "CreateNamespace". As long as the namespace name matches the valid destinations in the Argo CD app-project ("team1-*"), the namespace gets created. However, dev team team1 still cannot create a namespace like team2-app1-prod, because it doesn't match the "team1-*" pattern defined in the team's Argo CD app-project. So this restriction exists because of the destination in the Argo CD app-project, not because of the clusterResourceWhitelist "kind: Namespace".
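The effect of the destination pattern can be sketched in a few lines of Python. This uses the standard library's `fnmatchcase` as a stand-in for Argo CD's own glob matching, which may differ in detail; the helper name `destination_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase


def destination_allowed(namespace: str, patterns: list[str]) -> bool:
    """Return True if the namespace matches any allowed destination pattern."""
    return any(fnmatchcase(namespace, pattern) for pattern in patterns)


team1_destinations = ["team1-*"]

print(destination_allowed("team1-app1-prod", team1_destinations))  # True
print(destination_allowed("team2-app1-prod", team1_destinations))  # False
```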

For the Kargo Project, we can only set clusterResourceWhitelist to "kind: Project", and the dev team can then create, change, or delete the Kargo Projects of every other dev team on this cluster.

krancour commented 4 months ago

Thank you for the details. You've described the creation of namespaces in the Argo CD control plane(s) (by the platform team) and creation of namespaces in the applications clusters (by dev teams via Argo CD), but your Kargo resources don't realistically belong in either of those places.

Kargo is most often used to orchestrate promotions through pipelines that span multiple clusters, and may even span multiple Argo CD control planes. With this being the case, Kargo resources belong somewhere else, i.e. in Kargo's own control plane.

In the end, your best bet may simply be to have the platform team create an Argo CD Project and Application per team (in the Argo CD that manages the Kargo control plane itself, that is) and this would allow all teams to gitops their own pipelines without requiring any direct access to the Kargo control plane.

And... since Kargo creates namespaces for each Project and resources like Warehouses, Stages, etc. go in those Project-specific namespaces, I am also pretty sure that it will "just work" if you use the App Project for each team to limit destinations to Kargo Project namespaces that match your desired Project naming conventions. It may not actually prevent creation of a Project/namespace that defies those conventions, but it would prevent creating resources in a Project/namespace that defies conventions -- which I would say is a pretty strong incentive to the dev teams to follow them.

jkleinlercher commented 4 months ago

In the end, your best bet may simply be to have the platform team create an Argo CD Project and Application per team (in the Argo CD that manages the Kargo control plane itself, that is) and this would allow all teams to gitops their own pipelines without requiring any direct access to the Kargo control plane.

Thanks for your detailed explanation! So as I understand it (and for simplicity I'll just talk about one cluster with one Argo CD instance that manages the whole cluster), the platform team creates an Argo CD app-project and an app for Kargo resources, limiting namespace destinations with the pattern "team1-*" and allowing Namespace and Kargo Project in clusterResourceWhitelist. The dev team could create a Kargo Project named "other-team-i-dont-belong", and the Kargo controller would create a namespace "other-team-i-dont-belong", but the dev team would then be unable to create Warehouses, Stages, … in this namespace because it doesn't match the destination pattern in the Argo CD app-project. Right? So that indirectly forces the dev teams to create Kargo Projects with the naming convention <teamname>-…, and so the different dev teams don't affect each other.

Still, with this approach a dev team is able to create useless Kargo Projects that they cannot use in the first place. One could argue that this is not ideal, but also not very bad … do I understand you correctly?
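To make the gap concrete, here is a deliberately simplified Python model of the behavior described in this thread. It is not how Argo CD is implemented. It assumes that a cluster-scoped kind such as Project is checked only against the whitelist by kind, that Namespace names are checked against the destination patterns (as described earlier in the thread), and that namespaced resources are checked by their namespace. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Resource:
    kind: str
    name: str
    namespace: Optional[str] = None  # None means cluster-scoped


@dataclass(frozen=True)
class AppProjectRules:
    cluster_kinds: frozenset   # kinds listed in clusterResourceWhitelist
    namespace_patterns: tuple  # namespace globs taken from destinations


def sync_permitted(res: Resource, rules: AppProjectRules) -> bool:
    def matches(ns: str) -> bool:
        return any(fnmatchcase(ns, p) for p in rules.namespace_patterns)

    if res.namespace is None:
        if res.kind not in rules.cluster_kinds:
            return False
        # Per the thread, Namespace names are still limited by destinations;
        # other cluster-scoped kinds (e.g. Project) are checked by kind only.
        return matches(res.name) if res.kind == "Namespace" else True
    return matches(res.namespace)


team1 = AppProjectRules(frozenset({"Namespace", "Project"}), ("team1-*",))

checks = [
    Resource("Project", "team1-shop"),
    Resource("Warehouse", "images", "team1-shop"),
    Resource("Project", "other-team-i-dont-belong"),
    Resource("Warehouse", "images", "other-team-i-dont-belong"),
]
results = [sync_permitted(r, team1) for r in checks]
for res, ok in zip(checks, results):
    print(res.kind, res.name, res.namespace, ok)
```

In this model the off-convention Project itself is permitted, while the Warehouse inside its namespace is not, which is the "useless project" situation described above.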

krancour commented 4 months ago

Yes. You understand it correctly. There may be additional things you could do with Kyverno as well -- I am not so much of an expert on that.

Namespaced Projects just aren't going to happen because it's such a massive architectural shift that we would miss internal milestones. But with that said, I don't think this is a damning limitation and I'd be happy for us to continue using this thread for exploring reasonable methods of administering self-service Project management for dev teams.

jkleinlercher commented 4 months ago

Cool! I would also appreciate keeping this conversation in this thread and exploring how to achieve good self-service for dev teams with Argo CD and Kargo. Our platform demo repo is https://github.com/suxess-it/sx-cnp-oss and I will integrate the things we discussed asap.

jkleinlercher commented 4 months ago

There is also a Slack thread at https://kubernetes.slack.com/archives/CLGR9BJU9/p1716676092231239?thread_ts=1716318273.333099&cid=CLGR9BJU9 where I'm discussing whether Kargo Project creation could be prevented by Kyverno.

jkleinlercher commented 4 months ago

With the policy https://github.com/suxess-it/sx-cnp-oss/blob/main/platform-apps/charts/kyverno/templates/policy-kargo-project-name-validation-apps-in-any-ns.yaml I was able to check whether the Kargo Project name matches the allowed destinations in the corresponding Argo CD app-project. Currently this policy is very specific to our environment and to the apps-in-any-namespace naming conventions, but it could serve as inspiration for the community.

The Argo CD sync is then already blocked, and no Kargo Project or corresponding namespace gets created.

krancour commented 4 months ago

We're starting to think about adding "User Reference" and "Operator Reference" branches to the doctree, and I think this would be a valuable addition.

jkleinlercher commented 4 months ago

Great! Now that I know how Kyverno works, I may be able to implement a more flexible version that works with and without apps-in-any-namespace and checks labels or annotations for resource tracking.

jkleinlercher commented 4 months ago

@krancour is there any guidance on when to share one Kargo Project across multiple applications? Or is it always a 1:1 mapping between application and Kargo Project?

krancour commented 4 months ago

I'm not sure whether you're asking about "little a applications" or "Big A Applications" (i.e. Argo CD Application resources).

Either way, the answer is that you already can.

Three things to take note of:

So there's flexibility at every level to build things out as you see fit.

The caveat on the third bullet is that although the back end has always supported it, the UI has not had great support up until now for Projects containing multiple pipelines. That is set to change in v0.7.0, which we hope to be releasing this Friday.

jkleinlercher commented 3 months ago

@krancour I just noticed that even though Kyverno blocks the creation of a Kargo Project based on the rule in https://github.com/akuity/kargo/issues/2058#issuecomment-2134547220, the namespace for this Kargo Project still gets created. Could it be that the Kargo controller that creates the namespace creates it even though the Kargo Project is not allowed by Kyverno?

krancour commented 3 months ago

@jkleinlercher it is probably the webhook server doing it.

For multiple resources in a single manifest, kubectl creates/updates the resources synchronously and in the order they appear. For this reason, people have a tendency to put a Namespace at the top of a manifest and follow it with other resources that go in the Namespace.

Because creation of resources in a Project's Namespace cannot proceed until the Namespace exists, Projects are also typically defined at the top of a manifest, and Namespace creation cannot be left solely to the management controller, which will, of course, reconcile Projects and create Namespaces asynchronously. So the webhook server actually ensures the existence of a Project's Namespace synchronously.

So I suppose what must be happening is that the Kargo webhook server is intercepting the Project create request before Kyverno's admission controls intercept it.

jkleinlercher commented 3 months ago

Hm, since your webhook is a mutating webhook and mine is a validating webhook, my Kyverno policy is always called after your Kargo webhook. So I guess I also need to create a validating webhook for the namespace that Kargo creates from the Kargo Project. I will try this.

krancour commented 3 months ago

@jkleinlercher the webhook in question is a validating webhook.

jkleinlercher commented 3 months ago

Okay, all validating webhooks are executed in parallel. However, since your validating webhook creates a new resource (a namespace) this API call again goes through all dynamic admission controllers and I have a chance to block the creation of the namespace.
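A toy Python simulation of this situation follows. It is a sketch of the reasoning only, not of the real Kubernetes admission chain or of Kargo's code. The hook functions, the `kargo-webhook` identity, and the `team1-` naming rule are all assumptions made for the example. Every validating hook is evaluated regardless of the others' verdicts, to mimic the fact that they run in parallel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user: str
    kind: str
    name: str


class Cluster:
    def __init__(self):
        self.objects = set()  # (kind, name) pairs that were persisted
        self.validators = []  # callables: (cluster, request) -> bool

    def create(self, req: Request) -> bool:
        # Evaluate every validating hook (no short-circuit), then persist
        # the object only if all of them allowed the request.
        verdicts = [hook(self, req) for hook in self.validators]
        if all(verdicts):
            self.objects.add((req.kind, req.name))
            return True
        return False


def kargo_project_hook(cluster: Cluster, req: Request) -> bool:
    """Stand-in for a webhook that ensures a Project's namespace exists."""
    if req.kind == "Project":
        # Nested API call, made under the webhook's own (hypothetical) identity.
        cluster.create(Request("kargo-webhook", "Namespace", req.name))
    return True


def name_policy_hook(cluster: Cluster, req: Request) -> bool:
    """Stand-in for a policy that only allows Project names starting with team1-."""
    if req.kind == "Project":
        return req.name.startswith("team1-")
    return True


cluster = Cluster()
cluster.validators = [kargo_project_hook, name_policy_hook]

ok = cluster.create(Request("team1-dev", "Project", "other-team-i-dont-belong"))
print(ok)                       # the Project request is denied
print(sorted(cluster.objects))  # but the namespace side effect remains
```

In this model the Project is rejected while the Namespace persists, because the nested create request passed through the hooks on its own and nothing in them applied to Namespaces.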

krancour commented 3 months ago

@jkleinlercher what would be the criteria for blocking it or not? The create namespace request is going to always appear to have originated from a Kargo webhook. I don't think you'll be able to know who the end-user is.

It also just occurred to me that #2076 might offer you some options as well. It will allow Projects to "adopt" existing namespaces. So you could create the namespaces explicitly (before the Project) and unregister the Project webhook maybe?

krancour commented 1 month ago

This was a productive, but broadly scoped conversation. I'm going to close it for the sake of keeping the backlog tidy and focused on actionable issues. Please do feel free to open discrete issues if there are components of this conversation that may still need to be addressed.