cnoe-io / proposals


Proposal to support the bootstrapping and deployment to managed Kubernetes clusters (starting with EKS) in idpBuilder #2

Open nimakaviani opened 3 months ago

nimakaviani commented 3 months ago

Right now, idpBuilder only supports deploying to Kind clusters.

While this works great for development and testing purposes, users often want to either test the IDP on a remote cluster or bootstrap a staging environment to carry their development work over. This issue proposes that we take action on the following:

  1. Allow for idpBuilder to deploy to a remote cluster
  2. Support bootstrapping of a managed Kubernetes cluster (starting with EKS)

Allow for idpBuilder to deploy to a remote cluster

During the process of developing an IDP, the idpBuilder deploys a Git repository, Argo CD, and nginx. Building a more complex IDP involves extending on top of these tools and deploying other Argo applications in the form of a pre-configured "stack". However, deploying the stack to a staging environment requires users to create a Kubernetes cluster, deploy Argo CD and nginx, and configure their Git repository to work with the tooling. They then have to push the developed "stack" to this remote repo and continue the work of configuring their stack. This creates a disjoint experience. If the idpBuilder can deploy its core packages as well as the developed stack directly to the remote cluster, it saves energy and effort for the platform engineers.

Support bootstrapping a managed Kubernetes cluster

The previous proposal requires users to bring their own managed Kubernetes cluster for the idpBuilder to deploy to. This can be a cumbersome task, given that the cluster needs to be created out of band and IAM permissions need to be configured so that the cluster has sufficient access when running its IaC tools.

For EKS, with the introduction of Pod Identity, it appears that cluster-level permissions can be better handled by creating IAM roles and connecting them to service accounts. The support is natively available in eksctl as well. Given these new developments, this proposal suggests that we wrap the eksctl functionality to create an EKS cluster, include the Pod Identity add-on, and add the necessary capabilities so that idpBuilder is capable of bootstrapping and creating EKS clusters.
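As a rough illustration of what idpBuilder could generate under the hood (the cluster name, region, namespace, service account, and policy ARN below are placeholders), an eksctl ClusterConfig with the Pod Identity agent add-on and a pod identity association might look something like this:

```yaml
# Illustrative eksctl config; names, region, and ARNs are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: idpbuilder-staging
  region: us-west-2

# Install the agent that backs EKS Pod Identity.
addons:
  - name: eks-pod-identity-agent

# Map a Kubernetes service account to an IAM role without IRSA/OIDC setup.
iam:
  podIdentityAssociations:
    - namespace: argocd
      serviceAccountName: argocd-application-controller
      permissionPolicyARNs:
        - arn:aws:iam::111122223333:policy/example-iac-permissions
```

Creating the cluster would then amount to idpBuilder wrapping something like `eksctl create cluster -f config.yaml`.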

We can later extend this plan and implement the Provider API proposed in https://github.com/cnoe-io/idpbuilder/pull/183 to offer a more generic approach that supports managed Kubernetes clusters other than EKS as well.

blakeromano commented 3 months ago

My thought here is that if we are planning on supporting the deployment of cloud infrastructure, folks are going to wonder "how can I make updates to things like my cluster, VPC, and subnets" post initial deployment.

I think leveraging Backstage software templates as a "bootstrapping tool" would work well: I could specify an IaC tool (Pulumi, eksctl, Terraform, etc.), a destination (EKS, AKS, GKE, etc.), and a set of configurations. We could utilize our ref implementations and potentially look at using Argo Workflows in the cluster to run the initial IaC commands, but ultimately bring anything that CNOE does via IDPBuilder back to Git. A rough sketch of what I mean is below.
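To make that concrete, a software template exposing those choices could look roughly like the skeleton below; the parameter names, skeleton path, and publish target are just placeholders, not a worked-out design:

```yaml
# Illustrative Backstage software template skeleton; values are placeholders.
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: bootstrap-managed-cluster
  title: Bootstrap a managed Kubernetes cluster
spec:
  owner: platform-team
  type: infrastructure
  parameters:
    - title: Cluster options
      required: [iacTool, destination]
      properties:
        iacTool:
          type: string
          enum: [eksctl, terraform, pulumi]
        destination:
          type: string
          enum: [eks, aks, gke]
  steps:
    # Render the chosen IaC configuration from a ref implementation.
    - id: fetch
      name: Fetch IaC skeleton
      action: fetch:template
      input:
        url: ./skeletons/${{ parameters.iacTool }}
        values:
          destination: ${{ parameters.destination }}
    # Push the result to a long-standing Git server so the change is traceable.
    - id: publish
      name: Publish to Git
      action: publish:github
      input:
        repoUrl: github.com?owner=my-org&repo=cluster-config
```

The important part to me is that last step: whatever runs the IaC, the record of it lands in a long-standing Git server.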

If we wanted IDPBuilder to be able to dump configuration files from Gitea (and maybe ref implementations as well) into a GitHub/GitLab etc. repository, I think that may be a valid idea for giving a developer an easy-to-use way to deploy what they are using locally onto a cloud-based environment.

I feel like many organizations are just starting to mandate that everything be done with some trace in IaC. Since CNOE is adopting GitOps, if there is not some trace of what IDPBuilder has done (at least as an option) output into a long-standing Git server, then I think we will see folks struggle to adopt it long term for production (or maybe even sustained non-production) use. That's really my main concern with this proposal.

nimakaviani commented 3 months ago

As for enabling cluster bootstrapping with IaC and the need to use GitOps, I agree that generally there is a shift toward using GitOps to manage infrastructure, including workload K8s clusters. However, from our experience, I haven't seen folks use GitOps to spin up control plane clusters, because it becomes a chicken-and-egg problem. Unless one relies on some GitHub or GitLab automation (or an already existing Backstage deployment), there won't be any existing GitOps tooling to spin up a cluster with. The same goes for an existing Backstage. This proposal targets control-plane-cluster-0, where no pre-existing developer platform tooling is available.

Unless you mean for the local instance of the idpBuilder to bootstrap a managed K8s control plane with its built-in Backstage and Argo CD tooling. If so, that was the other idea we discussed. I am not completely sold on the pattern though, because it breaks the interaction model between a local cluster and a staging cluster, making it harder for people to shift their platform work over to the staging cluster.

In terms of managing networking, IAM, and other infrastructure, my thinking was that we would just piggyback on what eksctl does and stop where it stops. There is some level of updates to the configuration of a Kubernetes cluster that can be managed via the eksctl config file (see the sketch below). Beyond that, supporting the infra for the control plane Kubernetes cluster should be a non-goal for idpBuilder and our tooling.
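To illustrate what "manageable via the eksctl config file" means, a day-2 change such as adding a node group could be expressed as an addition to that same config and applied with something like `eksctl create nodegroup -f config.yaml` (the node group below is purely illustrative):

```yaml
# Illustrative addition to the existing ClusterConfig; values are placeholders.
managedNodeGroups:
  - name: workloads-2
    instanceType: m5.large
    minSize: 2
    maxSize: 5
    desiredCapacity: 3
```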

I feel like many organizations are just starting to mandate that everything be done with some trace in IaC. Since CNOE is adopting GitOps, if there is not some trace of what IDPBuilder has done (at least as an option) output into a long-standing Git server, then I think we will see folks struggle to adopt it long term for production (or maybe even sustained non-production) use. That's really my main concern with this proposal.

100%. The idpBuilder should create a repo and upload the latest eksctl config it uses for the control plane cluster to a Git repository. From there on, people should take charge of managing the cluster based on the configuration available.

niallthomson commented 3 months ago

Still reading through and pondering the overall concept here but:

There is some level of updates to the configuration of a Kubernetes cluster that can be managed via the eksctl config file.

My experience is that eksctl is not really suited to GitOps; maintenance quickly becomes pretty imperative. I'm not sure focusing on that tool here is the best idea.