keycloak / keycloak-benchmark

Keycloak Benchmark
https://www.keycloak.org/keycloak-benchmark/
Apache License 2.0

Deploy HCP clusters in parallel. Resolves #765 #766

Closed: ryanemerson closed this 4 months ago

ryanemerson commented 4 months ago

Resolves #765

https://github.com/ryanemerson/keycloak-benchmark/actions/runs/8613309713

tl;dr: The simplest solution is to assign two CIDRs in the GH action before provisioning the individual clusters; the downside is that this can't be executed locally as a single action.
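As a rough sketch of what that means for the Terraform side (the variable and module names here are assumptions, not the actual module interface), each parallel GH action job would simply pass its own pre-assigned CIDR to an otherwise unchanged single-cluster invocation, while a local run can still fall back to a default:

```hcl
# Hypothetical interface of the single-cluster module: the CIDR is supplied
# from outside (e.g. exported by the GH action as TF_VAR_vpc_cidr or passed
# with -var), so two parallel jobs never have to coordinate inside Terraform.
variable "cluster_name" {
  type = string
}

variable "vpc_cidr" {
  type        = string
  description = "Machine CIDR for this cluster; must not overlap with any other active cluster"
  default     = "10.0.0.0/16" # sensible default for a local, single-cluster run
}

module "rosa_hcp" {
  source       = "./rosa-hcp" # hypothetical module path
  cluster_name = var.cluster_name
  machine_cidr = var.vpc_cidr
}
```

e.g. the two parallel jobs could run `terraform apply -var vpc_cidr=10.0.0.0/16` and `-var vpc_cidr=10.1.0.0/16` respectively.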

As well as adding parallel cluster creation, the Terraform modules have been simplified to remove unnecessary variables and validation.

Below is a summary of some findings based upon experimenting with Terraform over the last couple of days.

Terraform Solution

Originally I created a new Terraform module that determined the available CIDR ranges and then called the "HCP" module for each cluster. This works well for creating clusters; however, it completely breaks the reaper scripts and our "keepalive" checks, as the two provisioned clusters are now tightly coupled. Of course these can be updated, but this means that either: 1. We drop single-cluster provisioning support, or 2. We have separate reaper scripts etc. for single and multi-az cluster setups.

POC Branch
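For reference, a minimal sketch of that wrapper-module idea (the parent range, module path, and input names are hypothetical, not the actual POC code):

```hcl
# Hypothetical wrapper: carve non-overlapping CIDRs out of a parent range and
# instantiate the single-cluster "HCP" module once per cluster.
variable "cluster_names" {
  type    = list(string)
  default = ["keycloak-a", "keycloak-b"]
}

locals {
  # cidrsubnets() derives one /16 per cluster from the parent /14.
  cidrs = cidrsubnets("10.0.0.0/14", 2, 2)
}

module "hcp" {
  for_each     = { for idx, name in var.cluster_names : name => local.cidrs[idx] }
  source       = "./hcp" # hypothetical path to the existing HCP module
  cluster_name = each.key
  machine_cidr = each.value
}
```

The coupling problem is visible straight away: both clusters end up in a single Terraform state, so the reaper and keepalive tooling can no longer reason about them individually.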

Provisioning Kubernetes dependencies with Terraform

I also tried to replace the various *.sh scripts in rosa_create_cluster.sh with Terraform; however, there are some serious limitations with this approach:

  1. It's not possible to use a single Terraform module to create ROSA clusters AND provision their dependencies. This is because the Kubernetes provider needs to be able to access the K8s clusters at the planning stage, which obviously isn't possible if the clusters haven't been created yet. The solution is to use separate modules for cluster provisioning and initialization (see the sketch after this list).

  2. The TF Kubernetes provider works well for resources that we fully control, e.g. Subscriptions; however, it's not possible to patch existing resources on the cluster. Similarly, it's not possible to explicitly wait for CRDs installed by an Operator, as those resources are not created by the provider.

POC Branch
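To make point 1 concrete, here is a minimal sketch (the remote-state path, output names, and Subscription details are assumptions, not the actual POC code): the initialization module has to configure the Kubernetes provider from values that only exist once the cluster module has been applied, so the two cannot share a single root module.

```hcl
# Hypothetical "initialization" root module, applied only after the cluster
# root module has been created. The provider needs real connection details at
# plan time, so it cannot live in the module that creates the cluster itself.
data "terraform_remote_state" "cluster" {
  backend = "local"
  config = {
    path = "../cluster/terraform.tfstate" # hypothetical state location
  }
}

provider "kubernetes" {
  host  = data.terraform_remote_state.cluster.outputs.api_url
  token = data.terraform_remote_state.cluster.outputs.admin_token
}

# A resource the provider fully owns, e.g. an OLM Subscription (point 2).
resource "kubernetes_manifest" "operator_subscription" {
  manifest = {
    apiVersion = "operators.coreos.com/v1alpha1"
    kind       = "Subscription"
    metadata = {
      name      = "example-operator" # hypothetical operator
      namespace = "openshift-operators"
    }
    spec = {
      channel         = "stable"
      name            = "example-operator"
      source          = "redhat-operators"
      sourceNamespace = "openshift-marketplace"
    }
  }
}
```

Note that kubernetes_manifest itself also needs cluster access during plan, which is another reason the initialization module can only be planned once the cluster exists.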

We can work around 2. by using kubectl and local-exec, but I think we're adapting a tool to our use case instead of using the right tool for the job. As we would need to use multiple TF modules to install dependencies etc., I think in the future we may want to consider an alternative tool for managing k8s state (ArgoCD, Helm, ...) if we want to remove our current *.sh scripts. /cc @mhajas
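For completeness, the kubectl/local-exec escape hatch mentioned above would look roughly like this; the CRD and deployment names are purely illustrative, not taken from the POC branch:

```hcl
# Workaround for limitation 2: shell out to kubectl, because the Kubernetes
# provider can neither patch resources it does not own nor wait for CRDs
# installed by an operator.
resource "null_resource" "wait_and_patch" {
  provisioner "local-exec" {
    command = <<-EOT
      # Block until the operator has registered its CRD ...
      kubectl wait --for=condition=Established crd/examples.example.org --timeout=300s
      # ... then patch a pre-existing resource that Terraform does not manage.
      kubectl patch deployment example --namespace example-ns \
        --type merge -p '{"spec":{"replicas":2}}'
    EOT
  }
}
```

This works, but every such step pushes more imperative logic back into Terraform, which is the "right tool for the job" concern above.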

ryanemerson commented 4 months ago

Thanks @kami619