Open vedantthapa opened 10 months ago
For flux to apply a manifest on a remote cluster, a k8s secret with the `kubeConfig` of the remote cluster is required. The `kubeConfig` secret returned by the cluster resource via crossplane's GCP provider doesn't, by default, contain user config data. One way to approach this is by issuing client certificates during cluster creation, which would take care of authn, and then using `provider-kubernetes` to configure authz for the CN of the certificate. However, producing client certificates goes against GKE's security guidelines.
Another way is to "construct" a `kubeConfig` by using `status.atProvider.endpoint` and `status.atProvider.masterAuth.clusterCaCertificate` from the cluster resource, and passing these values at runtime to a k8s secret template which looks something like this (see point 5). Notice that this `kubeConfig` definition uses `ExecConfig` and `gke-gcloud-auth-plugin` (this is also how your local setup is configured; try `cat ~/.kube/config`). An upside to this is that it follows one of Google's recommended ways of authenticating to the API server. However, flux requires the `kubeConfig` ref to be self-contained and not rely on binaries, environment, or credential files (see the note here). We can configure the `gke-gcloud-auth-plugin` on the `kustomize-controller`, but that means either every project shares a common service account from the management project to control the flux installation manifest's reconciliation (this breaks project boundaries), OR we run multiple kustomize-controllers, each with its own `gke-gcloud-auth-plugin` configuration, to reconcile the flux installation manifest (this would be too expensive).
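For illustration, a constructed `kubeConfig` of this shape might look roughly like the sketch below (a minimal, hypothetical template; the `remote` names and the placeholder values substituted from `status.atProvider` are assumptions, not the actual template referenced above):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: remote
  cluster:
    # filled in at runtime from status.atProvider.endpoint
    server: https://<endpoint>
    # filled in at runtime from status.atProvider.masterAuth.clusterCaCertificate
    certificate-authority-data: <clusterCaCertificate>
users:
- name: remote
  user:
    exec:
      # the ExecConfig / gke-gcloud-auth-plugin setup mentioned above;
      # this requires the plugin binary to be present, which is exactly
      # what flux's self-contained kubeConfig requirement disallows
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      provideClusterInfo: true
contexts:
- name: remote
  context:
    cluster: remote
    user: remote
current-context: remote
```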
To get around the concerns mentioned above, we can use a hybrid of the two approaches, i.e., use `provider-kubernetes` to create a new k8s service account, add a rolebinding, generate a token against that service account, write it back to the management cluster, and finally pass it to the `token` field of the kubeconfig in the k8s secret template.
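The first step of that hybrid could be sketched with a `provider-kubernetes` `Object` wrapping the new service account (the resource names and the `v1alpha2` API version are assumptions; the rolebinding and token generation would follow the same pattern):

```yaml
apiVersion: kubernetes.crossplane.io/v1alpha2
kind: Object
metadata:
  name: flux-bot-sa # hypothetical name
spec:
  forProvider:
    # the manifest applied on the remote cluster
    manifest:
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: flux-bot
        namespace: flux-system
  providerConfigRef:
    # ProviderConfig pointing at the remote cluster's kubeConfig secret
    name: remote-cluster
```

The token generated against `flux-bot` would then be written back and substituted into the `token` field of the kubeconfig secret template.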
An alternative is to do away with bootstrapping flux via the above-mentioned method and use `provider-helm` with the community-contributed flux charts, which then brings along the black-box nature of helm.
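That `provider-helm` route could look roughly like this hypothetical `Release` (the chart comes from the fluxcd-community charts repo; the `providerConfigRef` name is an assumption):

```yaml
apiVersion: helm.crossplane.io/v1beta1
kind: Release
metadata:
  name: flux2
spec:
  forProvider:
    chart:
      name: flux2
      repository: https://fluxcd-community.github.io/helm-charts
    namespace: flux-system
  providerConfigRef:
    # ProviderConfig pointing at the remote cluster's kubeConfig
    name: remote-cluster
```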
So after some internal discussions, it seems like it'd be more useful to have flux reconcile against the client repository, as opposed to the previously proposed approaches of having it reconcile to a central place. The reason for this is that anything beyond GCP resources should be considered application-specific tooling. Having multiple flux instances from different remote clusters reconciling to a single flux repo increases the maintenance complexity for our team. Plus, it'd be difficult to "customize" the flux deployment on a remote cluster if there's ever a need for that; for example, one project might need the `helm-controller`, whereas others won't.
The solution then is to have each flux instance reconcile to its own client repo. However, as noted previously, that means configuring deploy keys. One way to approach this is with a crossplane composition that creates the remote k8s cluster and then uses `provider-kubernetes` to execute a one-time job that configures the deploy keys and bootstraps flux.
Fundamentally, this approach is similar to ANZ's Google Next demo. However, instead of GitHub Actions we'd use a kubernetes job within a crossplane composition, due to the security concerns around GitHub's location. Moreover, the demo uses token auth, i.e., one highly privileged token would need access to all the repositories in the org and would be stored as a k8s secret in each cluster that uses this template.
On the other hand, a k8s job that's part of the crossplane composition can run a shell script that executes the `flux bootstrap` command. The job runs on the management cluster; therefore, the highly privileged org-wide token only lives in the management cluster.
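A minimal sketch of such a job, assuming the token secret is named `github-org-token` and the flux CLI container image is used (the image tag, secret name, and repo path are all hypothetical):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: flux-bootstrap-alpha # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: bootstrap
        image: ghcr.io/fluxcd/flux-cli:v2.1.2 # hypothetical tag
        envFrom:
        - secretRef:
            # the highly privileged org-wide token; stored only
            # on the management cluster where this job runs
            name: github-org-token
        command:
        - /bin/sh
        - -c
        - |
          flux bootstrap github \
            --owner=PHACDataHub \
            --repository=cpho-phase2 \
            --path=./clusters/alpha
```

`flux bootstrap github` creates the deploy key on the repo and installs the flux components on the target cluster in one step.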
The end result might look something like this:

```yaml
apiVersion: dip.phac.gc.ca/v1beta1
kind: XFluxGKE
metadata:
  name: alpha
spec:
  name: alpha # cluster name on gcp
  projectId: phsp-fb3a2b560a617fbf # project id where the cluster would be created
  xnetwork: # network config for the cluster
    networkId: projects/phsp-fb3a2b560a617fbf/global/networks/alpha-vpc
    subnetworkId: projects/phsp-fb3a2b560a617fbf/regions/northamerica-northeast1/subnetworks/alpha-vpc
  repoName: cpho-phase2 # repo name that resolves to ssh://git@github.com/PHACDataHub/<repoName>
```
- A cluster resource from the crossplane GCP provider can be used in conjunction with `XProject` and `XNetwork` resources to initialize a new project space with the desired networking setup and a GKE autopilot cluster.
- Flux can be installed on the remote cluster by leveraging the `kubeConfig` exported by the previously created cluster resource, and `provider-kubernetes` can apply the management repo's flux sync manifest, which has a `kubeConfig` reference.
- This implies that the remote cluster's flux deployment will be in sync with (managed by) the management cluster, and the application team would be responsible for configuring GitOps on their repo, i.e., a new `GitRepository` resource pointing to their repo. An upside to this is that we don't need to worry about access or bootstrapping deploy keys on the client / application repo.
- In addition to this, the cluster can be added to the fleet-monitoring project to provide centralized monitoring of all clusters.
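The flux sync manifest with a `kubeConfig` reference could be sketched as a `Kustomization` like the one below (the secret and source names are assumptions; `spec.kubeConfig.secretRef` is the field flux uses to target a remote cluster):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: alpha-flux-system # hypothetical name
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/alpha/flux-system
  prune: true
  sourceRef:
    kind: GitRepository
    name: management-repo # hypothetical source in the management repo
  kubeConfig:
    secretRef:
      # secret exported by the crossplane cluster resource
      name: alpha-kubeconfig
```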
This would probably include a template on both the crossplane and backstage sides.