cloudfoundry / cf-crd-explorations

Apache License 2.0

Explore: KubeFed #73

Closed gcapizzi closed 3 years ago

gcapizzi commented 3 years ago

As we try to design a mapping for orgs and spaces (see #69), we are trying to understand how to manage organisations that might not fit in a single cluster. Our current prototype consists of a controller connecting to multiple clusters, reconciling Namespaces, Roles and RoleBindings. This requires the controller to know about all the clusters that are part of the foundation and how to connect to them.

The Kubernetes SIG Multicluster is trying to solve a similar problem: providing a way to federate multiple clusters and reconcile resources across them. They have created a product called KubeFed:

Kubernetes Cluster Federation (KubeFed for short) allows you to coordinate the configuration of multiple Kubernetes clusters from a single set of APIs in a hosting cluster. KubeFed aims to provide mechanisms for expressing which clusters should have their configuration managed and what that configuration should be. The mechanisms that KubeFed provides are intentionally low-level, and intended to be foundational for more complex multicluster use cases such as deploying multi-geo applications and disaster recovery.

Let's take a look at KubeFed and see if it might be useful for our use case. Ideally we would be able to implement orgs and spaces for individual clusters, and then use KubeFed to extend foundations on multiple clusters.

mnitchev commented 3 years ago

We played a bit with kubefed and found that it allows you to create resources across multiple clusters, with several options:

  1. you can federate a resource and propagate it across all clusters (the same resource will be created on each cluster)
  2. federated resources can be any Kubernetes resource, including CRDs, but federation has to be enabled for each type
  3. you can generate a federated resource from the YAML of any resource and then apply it with kubectl
  4. you can only create federated resources in federated namespaces
  5. federated resources can specify which clusters they need to be propagated to. This is done either by naming the specific clusters or with a label selector
  6. resources like deployments can use the ReplicaSchedulingPreference to distribute load across multiple clusters. This needs one ReplicaSchedulingPreference for every deployment. It also looks like it overrides the deployment's replica count.
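
To make points 3 and 5 concrete, here is a minimal Python sketch of what wrapping a plain resource into a federated one looks like. It assumes the `types.kubefed.io/v1beta1` envelope with `template` and `placement` described in the KubeFed user guide; the `federate` helper, the RoleBinding contents and the cluster name `cluster1` are made up for illustration, and the output is not meant to match `kubefedctl federate` byte for byte.

```python
import copy


def federate(resource, cluster_names=None, match_labels=None):
    """Wrap a plain resource dict in a Federated<Kind> envelope (simplified sketch)."""
    resource = copy.deepcopy(resource)  # never mutate the caller's object
    metadata = resource.pop("metadata")
    kind = resource.pop("kind")
    resource.pop("apiVersion", None)
    resource.pop("status", None)  # status is per cluster, so it is not part of the template

    if cluster_names is not None:
        # Placement by explicit cluster names.
        placement = {"clusters": [{"name": name} for name in cluster_names]}
    else:
        # Placement by label selector; empty matchLabels selects every member cluster.
        placement = {"clusterSelector": {"matchLabels": dict(match_labels or {})}}

    template = dict(resource)  # whatever is left: spec, roleRef, subjects, ...
    if metadata.get("labels"):
        template["metadata"] = {"labels": metadata["labels"]}

    return {
        "apiVersion": "types.kubefed.io/v1beta1",
        "kind": "Federated" + kind,
        "metadata": {"name": metadata["name"], "namespace": metadata["namespace"]},
        "spec": {"template": template, "placement": placement},
    }


# Hypothetical RoleBinding living in a federated namespace.
role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "space-developer", "namespace": "my-space"},
    "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": "developer"},
    "subjects": [{"kind": "User", "name": "alice", "apiGroup": "rbac.authorization.k8s.io"}],
}

federated = federate(role_binding, cluster_names=["cluster1"])
```

The resulting dict can be serialised to YAML or JSON and applied to the host cluster with kubectl, provided the FederatedRoleBinding type has been enabled there.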

We think that this tool could be used with eirini to distribute workloads across multiple clusters and also to ensure isolation. For example, an organization can be isolated on a dedicated cluster, while the rest of the organizations are placed in one or more shared clusters. We should explore the implications of federated clusters for our authorization (RBAC and OPA) concepts. NOTE: the kubefed product is currently in beta.

danail-branekov commented 3 years ago

KubeFed terms glossary: https://github.com/kubernetes-sigs/kubefed/blob/master/docs/concepts.md#kubefed-concepts

Our test setup consists of two kind clusters:

We can similarly federate arbitrary objects (even the CF ones, such as App), for example RoleBindings:

Each FederatedXXX object has a placement field that specifies on which clusters the underlying non-federated object should appear. Options are:
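
As a rough illustration of how a placement resolves to clusters, the sketch below follows the rules as we understand them from the KubeFed user guide: an explicit `clusters` list takes precedence over `clusterSelector`, an empty selector matches every cluster, and a placement with neither field selects nothing. Treat those rules as an assumption to verify against the docs. The function name, the cluster names and the `isolation` label are hypothetical, and only `matchLabels` is handled.

```python
def selected_clusters(placement, cluster_labels):
    """Return the sorted names of the registered clusters a placement resolves to.

    placement      -- dict with optional "clusters" and "clusterSelector" keys
    cluster_labels -- {cluster_name: {label: value}} for every registered cluster
    """
    if "clusters" in placement:
        # An explicit list wins; names that are not registered are ignored.
        wanted = {entry["name"] for entry in placement["clusters"]}
        return sorted(wanted & set(cluster_labels))
    if "clusterSelector" in placement:
        required = placement["clusterSelector"].get("matchLabels", {})
        # all() over an empty selector is True, so {} matches every cluster.
        return sorted(
            name
            for name, labels in cluster_labels.items()
            if all(labels.get(key) == value for key, value in required.items())
        )
    return []  # neither field given: nothing is propagated


# Hypothetical two-cluster setup.
clusters = {
    "cluster1": {"isolation": "dedicated"},
    "cluster2": {"isolation": "shared"},
}

by_name = selected_clusters({"clusters": [{"name": "cluster1"}]}, clusters)
by_label = selected_clusters(
    {"clusterSelector": {"matchLabels": {"isolation": "shared"}}}, clusters
)
everywhere = selected_clusters({"clusterSelector": {}}, clusters)
nowhere = selected_clusters({}, clusters)
```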

danail-branekov commented 3 years ago

Parking this for now: we know how kubefed works, and next we need to find out whether it would be useful for us.

georgethebeatle commented 3 years ago

Here are the results of some experiments we ran today. We played with the idea of having sub-eirini level federation vs having super-eirini level federation:

  1. Can eirini federate statefulsets? (sub-eirini federation)

    • No. According to the docs, the replicaschedulingpreference reconciler only reconciles Deployments and ReplicaSets: https://github.com/kubernetes-sigs/kubefed/blob/master/docs/userguide.md#replicaschedulingpreference
    • It is possible to federate statefulsets without using the replicaschedulingpreference, but this results in each cluster having its own instance 0, which might be the reason why replicaschedulingpreference does not support statefulsets.
    • This means that federation on sub-eirini level is unfeasible unless eirini switches to deployments
  2. Can we somehow use replicaschedulingpreference for LRPs/Tasks? (super-eirini federation)

    • Can we switch federation on and off by introducing a reconciler that turns an LRP into a FederatedLRP?

      • This way neither the shim nor eirini will know about federation, but there will be a simple reconciler that just federates Tasks and LRPs. A problem with this approach might be that the app will momentarily appear on the federation host cluster before being scheduled on its destination cluster(s) which might be a security issue
    • How are we going to work around the fact that LRP has an "instances" field while the kubefed reconcilers know how to put "replicas" on the object referred to by a replica scheduling preference?

      • Either migrate to replicas, or write our own "eirini scheduling preference" reconciler that heavily reuses the replicaschedulingpreference reconciler
    • Is replicaschedulingpreference going to work with a modified LRP that has "replicas" instead of "instances"?

      • Yes. However, the LRP also needs to have a selector (as in Deployments), since the replicaschedulingpreference expects this. Once we did that, we were able to federate an LRP with a preference of 6 instances and have them distributed across 2 clusters
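
To illustrate what distributing 6 instances across 2 clusters amounts to, here is a simplified Python sketch of weight-based splitting. It is not KubeFed's actual planner (which, as far as we know, also takes per-cluster minimum and maximum replicas and available capacity into account); `distribute`, `replica_overrides` and the cluster names are hypothetical. The per-cluster override on `/spec/replicas` reflects our understanding of how the preference is written back to the federated object, which is also why the LRP needs a "replicas" field.

```python
def distribute(total_replicas, weights):
    """Split total_replicas across clusters in proportion to their integer weights.

    Uses the largest-remainder method with integer arithmetic; ties go to the
    cluster whose name sorts first, so the result is deterministic.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one cluster needs a positive weight")

    result = {}
    remainders = {}
    for name, weight in weights.items():
        result[name], remainders[name] = divmod(total_replicas * weight, total_weight)

    # Hand the replicas lost to rounding down to the largest remainders.
    leftover = total_replicas - sum(result.values())
    for name in sorted(weights, key=lambda n: (-remainders[n], n))[:leftover]:
        result[name] += 1
    return result


def replica_overrides(distribution):
    """Express a distribution as per-cluster overrides of /spec/replicas."""
    return [
        {"clusterName": name, "clusterOverrides": [{"path": "/spec/replicas", "value": count}]}
        for name, count in sorted(distribution.items())
    ]


# Equal weights, as in the experiment above: 6 instances over 2 clusters.
even = distribute(6, {"cluster1": 1, "cluster2": 1})
overrides = replica_overrides(even)
```

With unequal weights the same function skews the split, for example `distribute(6, {"cluster1": 2, "cluster2": 1})` puts 4 instances on cluster1 and 2 on cluster2.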
danail-branekov commented 3 years ago

This way neither the shim nor eirini will know about federation, but there will be a simple reconciler that just federates Tasks and LRPs. A problem with this approach might be that the app will momentarily appear on the federation host cluster before being scheduled on its destination cluster(s) which might be a security issue

We can mitigate this by changing Eirini's creation interface to return objects (that are not pushed to k8s). Then the federation bit could be just a wrapper (injected only when the federation switch is on) that transforms the statefulset/deployment object into a federated one and then applies it to k8s.

This is what kubefedctl federate --filename some-deployment.yml does, see here for reference. The federate command just transforms the YAML; pushing the object to k8s is taken care of by an upstream component.
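
A minimal Python sketch of that idea, under the assumption that object creation and pushing to k8s can be separated: the builder only returns the desired object, an optional wrapper turns it into its federated form, and a single injected `apply` function pushes whatever comes out. All names here (`desired_deployment`, `federating`, `make_creator`, the image and namespace) are hypothetical and do not correspond to Eirini's real interfaces.

```python
def desired_deployment(name, namespace, image, replicas):
    """Stand-in for Eirini: build the desired object without pushing it anywhere."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def federating(build):
    """Wrap a builder so that its result comes back as a Federated<Kind> object."""
    def wrapped(*args, **kwargs):
        obj = build(*args, **kwargs)
        skipped = ("apiVersion", "kind", "metadata", "status")
        return {
            "apiVersion": "types.kubefed.io/v1beta1",
            "kind": "Federated" + obj["kind"],
            "metadata": dict(obj["metadata"]),
            "spec": {
                "template": {k: v for k, v in obj.items() if k not in skipped},
                # Empty matchLabels: propagate to every member cluster.
                "placement": {"clusterSelector": {"matchLabels": {}}},
            },
        }
    return wrapped


def make_creator(build, apply, federation_enabled):
    """Return a create() function; federation is a single switch at wiring time."""
    builder = federating(build) if federation_enabled else build

    def create(*args, **kwargs):
        obj = builder(*args, **kwargs)
        apply(obj)  # only the final form is ever pushed
        return obj
    return create


applied = []  # stands in for the k8s API of the host cluster
create = make_creator(desired_deployment, applied.append, federation_enabled=True)
create("my-app", "my-space", "registry.example/my-app:latest", 6)
```

Because the plain Deployment is never applied when the switch is on, the app does not show up on the federation host cluster before it is scheduled on its destination clusters.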

georgethebeatle commented 3 years ago

We created a kubefed multi cluster prototype that features a federation of 3 clusters as follows:

One of the major goals of the prototype is to abstract federation away from CF components as much as possible. The ideal scenario is to have a single switch to turn federation on.

If you want to run the prototype you need the following branches:

While this prototype demonstrates how isolation segments might be implemented with kubefed, it has several flaws

We have updated the multicluster proposal with our latest findings and are closing this story for now.