kubermatic / kubecarrier

KubeCarrier - Service Management at Scale

ServiceClusterAssignment v2 #220

Closed thetechnick closed 4 years ago

thetechnick commented 4 years ago

ServiceClusterAssignment objects are being created implicitly when the first instance of a ServiceCluster's CRD is created in a Namespace on the master cluster. Currently we have no solution for cleaning them up again.

Creating these ServiceClusterAssignment objects lazily also creates issues for the upcoming catapult webhook implementation: because we don't know the target Namespace in the ServiceCluster before the object has been created, we cannot validate against the right Namespace in the ServiceCluster.


To solve these issues, we will tie the lifecycle of ServiceClusterAssignment objects to Namespaces in the master cluster instead of individual service instances.

Adding a kubecarrier.io/assignment/<provider>.<service-cluster> label to a Namespace will automatically create (if not already present) a ServiceClusterAssignment.

kind: Namespace
metadata:
  name: some-namespace
  labels:
    kubecarrier.io/assignment/<provider>.<service-cluster>: 'true'
spec: {}
---
kind: ServiceClusterAssignment
metadata:
  name: some-namespace.<service-cluster>
  namespace: <management-namespace>
spec:
  masterNamespace:
    name: some-namespace
  serviceCluster:
    name: <service-cluster>

Removing the kubecarrier.io/assignment/<provider>.<service-cluster> label or deleting the Namespace will automatically delete the ServiceClusterAssignment as well.
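
As a rough sketch of how such a controller could look (using controller-runtime with an unstructured client; the ServiceClusterAssignment group/version, the reconciler name, and the way the provider's management namespace is resolved are assumptions for illustration, not the actual implementation):

package controllers

import (
	"context"
	"fmt"
	"strings"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// assignmentLabelPrefix marks a Namespace as needing a ServiceClusterAssignment.
// Full label key: kubecarrier.io/assignment/<provider>.<service-cluster>
const assignmentLabelPrefix = "kubecarrier.io/assignment/"

// scaGVK is assumed here; the real group/version lives in the KubeCarrier API package.
var scaGVK = schema.GroupVersionKind{
	Group:   "kubecarrier.io",
	Version: "v1alpha1",
	Kind:    "ServiceClusterAssignment",
}

// NamespaceAssignmentReconciler creates a ServiceClusterAssignment for every
// kubecarrier.io/assignment/* label found on a Namespace in the master cluster.
type NamespaceAssignmentReconciler struct {
	client.Client
	// ManagementNamespace is the provider namespace the assignment is created in;
	// how it is looked up from <provider> is out of scope for this sketch.
	ManagementNamespace string
}

func (r *NamespaceAssignmentReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	ns := &corev1.Namespace{}
	if err := r.Get(ctx, req.NamespacedName, ns); err != nil {
		// Nothing to do for a deleted Namespace here; cleanup is handled separately.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	for key := range ns.Labels {
		if !strings.HasPrefix(key, assignmentLabelPrefix) {
			continue
		}
		// "<provider>.<service-cluster>" -> take the part after the last dot.
		suffix := strings.TrimPrefix(key, assignmentLabelPrefix)
		serviceCluster := suffix[strings.LastIndex(suffix, ".")+1:]

		sca := &unstructured.Unstructured{}
		sca.SetGroupVersionKind(scaGVK)
		sca.SetName(fmt.Sprintf("%s.%s", ns.Name, serviceCluster))
		sca.SetNamespace(r.ManagementNamespace)
		_ = unstructured.SetNestedField(sca.Object, ns.Name, "spec", "masterNamespace", "name")
		_ = unstructured.SetNestedField(sca.Object, serviceCluster, "spec", "serviceCluster", "name")

		if err := r.Create(ctx, sca); err != nil && !apierrors.IsAlreadyExists(err) {
			return ctrl.Result{}, err
		}
	}
	// Pruning assignments whose label has been removed is elided; it would list the
	// Namespace's existing ServiceClusterAssignments and delete the extras.
	return ctrl.Result{}, nil
}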


To help administrators find the right ServiceClusterAssignment and service cluster Namespace for each namespace on the master cluster, we should add a few labels and automatically propagate some labels to each object. Because propagating and overriding all labels would break existing workflows, we will only touch labels with the kubecarrier.io/ prefix. The kubecarrier.io/assignment/* labels are excluded, as they add no information to the ServiceClusterAssignment or the service cluster Namespace.

kind: Namespace
metadata:
  name: some-namespace
  labels:
    kubecarrier.io/assignment/<provider>.<service-cluster>: 'true'
    kubecarrier.io/*: 'something'
spec: {}
---
kind: ServiceClusterAssignment
metadata:
  name: some-namespace.<service-cluster>
  namespace: <management-namespace>
  labels:
    kubecarrier.io/*: 'something'
---
kind: Namespace
metadata:
  name: some-namespace
  labels:
    kubecarrier.io/*: 'something'
spec: {}
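
A small helper capturing that filtering rule could look like this (the function name and placement are illustrative only):

package controllers

import "strings"

// propagatedLabels returns the subset of a master cluster Namespace's labels that
// should be copied to the ServiceClusterAssignment and the service cluster Namespace:
// only keys under the kubecarrier.io/ prefix, minus the kubecarrier.io/assignment/
// trigger labels, which add no information on the target objects.
func propagatedLabels(src map[string]string) map[string]string {
	out := map[string]string{}
	for k, v := range src {
		if !strings.HasPrefix(k, "kubecarrier.io/") {
			continue // user-owned labels are never touched
		}
		if strings.HasPrefix(k, "kubecarrier.io/assignment/") {
			continue // the trigger label stays on the master cluster Namespace only
		}
		out[k] = v
	}
	return out
}
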
nmiculinic commented 4 years ago

What's the drawback of automatically creating ServiceClusterAssignments for all Accounts (#211)? Or, if we want to prevent a massive number of (unnecessary) namespaces, for all Accounts that are permitted to create services in that service cluster (found via the catalog)?

Another approach is to modify the previous solution but lazily create ServiceClusterAssignments in the webhooks as well (we just have to be careful not to immediately clean them up; have some timeout for how long a ServiceClusterAssignment can live without being GCed).

As for termination, what is the biggest issue with cleaning them up? They are uniquely bound to (service cluster, master cluster namespace), and if one of those two is deleted, so should be the ServiceClusterAssignment. They could also be deleted once the tenant has stopped using the service cluster, though that can be tricky to detect (e.g. there is a remaining CouchDB Backup in the service cluster... is it safe to delete now? The only way to safely detect that the tenant has stopped using the service cluster is if the tenant has manually deleted all of the service cluster's internal CRD instances. We can detect that either by live reference counting or by periodic garbage collection jobs).

thetechnick commented 4 years ago

What's the drawback of automatically creating ServiceClusterAssignments for all Accounts (#211)? Or, if we want to prevent a massive number of (unnecessary) namespaces, for all Accounts that are permitted to create services in that service cluster (found via the catalog)?

I don't want to couple this to Accounts (or Tenants/Providers) (yet), because I want the catalog management to be completely independent of the cross-cluster federation part, as we have a few use cases that only require one or the other.

Another approach is to modify the previous solution but lazily create ServiceClusterAssignments in the webhooks as well (we just have to be careful not to immediately clean them up; have some timeout for how long a ServiceClusterAssignment can live without being GCed).

That was indeed a very short-lived idea I discussed with @jiachengxu for our webhooks, but object creation might be denied by webhooks later in the call-stack, so introducing any side effects into them is very dangerous and error-prone.

As for termination, what is the biggest issue with cleaning them up? They are uniquely bound to (service cluster, master cluster namespace), and if one of those two is deleted, so should be the ServiceClusterAssignment. They could also be deleted once the tenant has stopped using the service cluster, though that can be tricky to detect (e.g. there is a remaining CouchDB Backup in the service cluster... is it safe to delete now? The only way to safely detect that the tenant has stopped using the service cluster is if the tenant has manually deleted all of the service cluster's internal CRD instances. We can detect that either by live reference counting or by periodic garbage collection jobs).

In my PR I am now coupling the existence of those ServiceClusterAssignment objects to the lifecycle of the Namespace in the master cluster. If this namespace is deleted, we will clean up the SCAs for it (and everything else within this namespace).
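
One plausible way to realize that coupling is a Kubernetes owner reference from the master cluster Namespace to the ServiceClusterAssignment, so the garbage collector removes the assignment once the Namespace is gone; whether the PR actually uses owner references or KubeCarrier's own ownership tracking is not shown here, so treat this as a sketch:

package controllers

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// ensureOwnedByNamespace marks a ServiceClusterAssignment as owned by the master
// cluster Namespace. A cluster-scoped object may own a namespaced one, so once the
// Namespace is deleted, Kubernetes garbage collection deletes the assignment, even
// though it lives in the provider's management namespace.
func ensureOwnedByNamespace(ns *corev1.Namespace, sca *unstructured.Unstructured, scheme *runtime.Scheme) error {
	return controllerutil.SetOwnerReference(ns, sca, scheme)
}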

nmiculinic commented 4 years ago

I don't want to couple this to Accounts (or Tenants/Providers) (yet), because I want the catalog management to be completely independent of the cross-cluster federation part, as we have a few use cases that only require one or the other.

Ok. I guess Account (#211, #224) could create those required annotations on the namespace and stay loosely coupled.

That was indeed a very short-lived idea I discussed with @jiachengxu for our webhooks, but object creation might be denied by webhooks later in the call-stack, so introducing any side effects into them is very dangerous and error-prone.

True... sounds icky to debug when (not if) things go wrong.

In my PR I am now coupling the existence of those ServiceClusterAssignment objects to the lifecycle of the Namespace in the master cluster. If this namespace is deleted, we will clean up the SCAs for it (and everything else within this namespace).

Hmmm, the ServiceClusterAssignment cost, as well as the Namespace cost, shouldn't really be too high, should it? It's just an object in etcd storage. Thus technically we could just have a single annotation on a namespace marking its requirement for cross-cluster propagation.

Later on, if we decide the performance costs are increasing, we could do some further performance improvements:


To summarize, I agree with the ServiceClusterAssignment being tied to the namespace in the master cluster, and would prefer having a single annotation on the master cluster's namespace and being done with it. In the end, O(|tenant|) namespaces will be created on each service cluster, and in total O(|tenant| * |service cluster|) ServiceClusterAssignments will be created in the master cluster.

Scaling-wise that's not ideal... though we could move to a lazy ServiceClusterAssignment approach down the line instead of introducing ops configuration options from the get-go. Keeping it simple for the first release?

nmiculinic commented 4 years ago

Btw, now that we're on the ServiceClusterAssignment refactoring topic... what about creating the namespace in the service cluster with the same name as the one in the master cluster, akin to https://docs.google.com/document/d/1hFtp8X7dzVS-JbfA5xuPvI_DNISctEbJSorFnY-nz6o/edit#heading=h.u7jfy9wqpd2b ?

In the master cluster, they would have names like account-2kl21j, and it's unlikely they would interfere. A pro is easier cross-cluster debugging, while a con is possible namespace collisions. What do you think?

jiachengxu commented 4 years ago

I agree that we can tie the ServiceClusterAssignment to the namespace in the service cluster. Also, as mentioned by @thetechnick, creating the ServiceClusterAssignment in the catalog controller would just tie the federation and catalog together again.

what about creating the namespace in the service cluster with the same name as the one in the master cluster

I think this is a good approach, because when we try to debug a cross-cluster problem, we don't need to check the ServiceClusterAssignment to see the namespace mapping between the service cluster and the master cluster; we can just directly check the namespace with the same name.