GoogleCloudPlatform / k8s-multicluster-ingress

kubemci: Command line tool to configure L7 load balancers using multiple kubernetes clusters
Apache License 2.0

Allow users to remove a cluster from an existing multicluster ingress #58

Closed nikhiljindal closed 6 years ago

nikhiljindal commented 6 years ago

Scenario: A user creates a multicluster ingress in clusters A, B and C. The user now wants to remove cluster C so that the multicluster ingress is restricted to clusters A and B only. If the user runs the kubemci create command again with a clusters.yaml containing only clusters A and B, the GCLB will be updated correctly to send traffic only to clusters A and B, but the ingress resource will still exist in cluster C.

Current solution: Delete the multicluster ingress altogether (from all clusters) and recreate it in clusters A and B.

This is bad because it causes downtime: the service is unavailable while the multicluster ingress is being deleted and recreated. We need a better solution.

cc @csbell @G-Harmon @madhusudancs @mdelio

nikhiljindal commented 6 years ago

Possible solutions:

1. Add a --old-clusters flag to kubemci create

   When this flag is present, kubemci will also delete the ingress from clusters that are in the --old-clusters list but not in the --clusters list.

2. Add a kubemci remove-clusters command

   We can add a dedicated remove-clusters command to remove clusters from existing multicluster ingresses.

   (+) Guarantees that the only thing that changes is the list of clusters the multicluster ingress is spread across. Nothing else should change.

   (-) Adds complexity for consumers who run kubemci in an automatic, watch-triggered mode. They will need to detect that a cluster was removed from the list and then call kubemci remove-clusters with only the clusters from which the ingress needs to be deleted.
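Both options depend on the same computation: finding the clusters that were in the old list but are missing from the new one. A minimal sketch of that step follows. The helper name `clustersToRemove` and the cluster names are hypothetical, and this is not taken from the kubemci source.

```go
package main

import (
	"fmt"
	"sort"
)

// clustersToRemove returns the names present in oldClusters but absent from
// newClusters, de-duplicated and sorted for stable output.
// Hypothetical helper for illustration; not part of kubemci.
func clustersToRemove(oldClusters, newClusters []string) []string {
	keep := make(map[string]bool, len(newClusters))
	for _, name := range newClusters {
		keep[name] = true
	}

	seen := make(map[string]bool)
	var removed []string
	for _, name := range oldClusters {
		if !keep[name] && !seen[name] {
			seen[name] = true
			removed = append(removed, name)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	// Assumed cluster names matching the A, B, C scenario in this issue.
	oldClusters := []string{"cluster-a", "cluster-b", "cluster-c"}
	newClusters := []string{"cluster-a", "cluster-b"}

	fmt.Println(clustersToRemove(oldClusters, newClusters)) // prints [cluster-c]
}
```

With option 1 the tool would run this comparison itself. With option 2 the caller does the equivalent work before invoking remove-clusters.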

csbell commented 6 years ago

This issue is particularly thorny. The CUJ (critical user journey) is poor with either approach. Can you think of a way to leave breadcrumbs somewhere as part of creating the ingresses, so that we can detect changes to the cluster list without requiring explicit user input? For example, can we somehow fit cluster info in the description field or a label on the global forwarding rule?

nikhiljindal commented 6 years ago

Yes, we already do that for cluster names. We store the cluster names in the description field of the forwarding rule (we use them for the get-status output), so we can detect that the cluster list changed.

But we do not (and do not want to) store credentials for those clusters anywhere. Unless we store credentials, we will need user input.
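To make the limitation concrete, here is a rough sketch of detecting dropped clusters from names recorded in a description field. The JSON layout, the `ruleDescription` type and the `droppedClusters` helper are all assumptions made for illustration; the actual format kubemci writes to the forwarding rule is not shown in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ruleDescription is an assumed layout for the forwarding rule's description
// field. It holds cluster names only, never credentials.
type ruleDescription struct {
	Clusters []string `json:"clusters"`
}

// droppedClusters parses the stored description and returns the recorded
// cluster names that are missing from the requested list.
// Hypothetical helper for illustration; not part of kubemci.
func droppedClusters(description string, requested []string) ([]string, error) {
	var desc ruleDescription
	if err := json.Unmarshal([]byte(description), &desc); err != nil {
		return nil, fmt.Errorf("parsing description: %w", err)
	}

	want := make(map[string]bool, len(requested))
	for _, name := range requested {
		want[name] = true
	}

	var dropped []string
	for _, name := range desc.Clusters {
		if !want[name] {
			dropped = append(dropped, name)
		}
	}
	return dropped, nil
}

func main() {
	// Simulate what might have been recorded when the ingress was created.
	stored, err := json.Marshal(ruleDescription{
		Clusters: []string{"cluster-a", "cluster-b", "cluster-c"},
	})
	if err != nil {
		panic(err)
	}

	dropped, err := droppedClusters(string(stored), []string{"cluster-a", "cluster-b"})
	if err != nil {
		panic(err)
	}
	fmt.Println(dropped) // prints [cluster-c]
}
```

The result is only a list of names. Deleting the ingress from those clusters still needs credentials, which is why user input remains necessary.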

madhusudancs commented 6 years ago

There are challenges in storing cluster info and using it later, especially in kubemci's current form. Currently the tool uses the local kubecontext to derive cluster information. These contexts are local to the machine where the create command is run, and a different user might have different context and cluster names. There is no easy way to uniquely identify a cluster from the client side across machines. This problem is not unique to multi-user scenarios: even a single user can have different context/cluster names on different machines, or can rename contexts/clusters on the same machine. If we want to associate LB resources with cluster info, we need either something akin to cluster identity or a centralized store, such as a cluster registry, where clusters can be uniquely identified.

G-Harmon commented 6 years ago

Forgive the newbie question: what happens today if you run "kubemci create --force" with a pruned cluster list? Why does that cause downtime?

nikhiljindal commented 6 years ago

> Forgive the newbie question: what happens today if you run "kubemci create --force" with a pruned cluster list? Why does that cause downtime?

It does not cause downtime, but the ingress will not be deleted from the pruned cluster. kubemci simply ignores that cluster, so if the ingress was there before, it stays there. All GCP resources are updated, though, so that no traffic is sent to the pruned cluster.

nikhiljindal commented 6 years ago

cc @mdelio Can you chime in on what you think we should do to provide a good user experience?

mdelio commented 6 years ago

I like the remove-clusters approach; it's most intuitive. A few thoughts from offline conversation with @nikhiljindal:

nikhiljindal commented 6 years ago

https://github.com/GoogleCloudPlatform/k8s-multicluster-ingress/pull/146 is adding the remove-clusters command

nikhiljindal commented 6 years ago

This is now fixed.