antrea-io / antrea

Kubernetes networking based on Open vSwitch
https://antrea.io
Apache License 2.0
1.66k stars 368 forks

Multi-cluster antctl command refactor #3921

Closed hjiajing closed 2 years ago

hjiajing commented 2 years ago

Describe the problem/challenge you have

In Antrea v1.7.0, antctl supports the antctl mc deploy command to deploy the Deployment and define the CRDs in a leader or member cluster. But the Deployment alone is not enough; we still need to define the ClusterClaim and ClusterSet to set up Antrea Multi-cluster. So there are at least 3 steps to build an Antrea Multi-cluster.

Ideally, we want to set up an Antrea Multi-cluster with one command, and the improvement of the ClusterSet and ClusterClaim names makes this easier to do.

Describe the solution you'd like

In the latest user guide, the names of the ClusterClaims are "id.k8s.io" and "clusterset.k8s.io". This means that a logical cluster is exclusive to a Namespace: we cannot deploy two ClusterSets in one Namespace. So once we want to deploy an Antrea Multi-cluster to a Namespace, we just need to create one ClusterSet and two ClusterClaims directly, and all we care about is the Namespace and the version to deploy.

So, maybe we can introduce a new subcommand antctl mc init to initialize an Antrea Multi-cluster easily. The command looks like this:

 antctl mc init --antrea-version <ANTREA_VERSION> --kubeconfig-dir <KUBECONFIG_DIR> --leader-cluster <LEADER_CLUSTER_CONFIG>

The antrea-version argument is the version to deploy, and kubeconfig-dir is the directory containing the kubeconfig files. The leader-cluster argument specifies which kubeconfig belongs to the leader cluster. For the names of the clusters and the resources, we just follow the names of the kubeconfig files. For example, if there are three kubeconfigs in the directory, "west", "east", and "north", and the leader-cluster is "north", then the member clusters will be named "west" and "east", and the multi-cluster resources will be named "member-east-access-rolebinding", "member-west-access-rolebinding", and so on.
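As a rough sketch of this naming convention (the Go helpers below are illustrative, not actual antctl code), deriving member names and per-member resource names could look like:

```go
package main

import (
	"fmt"
	"sort"
)

// memberClusters derives member cluster names from the kubeconfig file
// names in --kubeconfig-dir: every name except the leader is a member.
func memberClusters(kubeconfigs []string, leader string) []string {
	members := make([]string, 0, len(kubeconfigs))
	for _, name := range kubeconfigs {
		if name != leader {
			members = append(members, name)
		}
	}
	sort.Strings(members)
	return members
}

// accessRoleBindingName follows the "member-<CLUSTER_NAME>-access-rolebinding"
// convention described above.
func accessRoleBindingName(member string) string {
	return fmt.Sprintf("member-%s-access-rolebinding", member)
}

func main() {
	for _, m := range memberClusters([]string{"west", "east", "north"}, "north") {
		fmt.Println(accessRoleBindingName(m))
	}
	// Prints:
	// member-east-access-rolebinding
	// member-west-access-rolebinding
}
```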

In this way, we could set up an Antrea Multi-cluster more easily, with fewer arguments, as in the following output:

~/C/c/kubernetes# antctl mc init --leader-cluster test-1 --kubeconfig-dir=./kubeconfigs
Deploying Antrea Multi-cluster to the leader cluster test-1
CustomResourceDefinition/clusterclaims.multicluster.crd.antrea.io created
CustomResourceDefinition/clustersets.multicluster.crd.antrea.io created
CustomResourceDefinition/memberclusterannounces.multicluster.crd.antrea.io created
CustomResourceDefinition/resourceexports.multicluster.crd.antrea.io created
CustomResourceDefinition/resourceimports.multicluster.crd.antrea.io created
ServiceAccount/antrea-mc-controller created
ServiceAccount/antrea-mc-member-access-sa created
Role/antrea-mc-controller-role created
Role/antrea-mc-member-cluster-role created
ClusterRole/antrea-multicluster-antrea-mc-controller-webhook-role created
RoleBinding/antrea-mc-controller-rolebinding created
RoleBinding/antrea-mc-member-cluster-rolebinding created
ClusterRoleBinding/antrea-multicluster-antrea-mc-controller-webhook-rolebinding created
ConfigMap/antrea-mc-controller-config-t2b9525b7f created
Service/antrea-mc-webhook-service created
Deployment/antrea-mc-controller created
MutatingWebhookConfiguration/antrea-multicluster-antrea-mc-mutating-webhook-configuration created
ValidatingWebhookConfiguration/antrea-multicluster-antrea-mc-validating-webhook-configuration created
waiting for Antrea Multicluster Deployment rollout
Antrea Multicluster controller is running
Antrea Multi-cluster Controller deployed in cluster test-1
Deploying Antrea Multi-cluster to member cluster test-2
CustomResourceDefinition/clusterclaims.multicluster.crd.antrea.io created
CustomResourceDefinition/clusterinfoimports.multicluster.crd.antrea.io created
CustomResourceDefinition/clustersets.multicluster.crd.antrea.io created
CustomResourceDefinition/gateways.multicluster.crd.antrea.io created
CustomResourceDefinition/serviceexports.multicluster.x-k8s.io created
CustomResourceDefinition/serviceimports.multicluster.x-k8s.io created
ServiceAccount/antrea-mc-controller created
ClusterRole/antrea-mc-controller-role created
ClusterRoleBinding/antrea-mc-controller-rolebinding created
ConfigMap/antrea-mc-controller-config-t2b9525b7f created
Service/antrea-mc-webhook-service created
Deployment/antrea-mc-controller created
MutatingWebhookConfiguration/antrea-mc-mutating-webhook-configuration created
ValidatingWebhookConfiguration/antrea-mc-validating-webhook-configuration created
waiting for Antrea Multicluster Deployment rollout
Antrea Multicluster controller is running
Antrea Multi-cluster Controller deployed in cluster test-2
Deploying Antrea Multi-cluster to member cluster test-3
CustomResourceDefinition/clusterclaims.multicluster.crd.antrea.io created
CustomResourceDefinition/clusterinfoimports.multicluster.crd.antrea.io created
CustomResourceDefinition/clustersets.multicluster.crd.antrea.io created
CustomResourceDefinition/gateways.multicluster.crd.antrea.io created
CustomResourceDefinition/serviceexports.multicluster.x-k8s.io created
CustomResourceDefinition/serviceimports.multicluster.x-k8s.io created
ServiceAccount/antrea-mc-controller created
ClusterRole/antrea-mc-controller-role created
ClusterRoleBinding/antrea-mc-controller-rolebinding created
ConfigMap/antrea-mc-controller-config-t2b9525b7f created
Service/antrea-mc-webhook-service created
Deployment/antrea-mc-controller created
MutatingWebhookConfiguration/antrea-mc-mutating-webhook-configuration created
ValidatingWebhookConfiguration/antrea-mc-validating-webhook-configuration created
waiting for Antrea Multicluster Deployment rollout
Antrea Multicluster controller is running
Antrea Multi-cluster Controller deployed in cluster test-3
Creating access token for member cluster test-2
Creating ServiceAccount "member-test-2-access-sa" in leader cluster
ServiceAccount "member-test-2-access-sa" created
Creating RoleBinding "member-test-2-rolebinding" in leader cluster
RoleBinding "member-test-2-rolebinding" created
Creating Secret "member-test-2-access-token" in leader cluster
Secret "member-test-2-access-token" created
Creating access-token in member cluster test-2
Creating access token for member cluster test-3
Creating ServiceAccount "member-test-3-access-sa" in leader cluster
ServiceAccount "member-test-3-access-sa" created
Creating RoleBinding "member-test-3-rolebinding" in leader cluster
RoleBinding "member-test-3-rolebinding" created
Creating Secret "member-test-3-access-token" in leader cluster
Secret "member-test-3-access-token" created
Creating access-token in member cluster test-3
Secret created in the leader and member clusters
Creating ClusterSet in member cluster test-2
ClusterClaim "id.k8s.io" created
ClusterClaim "clusterset.k8s.io" created
ClusterSet "test-2" created
ClusterClaim and ClusterSet in cluster test-2 deployed
Creating ClusterSet in member cluster test-3
ClusterClaim "id.k8s.io" created
ClusterClaim "clusterset.k8s.io" created
ClusterSet "test-3" created
ClusterClaim and ClusterSet in cluster test-3 deployed
Creating ClusterSet in leader cluster test-1
ClusterClaim "id.k8s.io" created
ClusterClaim "clusterset.k8s.io" created
ClusterSet "test-1" created
ClusterClaim and ClusterSet in cluster test-1 deployed

Anything else you would like to add?

hjiajing commented 2 years ago

@luolanzone @jianjuns Would you please help to review? Thanks.

jianjuns commented 2 years ago

I did not get how one antctl mc init --leader-cluster test-1 --kubeconfig-dir=./kubeconfigs command can deploy MC Controller in the leader and all member clusters?

I was thinking about 3 commands (consistent with the steps in our quick-start guide):

  1. deploy to deploy the MC Controller - in either leader or member.
  2. create clusterset to create the ClusterSet and ClusterClaim - in either leader or member; and optionally create a shared SA - in leader only.
  3. add member to add a member cluster (if we want, we can support multiple members too) to a ClusterSet, and optionally create a per-member SA - in leader only.

Basically, extend the existing commands and combine some operations.

luolanzone commented 2 years ago

@hjiajing I feel it would be error prone to let antctl mc init do all deployments. Have you checked how to handle the error case if one or two deployments fail? I am thinking we may follow the quick-start guide as @jianjuns suggested. I feel you can provide a config template and a script that calls antctl mc for users to set up the ClusterSet.

a config template might be like below:

clusterset="<clustersetid>"
leader="<leaderid>"
leader_server="http://127.0.0.1:6443"
leader_kubeconfig_path="/user/foo/.kube/leader"
member1="<memberid>"
member1_kubeconfig_path="/user/foo/.kube/member1"
member2="<memberid>"
member2_kubeconfig_path="/user/foo/.kube/member2"
...

and a script to call antctl mc with the above configs; maybe we can provide a few options to allow users to skip the member/leader deployment.

@jianjuns any suggestion for this proposal? thanks.

hjiajing commented 2 years ago

@jianjuns The command will deploy the MC Controllers to all clusters because it creates a client for each cluster. The argument --leader-cluster specifies which cluster is the leader (to deploy the leader-cluster manifests); the others are member clusters. So antctl mc init will apply the YAML files to all clusters in a loop.
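The loop could be sketched like this (helper names and manifest file names are assumptions for illustration, not the actual antctl implementation):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// deployFunc stands in for building a client from a kubeconfig and
// applying a manifest to that cluster.
type deployFunc func(cluster, manifest string) error

// deployAll applies the leader manifest to the leader cluster and the
// member manifest to every other cluster, in one loop.
func deployAll(kubeconfigDir, leader string, clusters []string, deploy deployFunc) error {
	for _, c := range clusters {
		manifest := "antrea-multicluster-member.yml" // assumed file name
		if c == leader {
			manifest = "antrea-multicluster-leader.yml" // assumed file name
		}
		kubeconfig := filepath.Join(kubeconfigDir, c)
		_ = kubeconfig // a real implementation would build a client from this path
		if err := deploy(c, manifest); err != nil {
			return fmt.Errorf("deploy to cluster %s failed: %w", c, err)
		}
	}
	return nil
}

func main() {
	err := deployAll("./kubeconfigs", "test-1", []string{"test-1", "test-2", "test-3"},
		func(cluster, manifest string) error {
			fmt.Printf("applying %s to %s\n", manifest, cluster)
			return nil
		})
	if err != nil {
		panic(err)
	}
}
```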

hjiajing commented 2 years ago

@luolanzone I'm trying to set up an Antrea Multi-cluster with as few parameters as I can; the fewer the arguments, the easier it is to use.
In Antrea Multi-cluster, the ServiceAccount, RoleBinding, and Secret will only be used in multi-cluster cases, so maybe we can just create them in a fixed format, such as "member-{CLUSTER_NAME}-access-sa", and likewise for the ClusterClaim and the ClusterSet. In this way, we can reduce the number of parameters.

hjiajing commented 2 years ago

@luolanzone I have thought about deploying the Deployment by scripts. But there is a problem: users must obtain the scripts to deploy Antrea Multi-cluster. Once a new version is released, users will just get the antctl binary from the release list. And the deployment of Antrea Multi-cluster is like a transaction, so maybe using one command is more reasonable?

jianjuns commented 2 years ago

Are you saying users must put all kubeconfig files in one directory?

I would first build the basic primitives that hide internal details (e.g. the ClusterSet and ClusterClaim CRDs, RBAC) and automate the relevant sub-steps, but still provide enough flexibility for users to do what they want, e.g.: connect to different clusters from different hosts with different config files (rather than requiring a single host and a shared config dir to access all clusters); add/delete clusters on demand (rather than requiring all clusters to be added in one step); allow a shared or dedicated SA/token; and other customization.

That is why I am proposing 3 commands/steps, following the quick-start guide, which in my mind is easier to follow and still allows customization and on-demand changes.

If we want, we can also add an all-in-one command or script as Lan suggested, on top of the basic primitives, with reasonable restrictions (e.g. I am not sure whether requiring all kubeconfigs in one dir is good or not). But in my mind it is not very important.

jianjuns commented 2 years ago

In general, I feel there is no issue for users to understand: deploy controller, create clusterset, add member. It is no different from bootstrapping any other clustering system (e.g. it is like how kubeadm sets up a K8s cluster). Users can build their own further automation with these primitives, and in many cases they know how to do that (rather than using our all-in-one implementation).

One thing we can simplify is to let each member join the ClusterSet itself, rather than requiring an "add member" step in the leader. @luolanzone: could you think about that?

luolanzone commented 2 years ago

@jianjuns regarding "let each member join the ClusterSet itself, rather than requiring an 'add member' step in the leader": do you mean to skip the step of updating the member list in the ClusterSet in the leader when a new member comes?

jianjuns commented 2 years ago

Yes, could we let the leader mc-controller automatically update that? Why does a manual step help?
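One way the leader mc-controller could do this automatically (the types here are simplified stand-ins for illustration, not the real multicluster API):

```go
package main

import "fmt"

// ClusterSet is a simplified stand-in for the member list the leader's
// ClusterSet carries today (hypothetical type, not the real CRD).
type ClusterSet struct {
	Members []string
}

// reconcileAnnounce sketches how the leader mc-controller could add a
// member automatically when it sees a MemberClusterAnnounce, removing
// the manual "add member" step. It returns true if the list changed.
func reconcileAnnounce(cs *ClusterSet, announcedID string) bool {
	for _, m := range cs.Members {
		if m == announcedID {
			return false // already a member, nothing to do
		}
	}
	cs.Members = append(cs.Members, announcedID)
	return true
}

func main() {
	cs := &ClusterSet{Members: []string{"test-2"}}
	fmt.Println(reconcileAnnounce(cs, "test-3"), cs.Members) // adds test-3
	fmt.Println(reconcileAnnounce(cs, "test-3"), cs.Members) // idempotent
}
```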

luolanzone commented 2 years ago

Yeah, I agree with you, I will check this part to see how to refine it.

hjiajing commented 2 years ago

@jianjuns After the above discussion, I changed my design and wrote some draft code. The reason I want to combine the three steps (antctl mc create, antctl mc deploy...) into one is that the user will not just create a Secret; they will continue to create controllers and other resources after creating the Secret. A single Secret doesn't make much sense on its own, and the same goes for the ServiceAccount and other resources. These three commands are a whole, and users will hardly use only one of them.

So, I think maybe we can combine these three steps into one to let users set up Antrea Multi-cluster more easily. Maybe we can use antctl mc init to quickly create a standardized Antrea Multi-cluster with a standard Namespace, standard ServiceAccount, and standard cluster names. Users would create an Antrea Multi-cluster in one step like this:

antctl mc init --kubeconfig-dir <PATH> --leader-cluster <NAME>

More customized arguments would also be optional for users. Maybe we can add a new argument --config for the config file's path (like Lan's suggestion), and the user can customize the details in the configuration file. The config file template would look like this:

kind: Antrea-multicluster
version: antrea-mc.x-k8s.io/v1alpha1
kubeconfigs:
    <cluster-name>: <kubeconfig-path>
    <cluster-name>: <kubeconfig-path>
secrets:
    <cluster-name>: <secret>
    <cluster-name>: <secret>
namespace:
    <cluster-name>: <namespace>
    <cluster-name>: <namespace>
.....

The users could use the following command to set up an Antrea Multicluster.

antctl mc init --config <PATH>
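A sketch of how antctl could model and validate such a config file before touching any cluster (field names are illustrative; the real schema was still under discussion):

```go
package main

import (
	"errors"
	"fmt"
)

// InitConfig mirrors the proposed --config template above (hypothetical
// field names, not a committed schema).
type InitConfig struct {
	Kind        string
	Version     string
	Leader      string
	Kubeconfigs map[string]string // cluster name -> kubeconfig path
	Namespaces  map[string]string // cluster name -> namespace
}

// validate sketches the pre-flight checks antctl mc init --config could
// run before creating any resources.
func (c *InitConfig) validate() error {
	if c.Kind != "Antrea-multicluster" {
		return fmt.Errorf("unexpected kind %q", c.Kind)
	}
	if _, ok := c.Kubeconfigs[c.Leader]; !ok {
		return errors.New("leader cluster has no kubeconfig entry")
	}
	return nil
}

func main() {
	cfg := &InitConfig{
		Kind:    "Antrea-multicluster",
		Version: "antrea-mc.x-k8s.io/v1alpha1",
		Leader:  "test-1",
		Kubeconfigs: map[string]string{
			"test-1": "/user/foo/.kube/leader",
			"test-2": "/user/foo/.kube/member1",
		},
	}
	fmt.Println(cfg.validate()) // <nil>
}
```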

As for the other commands, like antctl mc add, antctl mc deploy, and antctl mc create: maybe we can keep the add command and replace the others with antctl mc init.

The antctl mc create command is too complicated because of its many parameters. Maybe replacing them with a short command plus a config file is more convenient.

jianjuns commented 2 years ago

A single Secret doesn't make much sense, as well as ServiceAccount and other resources.

What do you mean by "a single Secret does not make sense"? In many cases, a shared Secret works well. It is just like how "kubeadm init" generates a single token for all Nodes to join the cluster.

"These three commands are a whole, and users will hardly use only one of them."

I do not fully agree. At least, I would not require a single host and config directory to access all clusters, which is not even possible in some cases. So it means we at least need to run the commands in the leader and member clusters separately.

But I do want to reduce the steps, and suggested removing the step of adding a member cluster in the leader cluster, allowing a member cluster to self-join a ClusterSet instead. So we just need two commands:

  1. "deploy" to deploy the MC controllers;
  2. "create clusterset" (or "init" in the leader and "join" in a member, as kubeadm does) to create/join a clusterset. For the leader, the command can generate a shared SA/token for members to join the clusterset.

Technically we can merge the two commands into one "init" command as you suggested, but I personally feel it is quite clean to separate deployment and ClusterSet creation, and that is also easier to implement and use (e.g. to handle failure and rollback, to avoid a long argument list, to remove/change the ClusterSet after deployment). And again, I strongly believe we need to run the commands separately in the leader and members, just like most other clustering solutions (e.g. kubeadm).
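The shared token in step 2 could be minted along these lines (an illustrative sketch, not the actual implementation; in practice it would be stored in a Secret in the leader cluster and distributed to member cluster admins):

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newJoinToken sketches how the leader-side "create clusterset" command
// could mint a single shared token for all members to join with, similar
// in spirit to a kubeadm bootstrap token (the format here is illustrative).
func newJoinToken() (string, error) {
	b := make([]byte, 16) // 128 bits of randomness
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	token, err := newJoinToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("shared join token:", token)
}
```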

hjiajing commented 2 years ago

I do not fully agree. At least, I would not require a single host and config directory to access all clusters, which is not even possible in some cases. So it means we at least need to run the commands in the leader and member clusters separately.

I agree with you. The member clusters may be maintained by different users or teams. It's not easy, maybe not even possible, to store the kubeconfigs in one directory or on one host.

So the member cluster has to set itself up. Thanks for Jianjun's suggestion; I think two steps is a good way, and I will run some tests. Also, when creating the resources I still think there are too many arguments, even longer than one line in the terminal. Maybe a config file would be helpful.

jianjuns commented 2 years ago

@hjiajing Just to add, I meant we need MC controller changes to remove the "add member" step in leader. I believe @luolanzone is looking at that.

luolanzone commented 2 years ago

@jianjuns I have synced with @hjiajing regarding the change to remove the "add member" step in the leader; he will work on this part together with the antctl changes.

hjiajing commented 2 years ago

Design details are as follows:

current commands

In our current antctl mc commands, there is a problem: the leader cluster needs to know the member cluster list. Once Antrea Multi-cluster is running, if users want to add a new member cluster, they need to modify the ClusterSet in the leader cluster. So adding a member cluster needs operations in both the leader and the member cluster. But sometimes the member cluster cannot operate the leader cluster.

new commands

To avoid the situations mentioned above, we added 4 commands and removed some commands. The new commands are as follows:

When we set up an Antrea Multi-cluster, there are the following steps:

  1. Initialize the ClusterSet in the leader cluster with antctl mc init.
  2. Join the ClusterSet in the member cluster with antctl mc join, using the token distributed by the leader cluster (the administrator).

When we want to leave the ClusterSet, we can just use antctl mc leave.

These commands decouple the leader and member clusters. We don't need to modify the leader cluster's resources when we add a new member, and we don't need the member list when we create a new leader. This is based on the ClusterSet auto-update in #3956.
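The resulting command set could be dispatched along these lines (an illustrative sketch; the handler descriptions stand in for the real implementations):

```go
package main

import "fmt"

// dispatch sketches the proposed antctl mc subcommand set: init in the
// leader, join/leave in members (hypothetical dispatch, not antctl code).
func dispatch(sub string) (string, error) {
	switch sub {
	case "init":
		return "initialize the ClusterSet in the leader cluster", nil
	case "join":
		return "join the ClusterSet from a member cluster using the distributed token", nil
	case "leave":
		return "leave the ClusterSet from a member cluster", nil
	default:
		return "", fmt.Errorf("unknown subcommand %q", sub)
	}
}

func main() {
	for _, sub := range []string{"init", "join", "leave"} {
		action, _ := dispatch(sub)
		fmt.Printf("antctl mc %s -> %s\n", sub, action)
	}
}
```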