kubernetes-retired / cluster-api-provider-nested

Cluster API Provider for Nested Clusters
Apache License 2.0
301 stars 67 forks source link

Proposal: define the CRD of nested components #11

Closed charleszheng44 closed 3 years ago

charleszheng44 commented 3 years ago
  1. Define the CRD of the nested components
  2. Define a standard way of creating nested components
charleszheng44 commented 3 years ago

/resolve #7

brightzheng100 commented 3 years ago

It'd be perfect if we could illustrate where the CRs and instances are created/placed, especially under the context of multiple tenant / nested clusters are supported, in the architectural diagrams.

charleszheng44 commented 3 years ago

@brightzheng100 thanks for the suggestion. I will edit the proposal and explain where each CR will be placed.

In short: The NCPT is a cluster scope CRD, the NCP will be placed in the namespace specified by the end-users, and the Nested CRs will located in the clusternamespace created by the NCP controller.

christopherhein commented 3 years ago

Original message:

In short: The NCPT is a cluster scope CRD, the NCP will be placed in the namespace specified by the end-users, and the Nested CRs will located in the clusternamespace created by the NCP controller.

See my suggestions as well, I wonder if we could implement this part slightly different for NCPs vs how it was done in VC.

christopherhein commented 3 years ago

@Fei-Guo did you mean to edit my message? 😛

christopherhein commented 3 years ago

From @Fei-Guo with regards to my message about NCPT

See my suggestions as well, I wonder if we could implement this part slightly different for NCPs vs how it was done in VC.

If we plan to let user create CRs by himself, it is likely he has to to deal with setting proper owner references for all the CRs. Let NCP controller handle CR creations will ease user.

Anyway, if we design a mechanism to identify all the associated components, we may be able to workaround the owner reference limitation. All we need is that each component has a way to find other components.

christopherhein commented 3 years ago

If we plan to let user create CRs by himself, it is likely he has to to deal with setting proper owner references for all the CRs. Let NCP controller handle CR creations will ease user.

Totally, I think we need to have some way, like using a controlPlaneRef on each CR component https://github.com/kubernetes-sigs/cluster-api-provider-nested/pull/11/files/4b6d8aea98ed04eea543f2e29319567d4bab5207#diff-1002f0da6945eba89a0115988309b5629a0466ae80661df7859daf855f6ea9f1R277 which ties it back to the owning NCP. This would be easy for a user to setup. I agree on ease of use, but concerned about making something a central orchestrator as the NCP when that is probably better suited for higher-level abstractions.

I'm worried if we add too much automation upfront then we'll have a difficult time rolling it back, it would be relatively easy to eventually add "creating the CRs" in a NestedControlPlaneTemplate controller later down the line to add that same type of higher-level abstractions back into CAPN.

Fei-Guo commented 3 years ago

Yes. ControllerplanRef works which is exactly in the proposal here. How to find associated CRs from NCP? We propose to add objectRef in NCP status. Do you agree or have some other ideas?

christopherhein commented 3 years ago

Yes. ControllerplanRef works which is exactly in the proposal here. How to find associated CRs from NCP? We propose to add objectRef in NCP status. Do you agree or have some other ideas?

We could use metadata.OwnerReferences and in the NCP controller configure a custom cache index for each of the CRs to make the look up fast. this is predicated on https://github.com/kubernetes-sigs/cluster-api-provider-nested/pull/11#discussion_r521671865

charleszheng44 commented 3 years ago

Yes. ControllerplanRef works which is exactly in the proposal here. How to find associated CRs from NCP? We propose to add objectRef in NCP status. Do you agree or have some other ideas?

We could use metadata.OwnerReferences and in the NCP controller configure a custom cache index for each of the CRs to make the look up fast. this is predicated on #11 (comment)

I think the problem is how can one CR controller get another CR, for example, KAS controller may wanna to get the etcd address that is stored in the Etcd CR. We could use the metadata.OwnerReference to get the NCP CR but I am not sure how can we use a custom cache index in the NCP controller to help the KAS controller to get the corresponding Etcd CR. But we can use the Selector field in the NCP status to retrieve corresponding CRs as mentiond in the previous comment

vincepri commented 3 years ago

@christopherhein @Fei-Guo @charleszheng44 Are there concrete use cases that we have today for having multiple CR references inside linked to the main resource? It seems from prior conversations (been catching up today) that all of these resources are effectively linked together and we might want to reconsider having a pluggable abstraction layer until later on.

In other words, consider starting from a (mild) opinionated approach that doesn't allow pluggable controllers, we could always get there with time and as use cases occur, but for now it seems we can iterate an learn by having a simplified model.

What do you folks think?

christopherhein commented 3 years ago

@christopherhein @Fei-Guo @charleszheng44 Are there concrete use cases that we have today for having multiple CR references inside linked to the main resource? It seems from prior conversations (been catching up today) that all of these resources are effectively linked together and we might want to reconsider having a pluggable abstraction layer until later on.

In other words, consider starting from a (mild) opinionated approach that doesn't allow pluggable controllers, we could always get there with time and as use cases occur, but for now it seems we can iterate an learn by having a simplified model.

What do you folks think?

The thought process with individual controllers for each component is it allows us to build more stable upgrade flows for nested control planes. If we have each component use standard labels and statuses we can have etcd staying stable at one version while the apiserver or controller manager is being in-place upgraded through standard kubernetes mechanisms. What you are referencing is much like what we have with ClusterVersion's in today's VC which starts to break down as you need to have more customized control planes and upgrades, for example how we have the ability to change flags on a per control plane basis and my big use case for "pluggability" (which is sort of a hack in how I'd like to see it) relies on someone else building their own controller that supports Nested* and just turns off the in-tree controler this is to support the prior art for these components which I don't think CAPN needs to fully own, specifically etcd.

I agree to start smaller and opinionated is better upfront and the in-tree providers should be able to do this just requiring the user to create those CRs when bringing up the cluster eg. Create NC, NCP, NEtcd, NKAS and NCM and just reference back to the owning objects, for the most part, those should be nearly blank unless you need to override something specific like adding a FeatureGate.

WDYT? Would this cause concerns?

vincepri commented 3 years ago

This all makes sense, thanks for providing some more color.

Create NC, NCP, NEtcd, NKAS and NCM and just reference back to the owning objects, for the most part, those should be nearly blank unless you need to override something specific like adding a FeatureGate.

If the references are left blank, we could have the main controller to default-create the in-tree controllers on its own, further simplifying the UX. We could also make sure of controller owner references to signal that these CR are managed by the main controller

christopherhein commented 3 years ago

This all makes sense, thanks for providing some more color.

Create NC, NCP, NEtcd, NKAS and NCM and just reference back to the owning objects, for the most part, those should be nearly blank unless you need to override something specific like adding a FeatureGate.

If the references are left blank, we could have the main controller to default-create the in-tree controllers on its own, further simplifying the UX. We could also make sure of controller owner references to signal that these CR are managed by the main controller

Yea, we discussed this a little bit on the last call. I agree it could be created up-front by the NCP although we'd need a way to reconcile since in-theory we'd have someone kubectl apply -f ./ the list of resources and it might create NCP before NEtcd since we don't have a "stack-like" object so if NCP created NEtcd and then the local NEtcd CR was applied it could fail to be created because of conflicts, maybe SSA would help.

My initial thought was since we eventually plan to bring back NCPT's we could use that as the higher-level orchestration/abstraction and it could create those CRs and the control plane namespace to give the same virtualcluster/clusterversion style experience but doesn't force us to manage those flows in the low-level component controllers.

Have any suggestions, is there prior implementations doing something like this we could use as a reference?

vincepri commented 3 years ago

Yea, we discussed this a little bit on the last call. I agree it could be created up-front by the NCP although we'd need a way to reconcile since in-theory we'd have someone kubectl apply -f ./ the list of resources and it might create NCP before NEtcd since we don't have a "stack-like" object so if NCP created NEtcd and then the local NEtcd CR was applied it could fail to be created because of conflicts, maybe SSA would help.

I might be missing something (also getting lost with all the acronyms :)), if a user kubectl apply a bunch of yaml, and they have a custom provider for NestedEtcd, I'd expect them to setup the ObjectReference in NCP correctly? Otherwise, as you said, the NCP controller is going to default it and create one on its own

christopherhein commented 3 years ago

I might be missing something (also getting lost with all the acronyms :)), if a user kubectl apply a bunch of yaml, and they have a custom provider for NestedEtcd, I'd expect them to setup the ObjectReference in NCP correctly? Otherwise, as you said, the NCP controller is going to default it and create one on its own

Doh' sorry I'm going off the expectation that there wasn't a bidirectional references between NCP to N* components, you are right that would solve the problem, I was going off the expectation that the NCP didn't have those and the component CRs were the only thing referencing back to an NCP.

If NCP has:

    // EtcdProvider specifies which EtcdProvider will be used to create the Etcd
    // +optional
    EtcdProvider EtcdProvider `json:"etcdProvider,omitempty"`

    // ApiServerProvider specifies which KASProvider will be used to create the kube-apiserver
    // +optional
    ApiServerProvider KASProvider `json:"apiServerProvider,omitempty"`

    // ControllerManagerProvider specifies which KCMProvider will be used to create the kube-controller-manager
    // +optional
    ControllerManagerProvider KCMProvider `json:"controllerManagerProvider,omitempty"`

and each Nested Component CR has an OwnerReference that would suffice. Sorry for the confusion.

christopherhein commented 3 years ago

Once these last things are cleared up can you also squash the commits and we can get this merged and broken down to individual issues.

charleszheng44 commented 3 years ago

Once these last things are cleared up can you also squash the commits and we can get this merged and broken down to individual issues.

Yes, just squashed the commits and create a new PR for moving terms to glossary #17 .

christopherhein commented 3 years ago

Nice work @charleszheng44, I'm good to go on this.

/approve

christopherhein commented 3 years ago

I would like someone else to give it lgtm. Just to have more consensus before.

christopherhein commented 3 years ago

/kind design

christopherhein commented 3 years ago

/priority important-soon

k8s-ci-robot commented 3 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: charleszheng44, christopherhein

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/master/OWNERS)~~ [christopherhein] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment