nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.
Apache License 2.0

Declarative cluster targeting #679

Open liamfallon opened 5 months ago

liamfallon commented 5 months ago

Original issue URL: https://github.com/kptdev/kpt/issues/3387
Original issue user: https://github.com/bgrant0607
Original issue created at: 2022-07-19T19:32:42Z
Original issue last updated at: 2022-11-16T02:56:12Z
Original issue body:

We've explored declarative cluster targeting a bunch of times in the past.

The main alternatives are:

  1. Per-resource annotations, as with Config Sync cluster selectors
  2. Targeting in the sync API, as in ArgoCD Application, or other package-level specification

An orthogonal issue is how to abstract the cluster targets and the credentials used to access them. We'd want that abstraction especially in the case of per-resource annotations, as well as for blueprints. Kubeconfig is one option, but we may need to make it pluggable.

Per-resource targeting is flexible, but error-prone. My recommendation in Config Sync has been to use kustomize or kpt to add cluster selectors to groups of resources where they are needed, but generally to minimize their use. In Config Sync, selectors are a rendering-time operation.
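A minimal sketch of what render-time, per-resource selection could look like. The annotation key `example.com/cluster-selector`, the comma-separated selector format, and the helper `render_for_cluster` are hypothetical and chosen only for illustration; they are not Config Sync's actual API.

```python
from typing import Any, Dict, List

# Hypothetical annotation key, used only for illustration.
SELECTOR_ANNOTATION = "example.com/cluster-selector"

Resource = Dict[str, Any]


def render_for_cluster(resources: List[Resource], cluster: str) -> List[Resource]:
    """Return the resources that should be applied to `cluster`.

    Unannotated resources go everywhere; annotated ones only go to the
    clusters named in their comma-separated selector.
    """
    rendered = []
    for res in resources:
        annotations = res.get("metadata", {}).get("annotations") or {}
        selector = annotations.get(SELECTOR_ANNOTATION)
        if selector is None:
            rendered.append(res)  # no selector: applies to every cluster
            continue
        targets = {name.strip() for name in selector.split(",") if name.strip()}
        if cluster in targets:
            rendered.append(res)
    return rendered


resources = [
    {"kind": "Namespace", "metadata": {"name": "upf"}},
    {
        "kind": "ConfigMap",
        "metadata": {
            "name": "edge-tuning",
            "annotations": {SELECTOR_ANNOTATION: "edge-1, edge-2"},
        },
    },
]

edge_names = [r["metadata"]["name"] for r in render_for_cluster(resources, "edge-1")]
core_names = [r["metadata"]["name"] for r in render_for_cluster(resources, "core-1")]
```

The sketch also shows why this style is easy to get wrong: a typo in a selector value silently drops the resource from the cluster it was meant for, with no error at render time.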

We also know that we'll want to support rollout across groups of clusters. That's easier to reason about if the packages themselves do not contain targeting information.
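To illustrate why rollouts are easier to reason about when targeting lives outside the package, here is a small sketch. The `Rollout` shape, the notion of ordered "waves", and the group names are all assumptions made up for this example, not an existing kpt or Config Sync API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Rollout:
    """Targeting kept outside the package (hypothetical shape)."""
    package: str
    waves: List[str]  # cluster group names, in rollout order


def plan_rollout(rollout: Rollout, groups: Dict[str, List[str]]) -> List[List[Tuple[str, str]]]:
    """Expand group names into ordered waves of (package, cluster) pairs.

    A cluster that appears in several groups is only scheduled in the
    first wave that mentions it.
    """
    seen = set()
    plan = []
    for group in rollout.waves:
        wave = []
        for cluster in groups[group]:
            if cluster not in seen:
                seen.add(cluster)
                wave.append((rollout.package, cluster))
        plan.append(wave)
    return plan


groups = {
    "canary": ["us-west-1"],
    "us-west": ["us-west-1", "us-west-2", "us-west-3"],
}
plan = plan_rollout(Rollout(package="upf-blueprint", waves=["canary", "us-west"]), groups)
```

Because the package itself carries no selectors, the same package content is used in every wave, and the rollout order can change without touching it.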

cc @justinsb

Original issue comments:

Comment user: https://github.com/johnbelamaric
Comment created at: 2022-07-19T21:33:20Z
Comment last updated at: 2022-07-19T21:33:20Z
Comment body:

The plan for this in Nephio right now (which is to say, "very early") is that targeting is done via per-cluster repositories, with separate packages for the workloads and for the topology / deployment specs:

One thing that needs to be worked out for this is how the workload packages expose their APIs. The topology controllers shouldn't have to understand the internals of the workload package. So, we want "package as interface", for the human consumers of the package, but there needs to be an advertisement of the automation API available for a given package.

In the Nephio case, there tend to be "types" of workloads, which different vendors can implement. So, for example, a "user plane function" is a particular type of workload that performs a particular function in a telco network. There is a set of common attributes between implementations of that function from different vendors. It is those common pieces that the topology controllers that stamp out variants need to operate on, possibly along with "demographics" about the specific clusters. The vendor-specific pieces are more likely to be configured by humans, or via a "class" style API, where one persona defines the classes and the consumer chooses among them.

What we hope to have in the end is the ability for a user to say something like: "Deploy a VendorX UPF of class 'high bandwidth, medium latency' to support 10,000 subscribers on all clusters in the US West region", and have the controllers be able to determine the necessary inputs from that.
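A rough sketch of how a topology controller might expand such a request into per-cluster variants. Everything here is an assumption for illustration: the `Intent` and `Cluster` shapes, the `region` label, the package naming, and the idea that each cluster's repository is named after the cluster. The subscriber count is passed through unchanged, since how it would be split across clusters is not specified above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Cluster:
    name: str
    labels: Dict[str, str]  # cluster "demographics", e.g. region


@dataclass
class Intent:
    vendor: str
    workload_type: str
    workload_class: str
    subscribers: int
    region: str


def expand_intent(intent: Intent, clusters: List[Cluster]) -> List[Dict[str, Any]]:
    """Produce one package variant per cluster in the requested region."""
    variants = []
    for cluster in clusters:
        if cluster.labels.get("region") != intent.region:
            continue
        variants.append({
            "repository": cluster.name,  # assumes one repo per cluster
            "package": f"{intent.vendor}-{intent.workload_type}".lower(),
            "class": intent.workload_class,
            "subscribers": intent.subscribers,  # passed through, not divided
            "cluster_labels": dict(cluster.labels),
        })
    return variants


clusters = [
    Cluster("edge-sfo", {"region": "us-west"}),
    Cluster("edge-sea", {"region": "us-west"}),
    Cluster("edge-nyc", {"region": "us-east"}),
]
intent = Intent("VendorX", "UPF", "high-bandwidth-medium-latency", 10000, "us-west")
variants = expand_intent(intent, clusters)
```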

Workload packages in this case will need to contain a specific resource that the topology controller can latch on to. That is, we need the API for automation consumption to be well defined and versioned, with compatibility guarantees. But we want to separate out just the minimum of what the automation needs in that API, and allow the rest of the package to be modified by humans (or other automations!).

One thing this implies for package authors is that we need function inputs to be flexible - that is, the automation needs to talk via the custom resource, and then the package-internal function inputs need to consume things from that custom resource. I know we have talked about this a number of times, but I couldn't find an issue besides #3339, which is not quite right.
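A minimal sketch of the "custom resource as automation API" idea. The `automation.example.com/v1alpha1` group/version, the `WorkloadInterface` kind, and the field names are hypothetical. The controller only touches the interface resource, and a package-internal step derives its function inputs from that resource.

```python
import copy
from typing import Any, Dict, List, Optional

Resource = Dict[str, Any]

# Hypothetical versioned interface that a package advertises to automation.
INTERFACE_API_VERSION = "automation.example.com/v1alpha1"
INTERFACE_KIND = "WorkloadInterface"


def find_interface(package: List[Resource]) -> Optional[Resource]:
    """Locate the one resource the topology controller is allowed to edit."""
    for res in package:
        if res.get("apiVersion") == INTERFACE_API_VERSION and res.get("kind") == INTERFACE_KIND:
            return res
    return None


def set_automation_inputs(package: List[Resource], inputs: Dict[str, Any]) -> List[Resource]:
    """Controller side: write inputs into the interface resource only."""
    updated = copy.deepcopy(package)
    iface = find_interface(updated)
    if iface is None:
        raise ValueError("package does not advertise the expected interface")
    iface.setdefault("spec", {}).update(inputs)
    return updated


def derive_function_config(package: List[Resource]) -> Dict[str, Any]:
    """Package side: build function inputs from the interface resource."""
    iface = find_interface(package) or {}
    spec = iface.get("spec", {})
    return {"subscribers": spec.get("subscribers", 0), "region": spec.get("region", "")}


package = [
    {"apiVersion": INTERFACE_API_VERSION, "kind": INTERFACE_KIND, "spec": {}},
    {"apiVersion": "v1", "kind": "ConfigMap", "data": {"vendor-tuning": "hand-edited"}},
]
updated = set_automation_inputs(package, {"subscribers": 10000, "region": "us-west"})
function_config = derive_function_config(updated)
```

The vendor-specific `ConfigMap` is never inspected by the controller, so humans (or other automations) can keep editing it without breaking the automation contract.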

Comment user: https://github.com/bgrant0607
Comment created at: 2022-07-19T21:42:40Z
Comment last updated at: 2022-07-19T21:42:40Z
Comment body:

Composing inputs with blueprints to create specialized deployment packages is covered by #3347.

Orchestration of bulk changes is #3348.