nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.
Apache License 2.0

Package dependencies: expressing ("my package requires") and fulfilling ("my package provides") #653

Open liamfallon opened 2 months ago

liamfallon commented 2 months ago

Original issue URL: https://github.com/kptdev/kpt/issues/3448 Original issue user: https://github.com/johnbelamaric Original issue created at: 2022-08-08T19:47:16Z Original issue last updated at: 2023-02-08T18:24:41Z Original issue body:

### Describe your problem

As we have discussed a few times, packages often have dependencies. They may depend on particular resources existing in the cluster, or they may depend on availability of services running on an endpoint, or may have other types of dependencies. This issue is a spin off of #3351, to provide a single place to capture examples of these dependencies and discuss ways of managing them within kpt. It is not the same as depends-on, which is about apply ordering within a package's resources.

Some examples:

Each of these may be satisfied in different ways, and some of them may need to be satisfied by a user with different privilege levels than the package deployer (e.g., deploying a namespace vs into a namespace, or deploying a CRD vs a CR).

We would like to be able to express the requirement a package has, without the package having an opinion about how the requirement is met. For example, if an application package needs a namespace, that could be provisioned manually, or via deployment of a namespace package, or could be added by the deployer to the application package itself. The application package needs to be able to simply say "I need a namespace", and then the kpt suite of tools needs to identify whether or not that dependency has been satisfied, at the earliest possible time in the package lifecycle (see #3422).

To that end, packages also need a way to say "I satisfy some requirement". This allows the tooling to understand and match the "package requires" and "package provides". However, we need to figure out how that is expressed in each layer of the solution. For example, at runtime this model may be expressed using a "Claim" pattern; it's not clear that works at Config Time.
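As a rough illustration of the "package requires" / "package provides" matching, here is a minimal sketch. The names `Capability`, `Package`, and `unmet_requirements` are hypothetical and not part of kpt or Porch. The sketch only shows the set-matching idea and says nothing about how the declarations would actually be stored in a package.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """Hypothetical descriptor for something a package needs or offers."""
    kind: str  # e.g. "Namespace"
    name: str  # e.g. "my-app"


@dataclass
class Package:
    """Hypothetical package with declared requirements and offerings."""
    name: str
    requires: set[Capability] = field(default_factory=set)
    provides: set[Capability] = field(default_factory=set)


def unmet_requirements(packages: list[Package]) -> dict[str, set[Capability]]:
    """Return, per package, the requirements that no package in the set provides."""
    provided: set[Capability] = set()
    for pkg in packages:
        provided |= pkg.provides
    return {
        pkg.name: pkg.requires - provided
        for pkg in packages
        if pkg.requires - provided
    }


# Example: the app needs a namespace and a CRD; only the namespace is provided.
ns = Capability("Namespace", "my-app")
crd = Capability("CustomResourceDefinition", "widgets.example.com")
app = Package("app", requires={ns, crd})
namespace_pkg = Package("my-app-namespace", provides={ns})

missing = unmet_requirements([app, namespace_pkg])
```

Here `missing` reports that `app` still lacks the CRD. Note that the requiring package never names the package that fulfils the requirement, which is the property asked for above.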

Another thought is how this relates to the idea of "mixins". One might consider something like Environment (#3280) as being expressed as a dependency - the package depends on some Environment package/mixin being deployed. The user would choose which (dev/stage/prod) one to use to resolve that dependency.

Original issue comments: Comment user: https://github.com/johnbelamaric Comment created at: 2022-08-08T19:47:36Z Comment last updated at: 2022-08-08T19:47:36Z Comment body:

@droot @yuwenma @bgrant0607 @justinsb

Comment user: https://github.com/droot Comment created at: 2022-08-08T19:59:46Z Comment last updated at: 2022-08-08T19:59:46Z Comment body:

Thanks for starting this thread @johnbelamaric

> A package needs a value from the runtime instantiation of another package (e.g., a load balancer IP address).

Just a note: This one is different. This is more like a runtime dependency mixed with a value propagating from the runtime state (sounds like apply-time dependency).

Comment user: https://github.com/johnbelamaric Comment created at: 2022-08-08T20:00:56Z Comment last updated at: 2022-08-08T20:00:56Z Comment body:

> Just a note: This one is different. This is more like a runtime dependency mixed with a value propagating from the runtime state (sounds like apply-time dependency).

Yes, I'd like to capture various types and discuss strategies for each.

Comment user: https://github.com/bgrant0607 Comment created at: 2022-08-09T03:20:11Z Comment last updated at: 2022-08-09T03:20:11Z Comment body:

Do we have an example of value propagation that's not an IP address? IP addresses can be reduced to the service case.

Comment user: https://github.com/droot Comment created at: 2022-08-09T19:29:49Z Comment last updated at: 2022-08-09T19:29:49Z Comment body:

Slightly different but a common scenario:

I have an application package that needs a CloudSQL resource, and that resource is satisfied by another infra package that uses KCC (Config Connector) resources. The unique thing about this scenario is that the CloudSQL (infra) package will be deployed in the KCC cluster ("admin cluster" is the common term), while the application package will be deployed in the application workload cluster. So this is a cross-cluster package dependency.

Comment user: https://github.com/johnbelamaric Comment created at: 2022-08-10T19:50:29Z Comment last updated at: 2022-08-11T14:09:38Z Comment body:

> Do we have an example of value propagation that's not an IP address? IP addresses can be reduced to the service case.

Service is indirection, and is a good solution to this value propagation when there is something to resolve the indirection (DNS in the case of Service).

There are examples of other things that have to be allocated out of a common pool. Some network-y things I can think of: ports (e.g., NodePort), VLAN IDs, maybe VRF names or BGP community strings or AS numbers. Some of these may be allocated on a per-cluster basis (NodePort) at runtime, but some may need to exist across clusters or in some other scope (organizational, regional, etc.). Those likely become inputs to multiple packages, and there is no simple protocol for serving them up via indirection.

Other ideas: I can imagine there are applications that (though not a great design) share a common database, for example. An IP address or even IP:Port is not sufficient; they also need a schema name. Those could be automatically generated or allocated.

One question: could we simplify this by requiring those allocations to be represented somehow by a package, so that all dependencies are just package dependencies? Probably not a good idea; it may cause too much package sprawl. Resource-level might be better. Or maybe representing the dependency itself as a resource.

Comment user: https://github.com/johnbelamaric Comment created at: 2022-08-10T23:28:04Z Comment last updated at: 2022-08-10T23:28:04Z Comment body:

> Those likely become inputs to multiple packages, and there is no simple protocol for serving them up via indirection.

While this is true, there actually are ways to deal with this. You can use DNS TXT records (or the Kube API server, for that matter) to make this an indirection. However, applications do not natively understand that - most applications can't even use DNS SRV records (IP and port), much less mapping a TXT into a database connection string of some sort. So, you then need an init-container or runtime controller (in the workload cluster) that can look up what you need via that mechanism and rewrite the config. This can work but is kludgy and requires workload changes to get it to work. I think we can do better.

Comment user: https://github.com/johnbelamaric Comment created at: 2022-08-18T17:04:04Z Comment last updated at: 2022-08-18T17:04:04Z Comment body:

@justinsb showed a concept of "binding" objects that may satisfy some of the dependency use cases. This is a resource within the package that "advertises" itself as a sort of placeholder. In the particular use case Justin discussed, this was a namespace annotated as a "local-config: binding". Since it was a local-config it wouldn't end up in the cluster, but it does serve to let consumers of the package know that they need to somehow provision a Namespace.

In the Nephio controller PoC demo, I used a ClusterScaleProfile CR that lived in the package to advertise that the package was able to accept this type of object from the context and have a function scale it based on the contents of that CR. This is similar to the binding concept, except that it is implicit in the type. It may be better to make it explicit with an annotation; although, in this case, the resource present in the package serves as default values in case the deployment context doesn't have that resource. So we may want a different annotation value, since it's not mandatory that a binding happens.
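To make the mandatory versus optional binding distinction concrete, here is a minimal sketch. It assumes resources are plain dicts and that a placeholder is marked with an annotation. The annotation key, the `optional-binding` value, and the `siteDensity` field are illustrative assumptions, not an agreed kpt or Nephio convention.

```python
from __future__ import annotations

# Assumption: the key and the "binding"/"optional-binding" values are
# illustrative only; the thread mentions just a "local-config: binding" idea.
BINDING_ANNOTATION = "config.kubernetes.io/local-config"


def binding_mode(resource: dict) -> str | None:
    """Return "binding", "optional-binding", or None for ordinary resources."""
    annotations = resource.get("metadata", {}).get("annotations", {})
    value = annotations.get(BINDING_ANNOTATION)
    return value if value in ("binding", "optional-binding") else None


def resolve_bindings(
    package: list[dict], context: list[dict]
) -> tuple[list[dict], list[str]]:
    """Swap placeholders for context resources of the same kind.

    Returns the resolved resources and the kinds that remain unsatisfied.
    """
    by_kind = {res["kind"]: res for res in context}
    resolved: list[dict] = []
    unsatisfied: list[str] = []
    for res in package:
        mode = binding_mode(res)
        if mode is None:
            resolved.append(res)                   # ordinary resource, untouched
        elif res["kind"] in by_kind:
            resolved.append(by_kind[res["kind"]])  # deployment context wins
        elif mode == "optional-binding":
            resolved.append(res)                   # placeholder doubles as defaults
        else:
            unsatisfied.append(res["kind"])        # mandatory binding not met
    return resolved, unsatisfied


package = [
    {"kind": "Namespace",
     "metadata": {"name": "placeholder",
                  "annotations": {BINDING_ANNOTATION: "binding"}}},
    {"kind": "ClusterScaleProfile",
     "metadata": {"name": "default",
                  "annotations": {BINDING_ANNOTATION: "optional-binding"}},
     "spec": {"siteDensity": "low"}},  # hypothetical field
    {"kind": "Deployment", "metadata": {"name": "app"}},
]

# With an empty context the optional placeholder acts as the default,
# while the mandatory Namespace binding is reported as unsatisfied.
resolved, unsatisfied = resolve_bindings(package, context=[])

context = [{"kind": "ClusterScaleProfile",
            "metadata": {"name": "edge"},
            "spec": {"siteDensity": "high"}}]
resolved_ctx, unsatisfied_ctx = resolve_bindings(package, context)
```

Matching purely on `kind` is a simplification; a real implementation would probably need to match on group, version, and perhaps labels as well.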

In that demo, I have a fan-out controller that injects the ClusterScaleProfile based on an association with the target cluster. It doesn't really need to be "cluster" - a more accurate way to describe this is "deployment context". In a discussion on the WI operator, I asked about making that a function rather than a controller. A similar question can be asked of the way I inject ClusterScaleProfile in the Nephio demo - I do it with a controller, but perhaps it could be done automatically by Porch as part of function input gathering (#3396)?

For simplicity, let's assume a 1:1 ratio of deployment repositories:clusters. We could adjust this to handle 1:N or even N:1 or N:M, but I don't think the basic idea changes materially in that case. Here's how it could work in this case:

Notes & Issues:

Comment user: https://github.com/johnbelamaric Comment created at: 2022-09-02T20:46:15Z Comment last updated at: 2022-09-02T20:46:15Z Comment body:

Since we are cataloging dependency-related thoughts here I figured I would include this from @BernardTsai-DT in a Nephio discussion. I thought this was an interesting set of categorizations to think about.

Dependencies for that purpose need to be categorized, e.g.:

  • A is hosted on B: a solution component can only be installed within the context of other solution components (for example a network function in a k8s cluster)
  • A is managed by B: a solution component is managed by another solution component (for example the data and control plane of a network function)
  • A is served by B: a solution component makes use of services provided by another solution component (for example a web API server and a database backend)
  • A is clustered with B, C, ...: here several solution components provide the same service but for availability reasons are clustered (for example a database cluster)

These are the bread-and-butter dependencies which would probably address 80% of the typical dependencies to be considered. For more complex dependencies I would then recommend making use of special controllers which would reflect more complex constraints, e.g. reconfiguration of a 5G network slice which would implicitly require the introduction of a different solution architecture (for example, a customer upgrades their network slice from a standard shared setup to a setup with a dedicated edge UPF).

Comment user: https://github.com/yuwenma Comment created at: 2022-09-29T19:26:49Z Comment last updated at: 2022-09-29T20:11:09Z Comment body:

Some examples from Porch dogfooding (it uses the Porch UI to create a CNRM GCP project, cluster, etc. and install Porch to the new cluster in the other project):

  1. deployment-tolerant requester
    • "environment" package(s) is the "provider"
    • "gkeCluster" package is the "requester"
    • The "gkeCluster" does not require "environment" package to be deployed first. This is unlike the "namespace" provisioning where "namespace" has to be deployed first.
  2. server-side "live" provider
    • "gkeCluster" package is the "provider"
    • "Porch" package is the "requester"
    • The Porch package requests the cluster to be deployed and have some field data updated by the server, specifically the CA Certificate (see dogfooding user guide step 5.8). This basically requests a "live" resource. We may want to discuss whether Porch should support it or not.
  3. "passive" requester
    • higher privilege team provides the "IAM" role package(s)
    • lower privilege team needs the "IAM" to be preset.
    • "IAM" role package(s) are set by people with higher privilege. So it normally goes with the "provider" package and is out of the "requester" package's control, or the "requester" may not even know what IAM role they need. This is a passive "require-provide" scenario. One big challenge is how we can validate the IAM role on the downstream "requester" side (via KRM function?) since all key data is held by the upstream "provider".

Comment user: https://github.com/johnbelamaric Comment created at: 2023-02-08T18:24:41Z Comment last updated at: 2023-02-08T18:24:41Z Comment body:

Just as an FYI, we are looking towards doing something with respect to this in the Nephio R1 timeframe.

Quoting what I wrote in https://docs.google.com/document/d/14PYu1Y6h1IXRwuhSY3CKCiuprsduOvMp_Bb4fVGhFD0/edit#:

We can deliver this in Porch or in Nephio; it’s up to us. But I suspect that if we build it in Nephio we may eventually want to upstream it to Porch, since it is quite general purpose in its utility.

This is a set of CRDs to represent some basic dependencies, and a controller that can propose additional packages to fulfill those dependencies.

Some of those dependencies could be explicit, and some implicit. Explicit dependencies must be declared by the package author; implicit dependencies may be discovered by the system by examining the package contents.
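As one hypothetical illustration of discovering an implicit dependency by examining package contents, the sketch below flags namespaces that resources reference but the package itself does not define. The function name and the hard-coded `CLUSTER_SCOPED` set are assumptions made for this example; a real tool would more likely consult API discovery.

```python
from __future__ import annotations

# Kinds treated as cluster-scoped in this sketch (assumption, not exhaustive).
CLUSTER_SCOPED = {"Namespace", "CustomResourceDefinition", "ClusterRole"}


def implicit_namespace_dependencies(resources: list[dict]) -> set[str]:
    """Namespaces referenced by package resources but not defined in the package."""
    defined = {
        r["metadata"]["name"] for r in resources if r.get("kind") == "Namespace"
    }
    referenced = {
        r["metadata"]["namespace"]
        for r in resources
        if r.get("kind") not in CLUSTER_SCOPED
        and "namespace" in r.get("metadata", {})
    }
    return referenced - defined


resources = [
    {"kind": "Deployment", "metadata": {"name": "web", "namespace": "shop"}},
    {"kind": "ConfigMap", "metadata": {"name": "cfg", "namespace": "shop"}},
    {"kind": "Service", "metadata": {"name": "dns", "namespace": "infra"}},
    {"kind": "Namespace", "metadata": {"name": "infra"}},
]

# "infra" is defined in the package, so only "shop" is an implicit dependency.
implicit = implicit_namespace_dependencies(resources)
```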

Some implicit examples:

Some explicit examples:

Each dependency may be resolved in many different ways. For example, a namespace resource could already exist in the destination cluster, or we could add the resource directly in the package, or we could propose a separate package to be deployed that will provision the namespace. We’ll need to figure out how the controller decides (i.e., how we specify policies) which of these mechanisms to use to resolve a given dependency.
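One possible shape for such a policy, purely as a sketch: an ordered list of allowed resolution mechanisms, tried in turn. The `Resolution` enum, the `resolve` function, and its parameters (`cluster_inventory`, `may_edit_package`, `catalog`) are all hypothetical names introduced here for illustration.

```python
from __future__ import annotations

from enum import Enum


class Resolution(Enum):
    ALREADY_IN_CLUSTER = "already-in-cluster"  # resource exists in the destination
    ADD_TO_PACKAGE = "add-to-package"          # add the resource to this package
    PROPOSE_PACKAGE = "propose-package"        # propose a separate package
    UNRESOLVED = "unresolved"


def resolve(
    dependency: str,
    policy: list[Resolution],
    cluster_inventory: set[str],
    may_edit_package: bool,
    catalog: dict[str, str],
) -> tuple[Resolution, str | None]:
    """Try each mechanism in policy order; return the first one that applies.

    The second element is the proposed package name, if any.
    """
    for option in policy:
        if option is Resolution.ALREADY_IN_CLUSTER and dependency in cluster_inventory:
            return option, None
        if option is Resolution.ADD_TO_PACKAGE and may_edit_package:
            return option, None
        if option is Resolution.PROPOSE_PACKAGE and dependency in catalog:
            return option, catalog[dependency]
    return Resolution.UNRESOLVED, None


policy = [Resolution.ALREADY_IN_CLUSTER, Resolution.PROPOSE_PACKAGE]
catalog = {"Namespace/shop": "shop-namespace"}

# The namespace is not in the cluster yet, so a package is proposed.
proposed = resolve("Namespace/shop", policy, set(), False, catalog)

# Once the namespace exists, the same policy resolves without a proposal.
existing = resolve("Namespace/shop", policy, {"Namespace/shop"}, False, catalog)
```

Expressing the policy as an ordered list keeps the decision explicit and auditable, though it leaves open where the policy itself lives (per package, per repository, or per deployment context).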

The same conditions mechanism we used for IPAM can be used for dependency management, with each dependency representing a condition that must be resolved before we can approve the package for deployment.
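A minimal sketch of that gating idea, assuming Kubernetes-style condition entries with `type` and `status` fields held as plain dicts. The condition type names and the helper functions are made up for illustration and do not reflect the actual IPAM or Porch implementation.

```python
from __future__ import annotations


def blocking_conditions(conditions: list[dict]) -> list[str]:
    """Return the types of all conditions that are not yet "True"."""
    return [c["type"] for c in conditions if c.get("status") != "True"]


def can_approve(conditions: list[dict]) -> bool:
    """A package is approvable only when every dependency condition is met."""
    return not blocking_conditions(conditions)


conditions = [
    {"type": "dependency.namespace.shop", "status": "True"},
    {"type": "dependency.crd.widgets", "status": "False", "reason": "NoProvider"},
]

blocked_by = blocking_conditions(conditions)
approvable = can_approve(conditions)
```

In this example the package stays unapproved until something resolves the CRD dependency and flips its condition to `"True"`.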