Package dependencies: expressing ("my package requires") and fulfilling ("my package provides")

johnbelamaric commented 2 years ago

Describe your problem

As we have discussed a few times, packages often have dependencies. They may depend on particular resources existing in the cluster, or they may depend on availability of services running on an endpoint, or may have other types of dependencies. This issue is a spin off of #3351, to provide a single place to capture examples of these dependencies and discuss ways of managing them within kpt. It is not the same as depends-on, which is about apply ordering within a package's resources.

Some examples:

- A package needs a namespace in which to be provisioned.
- A package needs a particular CRD to be installed.
- A package needs a Secret that must be provisioned out-of-band, OR satisfied via an ExternalSecret.
- A package needs a value from the runtime instantiation of another package (e.g., a load balancer IP address).
- A package needs a particular service running (e.g., a database), which may be provisioned in-cluster or out-of-cluster.
- A package needs a particular shared resource to be available; for example, a shared Ingress Gateway.

Each of these may be satisfied in different ways, and some of them may need to be satisfied by a user with different privilege levels than the package deployer (e.g., deploying a namespace vs into a namespace, or deploying a CRD vs a CR).

We would like to be able to express the requirement a package has, without the package having an opinion about how the requirement is met. For example, if an application package needs a namespace, that could be provisioned manually, or via deployment of a namespace package, or could be added by the deployer to the application package itself. The application package needs to be able to simply say "I need a namespace", and then the Kpt suite of tools needs to identify whether or not that dependency has been satisfied, at the earliest possible time in the package lifecycle (see #3422).

To that end, packages also need a way to say "I satisfy some requirement". This allows the tooling to understand and match the "package requires" and "package provides". However, we need to figure out how that is expressed in each layer of the solution. For example, at runtime this model may be expressed using a "Claim" pattern; it's not clear that works at Config Time.
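
As a rough illustration of the matching described above, here is a minimal sketch in Python. The `Capability` and `Package` types and the `unsatisfied` function are hypothetical names invented for this example; nothing here is existing kpt or Porch API. It only shows the idea that a requirement is matched against whatever provides it, without the requiring package caring how.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """A named thing a package can require or provide (hypothetical model)."""
    kind: str  # e.g. "Namespace", "CRD", "Secret"
    name: str  # e.g. "backend"


@dataclass
class Package:
    name: str
    requires: set = field(default_factory=set)
    provides: set = field(default_factory=set)


def unsatisfied(package, providers, preexisting=frozenset()):
    """Return the requirements of `package` that nothing provides.

    `preexisting` models requirements met out-of-band (for example a
    namespace created manually), so the requiring package stays neutral
    about how each requirement is met.
    """
    available = set(preexisting) | set(package.provides)
    for provider in providers:
        available |= provider.provides
    return package.requires - available


app = Package(
    "app",
    requires={
        Capability("Namespace", "backend"),
        Capability("CRD", "widgets.example.com"),
    },
)
ns_pkg = Package("backend-namespace", provides={Capability("Namespace", "backend")})

missing = unsatisfied(app, [ns_pkg])
print(sorted(c.kind for c in missing))  # prints ['CRD']
```

The same check could run at any point in the package lifecycle, since it needs only the declared `requires` and `provides` sets.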

Another thought is how this relates to the idea of "mixins". One might consider something like Environment (#3280) as being expressed as a dependency - the package depends on some Environment package/mixin being deployed. The user would choose which (dev/stage/prod) one to use to resolve that dependency.

johnbelamaric commented 2 years ago

@droot @yuwenma @bgrant0607 @justinsb

droot commented 2 years ago

Thanks for starting this thread @johnbelamaric

> A package needs a value from the runtime instantiation of another package (e.g., a load balancer IP address).

Just a note: This one is different. This is more like a runtime dependency mixed with a value propagating from the runtime state (sounds like apply-time dependency).

johnbelamaric commented 2 years ago

> Just a note: This one is different. This is more like a runtime dependency mixed with a value propagating from the runtime state (sounds like apply-time dependency).

Yes, I'd like to capture various types and discuss strategies for each.

bgrant0607 commented 2 years ago

Do we have an example of value propagation that's not an IP address? IP addresses can be reduced to the service case.

droot commented 2 years ago

Slightly different but a common scenario:

I have an application package that needs a CloudSQL resource, and that resource is satisfied by another infra package that uses KCC (Config Connector) resources. The unique thing about this scenario is that the CloudSQL (infra) package will be deployed in the KCC cluster ("admin cluster" is the common term) while the application package will be deployed in the application workload cluster. So this is a cross-cluster package dependency.

johnbelamaric commented 2 years ago

> Do we have an example of value propagation that's not an IP address? IP addresses can be reduced to the service case.

Service is indirection, and is a good solution to this value propagation when there is something to resolve the indirection (DNS in the case of Service).

There are examples of other things that have to be allocated out of a common pool. Some network-y things I can think of: ports (e.g., NodePort), VLAN IDs, maybe VRF names or BGP community strings or AS numbers. Some of these may be allocated on a per-cluster basis (NodePort) at runtime, but some may need to exist across clusters or in some other scope (organizational, regional, etc.). Those likely become inputs to multiple packages, and there is no simple protocol for serving them up via indirection.

Other ideas: I can imagine there are applications that (though not a great design) share a common database, for example. An IP address or even IP:Port is not sufficient; they also need a schema name. Those could be automatically generated or allocated.

One question: could we simplify this by requiring those allocations to be represented somehow by a package? So that all dependencies are just package dependencies? Probably not a good idea, it may cause too much package sprawl. Resource-level might be better. Or maybe representing the dependency itself as a resource.

johnbelamaric commented 2 years ago

> Those likely become inputs to multiple packages, and there is no simple protocol for serving them up via indirection.

While this is true, there actually are ways to deal with this. You can use DNS TXT records (or the Kube API server, for that matter) to make this an indirection. However, applications do not natively understand that - most applications can't even use DNS SRV records (IP and port), much less mapping a TXT into a database connection string of some sort. So, you then need an init-container or runtime controller (in the workload cluster) that can look up what you need via that mechanism and rewrite the config. This can work but is kludgy and requires workload changes to get it to work. I think we can do better.
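
To make the kludge concrete, here is a sketch in Python of what such an init-container would have to do once it has fetched the TXT payload. The `key=value;key=value` payload format, the field names, and the `db://` connection-string layout are all assumptions made up for this example (not any real driver's URI syntax), and the DNS lookup itself is left out.

```python
def parse_txt_record(txt):
    """Parse a 'key=value;key=value' TXT payload into a dict.

    The payload format is an assumption for this sketch; DNS TXT records
    carry free-form text and do not define a structure like this.
    """
    fields = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed TXT entry: {part!r}")
        fields[key.strip()] = value.strip()
    return fields


def render_connection_string(fields):
    """Build the string the application actually reads from its config.

    The layout is illustrative only. A KeyError is raised if the TXT
    payload lacks one of the fields the application needs.
    """
    return "db://{host}:{port}/{database}?schema={schema}".format(**fields)


# Hypothetical payload that the init-container would have looked up.
txt = "host=db.internal.example; port=5432; database=orders; schema=billing"
conn = render_connection_string(parse_txt_record(txt))
print(conn)  # prints db://db.internal.example:5432/orders?schema=billing
```

Every workload that consumes the value needs this extra lookup-and-rewrite step, which is the workload change referred to above.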

johnbelamaric commented 2 years ago

@justinsb showed a concept of "binding" objects that may satisfy some of the dependency use cases. This is a resource within the package that "advertises" itself as a sort of placeholder. In the particular use case Justin discussed, this was a namespace annotated as a "local-config: binding". Since it was a local-config it wouldn't end up in the cluster, but it does serve to let consumers of the package know that they need to somehow provision a Namespace.

In the Nephio controller PoC demo, I used a ClusterScaleProfile CR that lived in the package to advertise that the package was able to accept this type of object from the context and have a function scale it based on the contents of that CR. This is similar to the binding concept, except that it is implicit in the type. It may be better to make it explicit with an annotation; although, in this case, the resource present in the package serves as default values in case the deployment context doesn't have that resource. So we may want a different annotation value, since it's not mandatory that a binding happens.

In that demo, I have a fan-out controller that injects the ClusterScaleProfile based on an association with the target cluster. It doesn't really need to be "cluster" - a more accurate way to describe this is "deployment context". In a discussion on the WI operator, I asked about making that a function rather than a controller. A similar question can be asked of the way I inject ClusterScaleProfile in the Nephio demo - I do it with a controller, but perhaps it could be done automatically by Porch as part of function input gathering (#3396)?

For simplicity, let's assume a 1:1 ratio of deployment repositories:clusters. We could adjust this to handle 1:N or even N:1 or N:M, but I don't think the basic idea changes materially in that case. Here's how it could work in this case:

- A deployment context (or many different CRs representing different types of context) is attached (by name, labels, or reference; TBD) to a deployment repository. For example, the GCP context object containing the project ID for the cluster reading from that deployment repository, or a ClusterScaleProfile as in the Nephio PoC.
- The package contains binding objects of various types, some of which must be resolved ("project id"), and others which serve as defaults (a la ClusterScaleProfile). The annotation value probably needs to identify which are which.
- Porch understands these are bindings, and so when cloning (or really, when saving) to a deployment repo, before running the function pipeline, it looks for associated objects of those types and injects them, overwriting (or patching?) the ones in the package. In the Nephio PoC, I copied just the Spec object, leaving the metadata in place.
- Those context objects serve as various function inputs, so now when the function pipeline runs, it will take appropriate actions: annotating the KSA in the case of the WI operator, and scaling deployments and configmaps in the case of the ClusterScaleProfile.
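
The injection step could be sketched roughly as follows, in Python for brevity. The annotation key `example.dev/binding`, its values `required` and `default`, the `ProjectContext` kind, and the `siteDensity` field are all hypothetical placeholders; only the behavior (match by type, copy the spec, keep the package's metadata) comes from the description above.

```python
import copy

# Hypothetical annotation key; the values distinguish mandatory bindings
# ("required") from ones that only carry defaults ("default").
BINDING_ANNOTATION = "example.dev/binding"


def inject_context(package_resources, context_objects):
    """Return (resources, unresolved) without mutating the inputs.

    For each binding object in the package, look for a context object of
    the same apiVersion/kind and copy its spec over, leaving the
    package's own metadata in place.
    """
    by_type = {(o["apiVersion"], o["kind"]): o for o in context_objects}
    out, unresolved = [], []
    for res in package_resources:
        res = copy.deepcopy(res)
        annotations = res.get("metadata", {}).get("annotations", {})
        mode = annotations.get(BINDING_ANNOTATION)
        if mode is not None:
            ctx = by_type.get((res["apiVersion"], res["kind"]))
            if ctx is not None:
                res["spec"] = copy.deepcopy(ctx.get("spec", {}))
            elif mode == "required":
                unresolved.append(f'{res["kind"]}/{res["metadata"]["name"]}')
            # mode == "default" and no context object: keep package values.
        out.append(res)
    return out, unresolved


package = [
    {"apiVersion": "example.dev/v1alpha1", "kind": "ClusterScaleProfile",
     "metadata": {"name": "scale", "annotations": {BINDING_ANNOTATION: "default"}},
     "spec": {"siteDensity": "low"}},
    {"apiVersion": "example.dev/v1alpha1", "kind": "ProjectContext",
     "metadata": {"name": "project", "annotations": {BINDING_ANNOTATION: "required"}},
     "spec": {}},
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "web"}, "spec": {"replicas": 1}},
]
context = [
    {"apiVersion": "example.dev/v1alpha1", "kind": "ClusterScaleProfile",
     "metadata": {"name": "edge-cluster-01"}, "spec": {"siteDensity": "high"}},
]

resolved, unresolved = inject_context(package, context)
print(resolved[0]["spec"], unresolved)
# prints {'siteDensity': 'high'} ['ProjectContext/project']
```

Matching purely on type is the simplest choice; it stops working as soon as two context objects of the same type are attached to one repository.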

Notes & Issues:

- Namespace, the original example for "bindings", is not a repo- or cluster-level context, so this doesn't really apply to that type of dependency.
- We probably want to tie this to Conditions still; I think these are complementary concepts. A function (or kpt/Porch) can automatically calculate that required contextual inputs are a Condition, and decide if the Condition is met based on whether the deployment repo has an associated contextual input of the correct type.
- We may find use cases where multiple contextual objects of the same type exist. I can't think of any offhand, but package composition is likely to create this sort of case. If these come up, then type is insufficient for targeting the bindings, and we may need labels or names or something.
- One issue is what happens when the context objects change. For example, if we increase the capacity of the target cluster. In that case, Porch needs to be watching those objects and proposing changes. I guess that declarative case could be handled separately, by a controller, and not be the default behavior on clone.
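
A sketch of how the Condition calculation from these notes might look, in Python. The condition shape (`type`, `status`, `reason`) loosely follows Kubernetes-style conditions, but the type prefix `context.` and the reason strings are assumptions for this example. It also flags the ambiguous case where several context objects of the same type are attached.

```python
from collections import Counter


def context_conditions(required_kinds, repo_context_objects):
    """Derive one condition per required contextual input type.

    A condition is met only when exactly one context object of that kind
    is associated with the deployment repo. Zero means it is missing;
    more than one means the type alone cannot pick a binding target.
    """
    counts = Counter(obj["kind"] for obj in repo_context_objects)
    conditions = []
    for kind in sorted(required_kinds):
        n = counts[kind]
        if n == 1:
            status, reason = "True", "ContextFound"
        elif n == 0:
            status, reason = "False", "ContextMissing"
        else:
            status, reason = "False", "AmbiguousContext"
        conditions.append(
            {"type": f"context.{kind}", "status": status, "reason": reason}
        )
    return conditions


repo_context = [
    {"kind": "ClusterScaleProfile", "metadata": {"name": "edge-a"}},
    {"kind": "ClusterScaleProfile", "metadata": {"name": "edge-b"}},
    {"kind": "ProjectContext", "metadata": {"name": "project"}},
]
conds = context_conditions(
    {"ProjectContext", "ClusterScaleProfile", "NetworkContext"}, repo_context
)
for c in conds:
    print(c["type"], c["status"], c["reason"])
```

Resolving the `AmbiguousContext` case is where labels or names would come in.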

johnbelamaric commented 2 years ago

Since we are cataloging dependency-related thoughts here I figured I would include this from @BernardTsai-DT in a Nephio discussion. I thought this was an interesting set of categorizations to think about.

Dependencies for that purpose need to be categorized, e.g.:

A is hosted on B: a solution component can only be installed within the context of other solution components (for example a network function in a k8s cluster)

A is managed by B: a solution component is managed by another solution component (for example the data and control plane of network function)

A is served by B: a solution component makes use of services provided by another solution component (for example a web API server and a database backend)

A is clustered with B, C, ...: here several solution components provide the same service but for availability reasons are clustered (for example a database cluster)

These are the bread-and-butter dependencies which would probably address 80% of the typical dependencies to be considered. For more complex dependencies I would then recommend making use of special controllers which would reflect more complex constraints, e.g. reconfiguration of a 5G network slice which would implicitly require the introduction of a different solution architecture (for example, a customer upgrades their network slice from a standard shared setup to a setup with a dedicated edge UPF).

yuwenma commented 2 years ago

Some examples from Porch dogfooding (it uses the Porch UI to create a CNRM GCP project, cluster, etc. and install Porch to the new cluster in the other project):

deployment-tolerant requester
- "environment" package(s) is the "provider"
- "gkeCluster" package is the "requester"
- The "gkeCluster" does not require "environment" package to be deployed first. This is unlike the "namespace" provisioning where "namespace" has to be deployed first.
server-side "live" provider
- "gkeCluster" package is the "provider"
- "Porch" package is the "requester"
- The Porch package requests the cluster to be deployed and to have some field data updated by the server, specifically the CA Certificate (see dogfooding user guide step 5.8). This basically requests a "live" resource. We may want to discuss whether Porch should support it or not.
"passive" requester
- A higher-privilege team provides the "IAM" role package(s).
- A lower-privilege team needs the "IAM" role to be preset.
- "IAM" role package(s) are set by people with higher privilege. So they normally go with the "provider" package and are out of the "requester" package's control, or the "requester" may not even know what IAM role they need. This is a passive "require-provide" scenario. One big challenge is how we can validate the IAM role on the downstream "requester" side (via a KRM function?) since all the key data is held by the upstream "provider".

johnbelamaric commented 1 year ago

Just as an FYI, we are looking towards doing something with respect to this in the Nephio R1 timeframe.

Quoting what I wrote in https://docs.google.com/document/d/14PYu1Y6h1IXRwuhSY3CKCiuprsduOvMp_Bb4fVGhFD0/edit#:

We can deliver this in Porch or in Nephio, it’s up to us. But I suspect that if we build it in Nephio we may want to eventually upstream it to Porch, since it is quite general purpose in its utility.

This is a set of CRDs to represent some basic dependencies, and a controller that can propose additional packages to fulfill those dependencies.

Some of those dependencies could be explicit, and some implicit. Explicit dependencies must be declared by the package author; implicit dependencies may be discovered by the system by examining the package contents.

Some implicit examples:

- If a package contains namespaced resources, the namespace must exist.
- If a package contains a CR, the CRD must be loaded in the cluster.
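
As a sketch of how implicit dependencies like these two could be discovered from package contents, here is a small Python example. It is a heuristic under stated assumptions: the built-in API group list is deliberately partial, a resource counts as a CR when its group is not in that list, and the CRD label uses `Kind.group` rather than the real `plural.group` CRD name. A real implementation would more likely consult API discovery or OpenAPI schemas.

```python
# Partial list of built-in API groups, for this sketch only.
BUILTIN_GROUPS = {
    "", "apps", "batch", "rbac.authorization.k8s.io",
    "networking.k8s.io", "apiextensions.k8s.io",
}


def api_group(api_version):
    """'apps/v1' -> 'apps'; the core 'v1' -> ''."""
    return api_version.split("/")[0] if "/" in api_version else ""


def implicit_dependencies(resources):
    """Return a set of (kind, name) dependencies the package does not
    satisfy itself. Resources that omit metadata.namespace are skipped."""
    namespaces = {
        r["metadata"]["name"] for r in resources if r["kind"] == "Namespace"
    }
    crds = {
        (r["spec"]["group"], r["spec"]["names"]["kind"])
        for r in resources
        if r["kind"] == "CustomResourceDefinition"
    }
    deps = set()
    for r in resources:
        ns = r["metadata"].get("namespace")
        if ns and ns not in namespaces:
            deps.add(("Namespace", ns))
        group = api_group(r["apiVersion"])
        if group not in BUILTIN_GROUPS and (group, r["kind"]) not in crds:
            deps.add(("CRD", f'{r["kind"]}.{group}'))
    return deps


package = [
    {"apiVersion": "apps/v1", "kind": "Deployment",
     "metadata": {"name": "web", "namespace": "backend"}},
    {"apiVersion": "example.com/v1", "kind": "Widget",
     "metadata": {"name": "w1", "namespace": "backend"}},
]
print(sorted(implicit_dependencies(package)))
# prints [('CRD', 'Widget.example.com'), ('Namespace', 'backend')]
```

If the package also carried the `backend` Namespace and the Widget CRD, the function would return an empty set, which corresponds to resolving the dependency by adding the resource directly to the package.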

Some explicit examples:

- A package may need some other service (e.g., a database), but does not embed that service within itself.
- A package may need a Secret that is provisioned out-of-band.

Each dependency may be resolved in many different ways. For example, a namespace resource could already exist in the destination cluster, or we could add the resource directly in the package, or we could propose a separate package to be deployed that will provision the namespace. We’ll need to figure out how the controller decides (i.e., how we specify policies) which of these mechanisms to use to resolve a given dependency.

The same conditions mechanism we used for IPAM can be used for dependency management, with each dependency representing a condition that must be resolved before we can approve the package for deployment.
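
A minimal sketch of that gating idea in Python, assuming each dependency is a `(kind, name)` pair and each condition is a small dict. The condition type format `dependency.<kind>.<name>` and both function names are hypothetical; this does not reproduce the IPAM implementation.

```python
def dependency_conditions(dependencies, resolved):
    """Turn each dependency into a condition; it is met once the
    dependency appears in `resolved`, however it was resolved."""
    return [
        {
            "type": f"dependency.{kind}.{name}",
            "status": "True" if (kind, name) in resolved else "False",
        }
        for kind, name in sorted(dependencies)
    ]


def can_approve(conditions):
    """A package is approvable only when every condition is met."""
    return all(c["status"] == "True" for c in conditions)


deps = {("Namespace", "backend"), ("Secret", "db-credentials")}

before = dependency_conditions(deps, resolved={("Namespace", "backend")})
after = dependency_conditions(deps, resolved=deps)

print(can_approve(before), can_approve(after))  # prints False True
```

The controller that proposes packages would be one of several possible writers of the `resolved` set; the gate itself does not need to know which mechanism was used.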

kptdev / kpt

Package dependencies: expressing ("my package requires") and fulfilling ("my package provides") #3448