nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.
Apache License 2.0
93 stars 52 forks source link

Porch: PackageVariant Redux #622

Open liamfallon opened 2 months ago

liamfallon commented 2 months ago

Original issue URL: https://github.com/kptdev/kpt/issues/3973 Original issue user: https://github.com/johnbelamaric Original issue created at: 2023-05-23T00:10:46Z Original issue last updated at: 2024-01-02T14:09:14Z Original issue body: We have been working with the PackageVariant resource for the last couple weeks in Nephio. Functionally, we're probably OK (apart from bugs which I will work on), but I think a few improvements are warranted. Some issues:

  1. Modifying package context seems to be a bad idea. I kind of suspected this but went ahead with it anyway. However, it seems kpt and Porch both reconstruct it from scratch when cloning the package. This means that you cannot set defaults in there and have PV override them; they are destroyed and there is a bit of a delay before PV reconciles and re-adds them. I think this is just trying to use it in a way that was never intended.
  2. Config injection using the injectors mechanism is awkward. It is designed with the idea that the common use case would be "match all GVKs in the cluster with this name, and associate them to injection points by GVK". But I don't think that's the right model; I think the right model is "I have these injection points, how should I satisfy them". In other words, "this package supports these configuration APIs, how should I set the values". Currently it also means that you can not inject multiple resources of the same GVK; only a single instance per GVK is allowed (well, others are allowed but they will all receive the same values).
  3. Package context and pipeline allow the package variant author to tweak the contents of the downstream with data directly in the PV resource, but config injection only supports tweaking it by inserting contents from the cluster. This means that for simple use cases, we need at least two resources (the PV and the injected resource). It would be nice to be able to directly inject from the PV itself (though doing this for things other than configmaps may not be great).
  4. ~When PackageVariantSet generates PackageVariants, it uses hash-based naming. This can cause issues because we use the PackageVariant name as part of the function name in the pipeline.~ (fixed in https://github.com/GoogleContainerTools/kpt/pull/3986)

I think 1, 2, and 3 can all be addressed with a simplified injection mechanism that works as follows.

I think this satisfies nearly all, if not all, the original use cases of injection and package context manipulation, but in a more easily understandable way.

I would add this as a new field in PackageVariant, so we would have all the mechanisms available for a while, but the plan would be to mark packageContext and injectors as deprecated and replaced by this (maybe injection?).

~For 4), I think we can solve this with a functionQualifier string in the PipelineTemplate. For PV, we would default this to the PV name, so this is totally backward compatible. For PVS, we would set this to the PVS name by default, but also allow the user to specify it like other fields in the PackageVariantTemplate.~ (this probably is not needed with the fix made in https://github.com/GoogleContainerTools/kpt/pull/3986)

cc @henderiw @natasha41575 @mortent @SimonTheLeg

Original issue comments: Comment user: https://github.com/henderiw Comment created at: 2023-05-23T04:02:59Z Comment last updated at: 2023-05-23T04:02:59Z Comment body: Maybe PackageVariantContext iso injection (just an idea, as I am thinking of it as providing context to how we want to consume the package for this use case). It would also be nice to have this capability for offline use (meaning just with kpt). We have scenarios where we have a common package we want to be able to use before porch kick(s). E.g. creating repositories for porch where with some PackageVariantContext we can leverage the same package in different ways. E.g. have a regular repo a staging repo, etc.

Comment user: https://github.com/johnbelamaric Comment created at: 2023-05-23T15:42:43Z Comment last updated at: 2023-05-23T15:42:43Z Comment body: I'm trying to avoid package-specific APIs. The idea of injection is to allow the package to advertise how it can be configured. Another term @justinsb used for a similar concept is "binding" - so, the package has binding points you can "attach" a resource to. In that case he was dealing more with "binding" two packages together, rather than "injecting" from the cluster. These are similar concepts we may want to unite in some way, but I do think the distinction between "binding packages via an interface" and "injecting config from the cluster via an interface" are subtly different.

Comment user: https://github.com/henderiw Comment created at: 2023-05-25T03:51:01Z Comment last updated at: 2023-05-25T03:51:01Z Comment body: Yes agree I was not implying package-specific APIs. The thought is to leverage some of this in a semi offline fashion. The way I see it this allows to provide specific context to the package. Whether it is through another package or via a CR/KRM resource are all options

Comment user: https://github.com/johnbelamaric Comment created at: 2023-05-31T15:36:48Z Comment last updated at: 2023-05-31T15:36:48Z Comment body: Extending the idea above, we can also allow it to inject resources from other packages, encompassing the "binding" concept. So, it would look something like:

  injection:
    # example of injecting from the cluster and attaching to a named, in-package binding point (aka "injection point")
  - bindingPoint: binding-point-1 # we can enforce unique binding point names across all GVKs
    clusterResource:
      name: in-cluster-resource-name # GVK is implied by bindingPoint
      # eventually when we work out permissions, we can allow namespace here
      # possibly one day, even values from a *different* cluster?
  - bindingPoint: binding-point-2
    directResource:
      apiVersion: foo.bar/v1alpha1    # we can get this from the binding point, but needs thought whether we should
      kind: FooBar
      spec:
        my: value
  - bindingPoint: binding-point-3
    packageResource:
      repo: foo
      package: bar
      revision: v4 # or maybe only allow it from the most recent published revision? needs discussion
      apiVersion: foo.bar/v1alpha1    # we can get this from the binding point, but needs thought whether we should
      kind: FooBar
      name: my-resource-name

Omitting the bindingPoint would mean to insert the resource with a generated name, and to own it completely (like we do for pipeline functions) - it would be added/removed/updated by the PV controller as the PV changes.

There are some details to work out. When we know the binding point, we know the GVK; maybe we should make it optional across the different types of injections. For the package one, we have namespaces for the repo/package and for the resource inside it, so we may want to create separate structures. In general a few more iterations on the API for usability would be useful, but I think this is better than what we have.

Comment user: https://github.com/johnbelamaric Comment created at: 2023-06-26T19:38:43Z Comment last updated at: 2023-06-26T19:38:43Z Comment body: Oh, I misunderstood your PackageVariantContext thing - you were suggesting the naming instead of injection? "Context" might be a good word, yes, let's think about names that may give that flavor.

Comment user: https://github.com/johnbelamaric Comment created at: 2023-06-26T19:44:46Z Comment last updated at: 2023-06-26T19:44:46Z Comment body: Also, in Nephio we found a need to inject multiple resources of the same type. For example, we use a sentinel CR to represent a demand for a certain set of resources; we then inject that resource from other packages. Something like what is described here (though the actual implementation is a little different): https://github.com/nephio-project/nephio/pull/263/files

I think we could extend the concept described here to manage this type of dependency as well. At least, it's worth discussing if that belongs here or as a separate post-fanout controller (aka "specializer" in Nephio terminology).

Comment user: https://github.com/johnbelamaric Comment created at: 2023-06-26T20:00:27Z Comment last updated at: 2023-06-26T20:01:07Z Comment body: > "Context" might be a good word, yes, let's think about names that may give that flavor.

Another point on this, "context" here makes sense in some cases, but I wonder if there are other cases that are a slightly different idea. Things like the cluster the package is going to, the region, dev/test/stage/prod all make sense as "context".

But there are other things, like "configure the capacity of this UPF". It's not really about adjusting the workload to the context it is running in, but more about what we want out of the workload itself. I am not sure this matters at all, but I am trying to tease apart the types of configuration APIs that a package may advertise, and whether there is a meaningful pattern as to how, when (in the overall provisioning process), and by whom they are used.

Comment user: https://github.com/johnbelamaric Comment created at: 2023-09-20T15:15:17Z Comment last updated at: 2023-09-20T15:15:42Z Comment body: Another way to think about this, after a discussion with @liamfallon is to use an analogy to programming. You can think of each binding point name as a variable name. The package author is declaring a sort of function signature - you can set this variable (binding point) to a value of this type (GVK). The PV is like calling the function - you are deciding which values to assign to the variables. In this updated design, PV allows you to take the values from the cluster, from another package, or set them to a literal value. Similarly, each variable (binding point) can be required or optional. Optional ones take the default values if not set.

The key points are:

  1. Package authors get to define the typed (schema'd), versioned variable "inputs" to the package, and have a name they can use internal to access the values passed by the user.
  2. The user can pick ("bind") the input values from various places.
  3. The system can monitor the input values, and if they change, it can initiate a new Draft with the new values.

Comment user: https://github.com/johnbelamaric Comment created at: 2023-10-14T00:14:59Z Comment last updated at: 2023-10-14T00:14:59Z Comment body: Another thought based on @tliron's tko demo: should we allow (at least) the "direct" binding to be a patch as opposed to a complete resource? This allows more flexibility around defaulting.

In fact today "injection" is effectively a "patch" of the entire spec...so maybe there's something more general there.

See also https://github.com/kptdev/kpt/issues/3121 (I need to read that one thoroughly too...I think there are some gotchas in here).

Comment user: https://github.com/johnbelamaric Comment created at: 2023-10-17T19:33:13Z Comment last updated at: 2023-10-17T19:33:13Z Comment body: > It would also be nice to have this capability for offline use (meaning just with kpt). We have scenarios where we have a common package we want to be able to use before porch kick(s).

that's an interesting idea - could PV, or some kernel of PV, live in a package and be resolved client-side?

It's a bit of a tangent, but in general, I'd like to move towards APIs that can be resolved by different tools. So, for example, instead of embedding upstream information in the Kptfile, we would create an "Upstream" resource that lives in the package. It could be processed client-side by kpt, or server-side by Porch. In time, this would become a collection of package-oriented APIs that even other config tools could support. This also raises the idea of whether our current PackageRevisionResources is the right model for mapping package contents into K8s APIs. I would prefer something that makes those more "real" in the server, but of course without any actuation mechanism. This is a bigger discussion I am digressing into...I'll try to write up some thoughts in a design proposal soon.

Comment user: https://github.com/henderiw Comment created at: 2024-01-02T14:03:00Z Comment last updated at: 2024-01-02T14:03:00Z Comment body: Have been looking a bit deeper into this and looking at the pipeline changes. What would be a good capability to have here is what the intention was with package-context but apply this to a new resource and use cel-functions to set the values of the parameters.

E.g.

Comment user: https://github.com/henderiw Comment created at: 2024-01-02T14:09:13Z Comment last updated at: 2024-01-02T14:09:13Z Comment body: Also on the clusterResourceBinding point would we not use ?