nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.
Apache License 2.0

Porch PackageRevisionResources: use KRM content as-is instead of encoding it into a YAML string #535

Open tliron opened 6 months ago

tliron commented 6 months ago

Our current system encodes the KRM package content as a YAML string. This is 1) cumbersome, as it requires all clients to parse and possibly re-encode the YAML, and 2) wasteful, as it requires extra encoding/decoding on both clients and the server.

The proposed solution is simple but non-trivial. The standard Kubernetes code generator does not support schema-less data; however, it can be bypassed with custom code for deep-copy and OpenAPI generation. This solution has been tested successfully in the TKO experiment.

Note that if we make this change, we will have to change all clients to use this new format. (It will make them simpler, as they won't have to decode/encode YAML.)

It's possible to support both representations by bumping the API version to v1alpha2 for the new solution. However, at this early stage in Porch's life, backwards compatibility does not seem to be necessary. (Update) Another way to support both is to allow both a resources and a resourcesYaml property and to select the one we want via field selection.

The proposed solution is to change the resources property to use this:

import "github.com/tliron/go-ard"

// KRM package.
type Package struct {
    // KRM package contents.
    Resources []map[string]any `json:"resources,omitempty"`
}

// DeepCopyInto is hand-written deep-copy code, since the standard
// code generator cannot handle map[string]any.
func (self *Package) DeepCopyInto(out *Package) {
    resources := make([]map[string]any, len(self.Resources))
    for index, resource := range self.Resources {
        resources[index] = ard.Copy(resource).(map[string]any)
    }
    out.Resources = resources
}

// DeepCopy is likewise hand-written.
func (self *Package) DeepCopy() *Package {
    out := new(Package)
    self.DeepCopyInto(out)
    return out
}

// OpenAPIDefinition supplies the schema by hand, again bypassing the
// code generator.
func (_ Package) OpenAPIDefinition() openapi.OpenAPIDefinition {
    return openapi.OpenAPIDefinition{
        Schema: spec.Schema{
            SchemaProps: spec.SchemaProps{
                Description: "KRM package.",
                Type:        []string{"object"},
                Properties: map[string]spec.Schema{
                    "resources": {
                        SchemaProps: spec.SchemaProps{
                            Description: "KRM package contents.",
                            Type:        []string{"array"},
                            Items: &spec.SchemaOrArray{
                                Schema: &spec.Schema{
                                    SchemaProps: spec.SchemaProps{
                                        Description: "Resource in the KRM package.",
                                        Type:        []string{"object"},
                                    },
                                },
                            },
                        },
                    },
                },
            },
        },
    }
}
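For illustration, a minimal sketch (the printResourceKinds helper is hypothetical) of how a Go client could consume this structured content directly, with no YAML step:

import "fmt"

// Hypothetical helper: iterate over the proposed structured content.
// No YAML parsing is needed; the resources arrive as plain maps,
// already decoded from the API server's JSON response.
func printResourceKinds(pkg *Package) {
    for _, resource := range pkg.Resources {
        apiVersion, _ := resource["apiVersion"].(string)
        kind, _ := resource["kind"].(string)
        fmt.Printf("%s %s\n", apiVersion, kind)
    }
}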
henderiw commented 6 months ago

As long as the YAML is in the package, we have good libraries for parsing this data into a Go struct. See the KRM library functions we did in R1. With those you can easily parse and get the proper Go struct, or whatever structure you are interested in.

tliron commented 6 months ago

Sure, the library works, but clients that aren't using, or can't use, our libraries (e.g. from Python or another language) still have to parse the YAML themselves. It also makes it hard to use various tools that work on generic KRM -- all of them will have to parse the YAML in order to make it non-opaque.

This solution removes that step.
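For contrast, here is a rough sketch of what every Go client has to do today, assuming the current representation where each package file is an embedded YAML string (the decodeResources helper is illustrative, and multi-document files are ignored for brevity):

import (
    "fmt"

    "sigs.k8s.io/yaml"
)

// Illustrative: the API response was already decoded from JSON, but
// each file's content is still an opaque YAML string that must be
// parsed a second time.
func decodeResources(files map[string]string) ([]map[string]any, error) {
    var resources []map[string]any
    for path, content := range files {
        var resource map[string]any
        if err := yaml.Unmarshal([]byte(content), &resource); err != nil {
            return nil, fmt.Errorf("parsing %s: %w", path, err)
        }
        resources = append(resources, resource)
    }
    return resources, nil
}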

henderiw commented 6 months ago

We want the packages to be editable by humans, and YAML is the format Kubernetes selected for this. So as long as we keep this requirement, I don't see why we should change this. For Python we can easily add a plugin if this is important.

tliron commented 6 months ago

I think you are missing the point here. My proposal is more editable by humans, because it keeps the resources as actual KRM instead of embedding text into it and requiring clients to convert from/to YAML. I don't understand how you can't see this as an improvement.

Instead of this representation:

# this is a string that you need to parse!
resources: |
  - apiVersion: v1
    kind: Pod
  ...

We can have this representation:

# not a string!
resources:
  - apiVersion: v1
    kind: Pod
  ...
henderiw commented 6 months ago

How will you retain comments from the YAML?

tliron commented 6 months ago

This solution does not retain comments.

If you think retaining comments in PackageRevisionResources is somehow valuable, it is possible to support both methods, e.g. having both a resources property and a resourcesYaml property, and then controlling which one you get via field selection.

I don't think retaining comments here is valuable. This is part of the automation system, and comments could potentially take up a large amount of our limited space (e.g. long headers on each .yaml file we collate together into the package). If you want to work with comments, there is always the manual git approach.
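A hypothetical sketch of that dual representation (neither the field-selection wiring nor these exact names exist today):

// Hypothetical: expose both representations and let the client pick
// one via field selection; only the selected one would be populated.
type Package struct {
    // Structured KRM content.
    Resources []map[string]any `json:"resources,omitempty"`

    // The same content as raw YAML, with comments preserved.
    ResourcesYaml string `json:"resourcesYaml,omitempty"`
}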

henderiw commented 6 months ago

Having comments has always been a key requirement. Why would we change this? I don't see why. We should think of a package as code, and here too we comment what we are trying to do. So not having this ability is going backwards, which is not a good idea from a consumption point of view. Let's not do this.

tliron commented 6 months ago

Is it a key requirement here, in PackageRevisionResources? This KRM is meant for the specialization pipeline, which is automation, not manual intervention. The comments are all there in git, over which this is merely a facade for automation.

The whole point of choosing git, and swallowing its tremendous costs for scalability and performance, was to allow for manual intervention there, no? I think it's far easier to work there, where you not only have comments, but also the original file names and directory structure, and even non-YAML files, like .md etc. It seems very unlikely to me that a human would be able to discern comments in YAML all flattened and shoved into PackageRevisionResources. I'm not convinced that this feature is one worth keeping at the cost of requiring YAML encoding/decoding for every interaction with PackageRevisionResources.

I recommend implementing this solution as a friendlier experience for clients of Porch and as a way to reduce compute overhead across the entire ecosystem.

By the way, YAML is not the natural representation in Kubernetes. The wire format is JSON (or protobuf), and the storage format in etcd is likewise JSON or a binary encoding of it, never YAML. That is why all KRM types in Go have to use json tags for properties, recognized by Go's built-in JSON encoder. (Another by the way: Go doesn't have a standard YAML encoder.)
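To illustrate that last point, a minimal KRM-style type (the names are made up, not Porch's actual definitions) declares json tags, never yaml tags:

// Illustrative only: Kubernetes API types carry `json` tags because
// the wire format is JSON; YAML manifests are converted to JSON on
// the client side before they ever reach the API server.
type ExampleSpec struct {
    Replicas int               `json:"replicas,omitempty"`
    Labels   map[string]string `json:"labels,omitempty"`
}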

Where YAML comes in is as a convention for user manifests, usually interpreted by the kubectl tool. kubectl also supports JSON, by the way, which was the original and more "native" (and less user-friendly) convention; eventually YAML became the preferred manifest format. Kubernetes itself does not support round-tripping comments in YAML, precisely because YAML is never sent to the cluster, only parsed locally by kubectl. So when you do kubectl get -o yaml, it emits fresh YAML, without comments.

henderiw commented 6 months ago

As you said, this is to serve a pipeline, and a pipeline without comments is like code without comments. That would be bad.

kispaljr commented 6 months ago

Just a quick clarification question: would this mean that https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured should be used to interpret the map[string]any typed KRMs?

kispaljr commented 6 months ago

The whole point of choosing git, and swallowing its tremendous costs for scalability and performance, was to allow for manual intervention there, no?

Yes, absolutely. And I also agree that manual edits to packages will happen via the git interface, while automation processes will edit packages via the porch API. I think what @henderiw is worried about is that, in order to support both methods simultaneously (to allow collaboration between humans and machines), we don't want changes made via porch to mess up the formatting of YAML files in git, because that would make them unreadable to humans. For example, let's assume that a process changed a field in a KRM object via the porch PackageRevisionResource API. Then the porch API server will have to apply this change to the YAML file in git that contains the KRM object. If that makes the containing YAML file "minified"/"flattened" in git, that effectively renders it useless for humans. All in all, I also think that keeping the YAML formatting in the git interface is an important requirement; at the same time, I also agree that exposing that formatting via the porch API is not important at all.

So in my view the main question is: can we implement the PackageRevisionResource controller (in the porch API server) to allow edits via this new interface for programmers, while keeping the formatting (whitespace, comments, order of fields, ...) of YAML files in git? If the answer is yes, then that answers my formatting related concerns.

tliron commented 6 months ago

Just a quick clarification question: would this mean that https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured should be used to interpret the map[string]any typed KRMs?

No, the "resources" property would be a standard Go map[string]any. Essentially identical to what you would get from parsing YAML. It simply skips the necessity to parse YAML.

kispaljr commented 6 months ago

No, the "resources" property would be a standard Go map[string]any. Essentially identical to what you would get from parsing YAML. It simply skips the necessity to parse YAML.

Just to show a third alternative :) In my specializers I do not actually parse the YAML files into a list of map[string]any, but into a list of KubeObjects, because I find them easier to work with. I understand that this is a matter of taste. :) Here is the relevant code: https://gist.github.com/kispaljr/da256fc1b14a5d325c77b84305568fb0 ; only the signatures of the first two functions are important for this discussion.

tliron commented 6 months ago

If that makes the containing YAML file "minified"/"flattened" in git, that effectively renders it useless for humans.

So, when I say "flattened" that was a simplification. In Porch we do keep the full path of the original file of the resource. My initial example was simplistic for demonstration -- it would not be []map[string]any but rather map[string][]map[string]any or similar, where the first map key is the YAML file path in the package, and there's a list of resources in that file. Same as right now, so the "structure" is maintained. I would argue, though, that this "structure" is not human-readable at all.
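A sketch of that shape, purely for illustration (the PackageContent name is made up):

// Hypothetical: the map key is the YAML file path within the package,
// the value is the list of resources parsed from that file.
type PackageContent map[string][]map[string]any

var example = PackageContent{
    "network/pod.yaml": {
        {"apiVersion": "v1", "kind": "Pod"},
    },
}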

See this example. If a human can grok that, I would be surprised. I simply don't see humans editing a PackageRevisionResource directly.

Actually, when I said "use git" that's not the only way. You can pull the package via the CLI to a local directory, make modifications there, and push it back. But since we don't support locking the package while changing it, there's no guarantee that something else wouldn't change it between the pull and the push. We don't do atomic updates via this method. So, generally speaking human intervention in editing is more for debugging than for normal workflows, which rely on automation.

So in my view the main question is: can we implement the PackageRevisionResource controller (in the porch API server) to allow edits via this new interface for programmers, while keeping the formatting (whitespace, comments, order of fields, ...) of YAML files in git? If the answer is yes, then that answers my formatting related concerns.

As for order of fields, nothing is maintained anyway under usual processing. YAML does not have ordered maps, and YAML parsers/encoders do not normally maintain order (some Python parsers do, using Python's standard OrderedDict, but the Go parser we use doesn't when used normally; more on that below). For example, a kpt function that modifies a resource will likely emit the new resource with a different field order.

Actually, any kind of modification to a resource would normally result in losing all YAML comments as well as extra whitespace (e.g. empty lines). You are normally parsing it into language-local structures (maps, slices, structs, etc.), modifying those, and emitting again from those structures. Emitting YAML does not normally include comments or extra whitespace. So when would comments be potentially preserved? For files that we don't change. The result is that sometimes there are comments in some places in a specialized package, sometimes not, depending on where we've specialized. The only place where comments are reliably preserved is the original blueprint package. Basically, comment preservation in Porch ends up being accidental and inconsistent. It's not a feature you can rely on.

It is actually possible to read/write comments if you parse into the YAML parser's internal node-based structures, and you can even maintain property order, but this is very cumbersome to work with and we don't normally do it. And if we did decide to take that route, we would also have to decide on edge cases, such as what to do with comments that appear above or next to properties that you've deleted. Do you keep them? If so, they would appear wrong to a human reader, because what they were commenting on is gone. Comments in YAML are not like comments in Go, where they are associated with the entity below them or on the same line. In YAML they are simply metadata embedded in the stream. Proper preservation of comments would require quite a complex semantic system, and I really have to wonder whether doing so would be valuable for an automation system like ours.
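For reference, a minimal sketch of that node-based approach using gopkg.in/yaml.v3, whose yaml.Node type carries HeadComment/LineComment/FootComment fields and can round-trip comments (with all the caveats above about deleted properties still applying):

import "gopkg.in/yaml.v3"

// Sketch: round-tripping YAML through yaml.v3 nodes keeps comments,
// at the cost of working with the parser's internal tree instead of
// ordinary maps and structs.
func roundTrip(source []byte) ([]byte, error) {
    var doc yaml.Node
    if err := yaml.Unmarshal(source, &doc); err != nil {
        return nil, err
    }
    // Modifications would happen here by walking doc.Content;
    // comments live in each node's HeadComment/LineComment fields.
    return yaml.Marshal(&doc)
}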

tliron commented 6 months ago

Just to show a third alternative :) In my specializers I do not actually parse the YAML files into a list of map[string]any, but into a list of KubeObjects, because I find them easier to work with.

Sure! You can unmarshal directly into Go structs if you know the resource schema. You're still parsing YAML. :)

By the way, I wrote a library that allows you to convert "unstructured" Go data (maps, slices, etc.) into structs, see here. The library generally provides useful tools for dealing with unstructured data, which in general is kinda painful in Go.

kispaljr commented 6 months ago

wow, a lot to answer :)

So, when I say "flattened" that was a simplification. In Porch we do keep the full path of the original file of the resource. My initial example was simplistic for demonstration -- it would not be []map[string]any but rather map[string][]map[string]any or similar, where the first map key is the YAML file path in the package, and there's a list of resources in that file.

I actually liked your initial interface (flat list of KRM resources) better. We can keep the file structure by writing the filename into the internal.config.kubernetes.io/path annotation and the index of the resource inside the file into the internal.config.kubernetes.io/index annotation (the same way kyaml's ByteReader and ByteWriter do). But I think this is a sidetrack, not the main issue. Also, by "flattening" I meant the reformatting of the YAML file, not losing the file structure in the PackageRevisionResource, so this may be a misunderstanding.
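For example, a flattened resource could carry its provenance roughly like this (a sketch using the annotation keys mentioned above; the resource itself is made up):

# Sketch: a flattened resource keeping its file origin as annotations.
apiVersion: v1
kind: Pod
metadata:
  name: example
  annotations:
    internal.config.kubernetes.io/path: network/pod.yaml
    internal.config.kubernetes.io/index: "0"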

See this example. If a human can grok that, I would be surprised. I simply don't see humans editing a PackageRevisionResource directly.

Yes, I agree with you. The PackageRevisionResource API is not for humans. I also agree that exposing the formatting via PackageRevisionResources is not important. But, as stated above, I still think that keeping as much formatting as possible in git is actually important. I will give you a new use case below to support this.

So, generally speaking human intervention in editing is more for debugging than for normal workflows, which rely on automation.

I can imagine that human intervention can be a mandatory part of a workflow. For example, I have a PoC where I keep the NFTopology, the Networks, the NFInstances, NFClasses, etc. in a package in git. The whole package is hydrated by the collaboration of humans doing edits of the topology package via a GUI (that in turn uses porch), humans doing edits directly to the YAMLs via git, and automated processes (specializers) filling up empty fields of the design automatically (via porch). Drawing up such a topology from scratch I think is a good example, where the cooperation of a Network Engineer (human) is mandatory. Still, this process can be supported by automated processes as well, before the whole package is considered ready. In this case, let's say the Network Engineer comments out a whole section because it is half-ready, and then a specializer does some static IP assignment on the Networks, and during that change it also deletes the commented-out section completely. I would consider this a bug, and the human will be angry that she has to retrieve the lost section from git history.

It is actually possible to read/write comments if you parse into the YAML parser's internal node-based structures, and you can even maintain property order, but this is very cumbersome to work with and we don't normally do it.

This is exactly what the KubeObject library that I mentioned above tries to do. It gives you an interface that is similar to the "unstructured" client, but internally it works with YAML nodes and thus tries to maintain formatting (with limited success).

And if we did decide to take that route, we would also have to decide on edge cases, such as what to do with comments that appear above or next to properties that you've deleted. Do you keep them? If so, they would appear wrong to a human reader, because what they were commenting on is gone. Comments in YAML are not like comments in Go, where they are associated with the entity below them or on the same line. In YAML they are simply metadata embedded in the stream. Proper preservation of comments would require quite a complex semantic system, and I really have to wonder whether doing so would be valuable for an automation system like ours.

You are right, there is a limit to what we can do in terms of keeping formatting. The most problematic case that I saw during Nephio R1 development was keeping the formatting of list items, even if the list is reordered, or some fields of an item have changed, etc. Still, for the reasons above, I would at least try to keep as much of the formatting as possible. If we can do this inside the PackageRevisionResources controller (inside the API server), meaning that we can expose the interface that you proposed while hiding the painful task of keeping the formatting from the users of the API, that would be wonderful.

tliron commented 6 months ago

I actually liked your initial interface (flat list of KRM resources) better.

Me too, really. I personally think YAML is our "source code" and should essentially be discarded once it enters the system, in the same way YAML manifests are discarded by Kubernetes/etcd and only exist for the kubectl client. We should not have to care about the original directory structure or files, just the resource data.

In TKO it's indeed just data that can be stored in the database as CBOR. (And document databases such as MongoDB allow powerful indexing and querying of exactly such structured data.)

I would go further and say that even YAML manifest "source code" is not such a great front-end for architects. Languages such as TOSCA provide a much better "source code" for humans and GUIs. Internally everything is KRM, sure, because that's what Kubernetes speaks, but even KRM is "human readable" only for engineers with specific skills.

I can imagine that human intervention can be a mandatory part of a workflow. For example, I have a PoC where I keep the NFTopology, the Networks, the NFInstances, NFClasses, etc. in a package in git. The whole package is hydrated by the collaboration of humans doing edits of the topology package via a GUI (that in turn uses porch), humans doing edits directly to the YAMLs via git, and automated processes (specializers) filling up empty fields of the design automatically (via porch). Drawing up such a topology from scratch I think is a good example, where the cooperation of a Network Engineer (human) is mandatory.

You start by describing "Day 0", the design of the blueprints, which indeed is all about humans (possibly with AI assistance). But once it's fed into automation, I have a lot of difficulty imagining humans participating. Even our current support for "approving" packages before deployment seems an unnecessary feature to me. If you're deploying millions of workloads across hundreds of thousands of sites, do you expect any human to validate any of these deployments for correctness? Maybe in lab environments, but not in real-world use cases.

It's the same thinking that makes me very skeptical of GitOps's supposed benefits. I really can't imagine real-world uses of Nephio involving anybody doing git diffs and commits on a running production system. And the price git exacts in scalability, as well as its poor atomicity, is very significant.

This is exactly what the KubeObject library that I mentioned above tries to do. It gives you an interface that is similar to the "unstructured" client, but internally it works with YAML nodes and thus tries to maintain formatting (with limited success).

Nice effort! :) But again, I question our ability to do this well, and even more seriously question the necessity of this feature. I don't see humans poking into specialization. I would rather focus on the data itself and not worry about how it's presented to humans after Day 0.

Back to our issue, I have always been unhappy with this YAML decoding/encoding for every interaction with PackageRevisionResources. It is an unnecessary waste of compute resources.

henderiw commented 6 months ago

We always wanted the ability for humans to manipulate the result after hydration/specialization, and this requirement remains critical, as the automation can also make mistakes, or things can change. The fact that we can handle this through source control is an important characteristic we should not give up, and I know a number of customers like this aspect a lot.

tliron commented 6 months ago

Allowing user approval or even manipulation is something that can happen whatever the backend is; it's not a feature unique to git or any VCS. One just needs to provide a UI over it. The role of comments in YAML for this feature is not clear to me at all. The original comments were in the blueprint, but if we maintain these comments during specialization, would they even be relevant in the specialized result? Actually, they can be misleading: a comment that was true for the blueprint might be completely wrong after a kpt function changed a value. E.g. a comment like # Enable buffered transport would remain, even if the kpt function disabled properties related to buffered transport. A human would be very confused. And I'm not even getting into the issues I mentioned above regarding the challenge of keeping YAML comments "close" to whatever it is they are commenting about: a property? a section? something else? Comments in YAML are not associated with YAML elements; they are free-floating.

So I think we are talking about a few different things here:

1) Allowing for user changes after specialization. Regarding this, I don't have strong opposition, because I think it has value in lab work, but I do question the value for customers in production. I don't have a clear user story for a customer looking into a whole bunch of complex KRM and being able to discern what is correct and what isn't for a particular site. Add massive scale, and it becomes even less realistic. But I understand the customer need to maintain the possibility of control. Again, no strong opposition from me for this feature. In essence it's simple to implement: we have the data, we can show it to users and let them approve. In Porch we handle it in a rather complex manner via a separate approval controller. In TKO I just added a column to the backend (an indexed column in SQL).

2) Do we need to maintain comments in YAML for the above feature? I can't think of a way to do so reliably, both from a technical perspective (keeping YAML comments in the right place despite manipulating the KRM) and, perhaps more importantly, from a designer-intent perspective (are the comments still relevant, or misleading, after specialization changes, even if we can find a way to keep them in the "right" place?). Again, I remain very skeptical about this idea of humans dealing with this data, let alone massive amounts of it. If we want to make this feature a first-class experience, dragging along the original YAML comments from the blueprint does not seem to me to be helpful, and in some cases it can make things worse.

3) Do we need git for any of this? No, we do not. Data, whether it's stored in textual YAML or some other format, can be stored anywhere and presented to a human user for approval or manual manipulation.

Actually, there is another point I want to make here: We try hard to preserve the file and directory structure in git. But while that structure could make sense for the blueprint, it might not always make sense during specialization. For example, imagine a kpt function that generates a lot of KRM from a single resource. Where should that KRM live in the file/directory structure? It might make sense to put it in the same file as that one resource. But ... maybe the designer organized files and directories in a different way. For example, maybe all the networking stuff should sit under networking/. The kpt function has no idea what organization the designer intended. So, the resulting specialized package might be a disorganized mess.

This goes back to my repeated point that Day 1+ packages intentionally diverge from Day 0 blueprints. What started out as a human design has now become machine data.

henderiw commented 6 months ago

No, we are not talking about different things here:

  1. We always wanted manual or automated approval, and the reason for manual is exactly to have the ability to change things if they were not done properly. Comments are valuable since they can express context, and the person validating can get info from them. Similar to code.
  2. Yes, see bullet 1.
  3. None of what we discuss depends on git, so git is not relevant here; it is a decision of the pipeline and irrelevant to this discussion. We are talking about the context the data gets.

Keeping the comments has no negative effect on the user experience; if anything, the opposite. We can provide views with or without them. But if the info is not there, we cannot serve all scenarios, and the user experience will be degraded.

kispaljr commented 6 months ago

It seems to me that we have diverged far from the specification of the PackageRevisionResources API. So I propose to discuss the much broader Nephio strategic questions that came up here with a much wider audience, i.e. on Slack, in GitHub discussions, in SIG meetings, etc., and to come back to this technical topic after we have answers to those underlying strategic questions.

kispaljr commented 6 months ago

I know that we deliberately leave this thread be until the community figures out the bigger strategic goals (i.e. how we want to support human involvement in hydration), and I agree with this approach. So I put my proposal here only "for the record", and not to immediately reopen the topic.

My proposal would be to do the YAML parsing in the porch API server, as @tliron suggested, but instead of a list of map[string]any we should present a list of KubeObjects to the consumers of PackageRevisionResources. Actually, there is a named type for a list of KubeObjects, unimaginatively called KubeObjects, with handy methods. So PackageRevisionResourcesSpec would look like this:

import "github.com/GoogleContainerTools/kpt-functions-sdk/go/fn"

type PackageRevisionResourcesSpec struct {
    // [...] unmodified fields

    // KRM package contents.
    Resources fn.KubeObjects `json:"resources,omitempty"`
}

My rationale is that a "list of KubeObjects" provides a more comfortable interface to work with than map[string]any (I admit this is a matter of taste), and it also at least makes it possible to keep formatting in git, while parsing into plain Go maps would deny even the possibility of doing so.
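A small usage sketch of what that interface could feel like, assuming the fn.KubeObjects slice type and the standard KubeObject accessors from the kpt function SDK (the podNames helper is hypothetical):

import (
    "fmt"

    "github.com/GoogleContainerTools/kpt-functions-sdk/go/fn"
)

// Hypothetical helper: walk the package contents with KubeObject
// accessors instead of raw map lookups.
func podNames(resources fn.KubeObjects) {
    for _, obj := range resources {
        if obj.GetAPIVersion() == "v1" && obj.GetKind() == "Pod" {
            fmt.Println(obj.GetName())
        }
    }
}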

tliron commented 6 months ago

Thanks @kispaljr. I would say another option for Go would be to use Unstructured, because it enjoys deep support inside the Go client libraries (meta accessors, etc.). But anyway, this is just for Go; every client language/framework would use whatever they need to manipulate the package data.

Generally using Go with KRM tends to be painful! But it's the "native" language of Kubernetes and has the best support.

henderiw commented 6 months ago

The big advantage of raw data is that it allows for multiple possibilities. Let's say we want to take some data into a catalog. We could do this without any formatting, which allows for extendable/flexible catalog management, and the usage between client and server side is agnostic as well.

liamfallon commented 4 months ago

Triage comment: part of the Porch rearchitecture.