nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.
Apache License 2.0
102 stars 53 forks source link

Add a "Porchfile" to kpt packages that stores porch-specific metadata #783

Open kispaljr opened 1 month ago

kispaljr commented 1 month ago

TL;DR

Add a YAML file named Porchfile to kpt packages that are managed by porch to persistently store porch-specific metadata.

Overview

Porch handles all kinds of metadata that is related to a kpt package, but that is not strictly the content of the package. Historically porch used the Kptfile to persist this kind of metadata together with kpt's metadata. This was natural, since kpt and porch were parts of the same project. Recently porch was moved out from kpt, so we, the porch maintainers no longer have ownership over the Kptfile schema, but some of the new features we are trying to implement needs us to attach new kinds of metadata to the package. Hence this proposal to add our own, porch-specific metadata file, called Porchfile, to porch-managed kpt packages to store porch-specific metadata. This would keep compatibility with kpt, but allow us to add new kinds of porch-specific metadata, at the same time.

Porchfile should contain a valid KRM object, with the local-config annotation. Here is a possible example Porchfile content:

apiVersion: porch.kpt.dev/v1
kind: Porchfile
metadata:
  annotations:
    config.kubernetes.io/local-config: "true"
  name: my-kpt-package
spec:
  readinessGates:
    - req.nephio.org/v1alpha1.Interface.n3
    - workload.nephio.org/v1alpha1.NFDeployment.upf-cluster01
status:
  conditions:
  - message: update condition for initial resource
    reason: workload.nephio.org/v1alpha1.NFDeployment.upf-cluster01
    status: "False"
    type: req.nephio.org/v1alpha1.Interface.n3
  - message: update for condition
    status: "False"
    type: workload.nephio.org/v1alpha1.NFDeployment.upf-cluster01
  lastRender:
    error: ""
    result:
      exitCode: 0  

What kind of metadata should be stored in Porchfile?

Right now porch handles two kinds of package metadata:

The proposal is to store new kinds of "persitent metadata" in the Porchfile, and maybe migrate the exisisting porch-owned metadata from the Kptfile to the Porchfile to allow future schema changes.

Alternatives considered

Kptfile

A natural alternative to adding a new Porchfile would be to keep adding new fields to the Kptfile itself. However, we can only modify our own version of the Kptfile schema, that is stored in the porch codebase, and that would make porch-managed kpt packages incompatible with the kpt CLI tool. This is a problem for porch developers and users, since we still heavily utilize the kpt command for development and troubleshooting of kpt packages, e.g. to run kpt fn render on a local copy of a kpt package, or deploying a kpt package by kpt live apply.

Metadata store

Besides git/oci, porch currently has another storage backend, the so-called metadata store. But that is designed to store "ephemeral metadata", and this proposal is about the "persistent type metadata" only.

kispaljr commented 1 month ago

cc: @henderiw @tliron @johnbelamaric @Catalin-Stratulat-Ericsson @efiacor @liamfallon

henderiw commented 1 month ago

Ok lets firsts try to highlight the problem we want to solve as this is not clear from this issue.

kispaljr commented 1 month ago

I will try to rephrase it: We want to store extra information that is related to a particular kpt package, but that is not strictly its content. All such information/metadata is now stored in the Kptfile (ref to the upstream package, readinessGates, conditions, etc.). Now we encountered a situation, when we want to store a new kind of metadata needed for porch to do its job properly, but we cannot just add a new field to the Kptfile, since kpt is a separate project now and they own the schema of the Kptfile.

So, according to this proposal, porch will to create its own file (Porchfile) that is similar to Kptfile, but separate from it, and whose schema is fully owned by the porch project. This will maintain compatibility between kpt packages managed by porch and the kpt CLI command, and still allow us a place to introduce new kind of data that we want to store in a package.

henderiw commented 1 month ago

Why does it have to be a file and not just metadata to the PR CRD?

kispaljr commented 1 month ago

That is what I tried to justify in the original wording of the PR, by mentioning the use case when e.g. a vendor's blueprint repo is registered in multiple customers' porch instances. But another good example would be if you unregister a repository from porch (by deleting the Repository object) and re-register it. All labels and annotation added to the PackageRevision CRs will be lost after re-registering the repo. And this is fine, but there is types of package metadata that you don't want to lose when you unregister the repo from porch, for example: the reference to the upstream package, or the readinessGates. This kind of metadata should be stored alongside the package in git/oci. And the easiest way to add this information to git/oci is to create a file. I would say that this is reasoning behind having a Kptfile in the first place.

henderiw commented 1 month ago

There is some conflicting info here:

  1. you are talking about a blueprint, you don't want specific information stored in a common blueprint repo.

and some info specific to porch implementation.

  1. the fact you loose metadata if you deregister a repo it is a bug in porch. The PR is a way to convey information to the user about the result of things that happened. I don't believe it is good practice to store metadata in git. The fact we use the kptfile to convey information like readiness is also not the right way.
kispaljr commented 1 month ago

Yes. I agree, but the backend storage behind PackageRevision objects is git, isn't it? Similarly how normal CRs are stored in etcd.

As far as I can see, all persistent data presented via PackageRevision objects has to be somehow stored in git. Either encoded in the name of a git branch (how we store workspace names for draft/proposed packages), or in git annotations (like how store workspace names for published packages), or in files in git (that's what we do with readinessGates in Kptfile), or maybe using some other git concept, but definitely in git.

Am I missing something?

[In my understanding the metadata store in porch was designed for ephemeral data, that is known to be lost on unregistering the repo]

kispaljr commented 2 weeks ago

For the record: I completely rewrote the original proposal text in the first comment, based on the discussion with @henderiw . I hope the new version is a more concise and digestible description of what we want to add and why.

tliron commented 2 weeks ago

I support this idea and indeed implemented it in the TKO experiment, where it's used to keep track of bookkeeping metadata (URL of the original template used) as well as allowing for some optional user-controlled knobs for changing default behavior, such as disabling auto-approval (example).

I don't think "Porchfile" is the best name, though, as hopefully it can be agnostic to any specific implementation.

If we accept the basic idea we can move on to discussing its structure and what should be included.

kispaljr commented 2 weeks ago

By all means, let's find a better name for this file :)

henderiw commented 2 weeks ago

this is the proper way to solve this, rather than keep hacking. My 2 cents. https://github.com/nephio-project/nephio/issues/598

kispaljr commented 1 week ago

I see this as a different, independent problem. I assume even if PackageRevisions are CRs, we want to keep registering Repositories and auto-discovering the packages in them (i.e. there will be a PackageRevision controller that creates/updates/deletes PackageRevisions based on what is in git). If this is true, then we still have to decide where the "persistent metadata" attached to a package is stored between Repository unregistration and re-registration. In other words we have to decide where the PackageRevision controller will read the persistent metadata values from, during auto-discovery. Porchfile tries to solve the problem of persisting the kind of metadata whose lifecycle is bound to the lifecycle of a package in git (that I called "persistent metadata"), and not the kind of metadata whose lifecycle is bound to the lifecycle of a PackageRevision KRM object (that I called "ephemeral metadata").

henderiw commented 1 week ago

Ok I reread your new text, but the way to solve this is a bit different in my view and I am of the opinion porch should just care about access to git, not about any specialization, etc. As such a workspace name is a local matter, so it does not need to be persisted and it does not need to be coordinated across clusters. Wrt condiitons, etc that work is triggered from a PV and the corresponding technologies should report their status, so the reproducibility should come from that chain not porch.