Refactoring `func.yaml` file to follow a CRD format

salaboy commented 2 years ago

Currently, the func.yaml file describes a Func project and it follows its own conventions and schema for defining and validating the available configuration values.

By refactoring func.yaml to a CRD-like resource we can:

Reuse existing Kubernetes API machinery to write & validate the configuration
If custom metadata wants to be added to the Kubernetes resources generated by the func CLI we can follow the same principles as other Kubernetes resources do.
On-ramp users to the Kubernetes world, there are a number of other projects following this approach like for example KinD, Keptn and Jenkins X. Func configuration files feel similar to other Kubernetes resources, which will be created by the func CLI
It enables the idea of having a Kubernetes Controller to deal with these resources in the future, or it makes transformations easier to deal with Kubernetes tools, for example using mutation web hooks.
By having a section for mutable fields (status), we establish a clear separation between configuration details and data is is updated by the func CLI

The following is an example of what this refactoring would look like:

apiVersion: func.knative.dev/v1alpha1 // instead of 0.19.0?
kind: Function
metadata: 
  name: level-1
  namespace: "default"
spec:
  runtime: springboot
  registry: "" // This should come from a global configuration? 
  invocation:
    format: http
  build: git
    git:
      url: https://github.com/salaboy/level-1.git
      revision: main
    builder: paketobuildpacks/builder:tiny
    builders:
      default: paketobuildpacks/builder:tiny
    buildpacks: []
    buildEnvs:
    - name: BP_NATIVE_IMAGE
       value: "true"
  template:
    spec: 
      image: salaboy/level-1/level-1:latest
      volumes: []
      env: []
      annotations: {}
      options: {}
      labels: []
      healthEndpoints:
        liveness: /health/liveness
        readiness: /health/readiness
status: // All modifications done by the CLI when updating state should go in status
  imageDigest: ""
  created: 2022-02-01T10:10:13.540854Z

@lance @evankanderson @zroubalik @matejvasek @xtreme-sameer-vohra (and community in general) feedback is highly appreciated. I've created the issue to see if we can discuss this in our weekly meeting tomorrow. Still in Draft, as I think that there are more advantages than the one listed here, wdyt?

dolfolife commented 2 years ago

I love the concept of making func.yaml more K8s aligned. I think it is natural to read files that feel familiar rather than learning a new structure/schema.

The only question I have is the aggregation of how a function is going to be built and its runtime structure.

I see benefits in keeping them separate and developing those for the best use of their domain.

For example, since we are using buildpacks and that means we already have a project descriptor project.toml. Most of the information for the build time is specified here. Having the func.yaml and project.toml might confuse people.

I wonder if we should keep the build parts in a project.toml instead of func.yaml For example, using source-url instead of the git attribute of the func.yaml.

Other than that question, I really like this idea of making func.yaml more k8s-feeling

evankanderson commented 2 years ago

It's pretty common for kubernetes controllers to use parts of the spec to generate sub-objects, which could include both a project.toml and a Knative Service. This seems reasonable, and is, in fact, how Knative Services compose a Route and a Configuration.

With that said, it feels like there are some aspects of the proposed YAML that shouldn't be part of the yaml:

registry feels like it should be an external (environmental) parameter -- if a function is being copied between a developer's workspace and a (shared) production environment, it feels awkward to have to overwrite the registry parameter from func.yaml.
The git reference in spec.build again feels like an external parameter -- if I fork a function, it seems surprising that building the function wouldn't use my fork but instead the original repo.

evankanderson commented 2 years ago

With respect to item 5, writing status locally to file seems like an interesting question, particularly in a git-operated world; would the status be committed back to git? What if I'm operating on a local clone that isn't updated to the latest?

Having func.yaml be read-write from the tool (rather than having a separate status file) also makes it hard to use .gitignore to avoid storing this state when committing, particularly if you've also changed the spec data.

salaboy commented 2 years ago

All good questions and @evankanderson you make a point on some of the fields like registry and git... I was just trying to map the existing fields to the new proposed format. I was thinking that registry should external yes.. in the kube world that would come from a configMap or something similar.

Since I started working with func I had this issue with locally modifying the func.yaml file, I think this is needed for the local dev loop, but it gets complicated when we have remote environments and source control. If we want to keep that behaviour, I would say that status will be pushed to git as we will push the changes in func.yaml today.

lance commented 2 years ago

I like this proposal, and also some of the other suggestions such as embracing project.toml to describe the build parameters. It would really ease pipeline integration with external tooling, imo. But I don't think project.toml should replace func.yaml (I know that's not what is being suggested - just being clear). Having a build descriptor external to the project descriptor would also allow for a future that supports additional build strategies besides pack.

evankanderson commented 2 years ago

I don't love TOML, so having func.yaml have a YAML equivalent of the necessary settings makes sense to me.

evankanderson commented 2 years ago

Two further comments:

Adding apiVersion and kind makes it easy to detect if we've been handed a "wrong" YAML file, and makes it easier to iterate by adding explicit conversion steps rather than trying to "smear" field value changes.
I'm not sure whether we need a full set of metadata (e.g Kind doesn't have metadata, but does have apiVersion and kind). That's not a "no", but more of a "maybe".

lkingland commented 2 years ago

evankanderson: writing status locally to file seems like an interesting question, particularly in a git-operated world

Definitely right. Transient status information that is not correct to be checked into git is currently being written to the directory ./.func in a Function, and this directory is added to .gitignore. This should include anything that is necessary for allowing the user to take actions, but does not alter the state of the Function.

We are of course accustomed to thinking of a repository as a reusable library, but as Evan points out, we are in a git-operated world where the Function codebase is Infrastructure as Code. The repository in which func.yaml is committed is the source-controlled representation of the intent of a deployed Function. Hopefully in the not too distant future a git push will trigger a redeploy rather than requiring a manual func deploy to manually enforce the synchronization!

My guess is we will need an Operator for that, for which this proposal might be a good first step towards the CRD it uses.

evankanderson commented 2 years ago

It might be interesting to look at e.g. helm post-processing or argo for some examples.

Single-disclaimer: VMware Tanzu does a lot of additional composable thinking about converting source code to a deployable artifact, so I may have commercial conflicts.

Double-disclaimer: I'm nervous about assuming that func as a program / interface will know best about how to get a user's code to production. Between me requirements like SBOM, signing, code and image vulnerability scanning, and possible future requirements, an end-to-end solution is probably going to fall outside the functions purview for larger customers rubbing a gitops workflow. We should think about how to give them components that they can compare with their other tools.

lkingland commented 2 years ago

evankanderson:

registry feels like it should be an external (environmental) parameter ...

The git reference in spec.build again feels like an external parameter

Agreed, and these are further examples of whyfunc.yaml is slightly different than a Function CRD; though it could certainly be the source from which one is generated. It's the state of Function metadata which, in conjunction with the source code and assets themselves, constitute a fully serialized Function suitable to be thought of as the instance's source: the repository.

Generating a CRD from the current state of a Function's git-ops style repository seems reasonable. For example, to enable the automated re-build and deploy of Functions when a CVE is released for an underlying image (opt-in of course, your comment on creating composable pieces being a point well-taken). Perhaps this proposal is a good start for a CRD that is generated from the Function's current state?

lkingland commented 2 years ago

Providing an end-to-end out-of-the-box solution created from composable pieces internally indeed seems the best approach. Wait, does that make me a systemd apologist?

lkingland commented 2 years ago

apiVersion: func.knative.dev/v1alpha1 // instead of 0.19.0?

For a little background, in function.go:

type Function struct {
    // Version at which this function is known to be compatible.
    // More specifically, it is the highest migration which has been applied.
    // For details see the .Migrated() and .Migrate() methods.
    Version string `yaml:"version"` // semver format

This is used when applying the automated migrations (func.yaml is an autogenerated file) when a Function is loaded from disk into memory and it depends on the semver format when doing comparisons. So these two may not be compatible without some deeper changes or using separate fields.

xtreme-sameer-vohra commented 2 years ago

It enables the idea of having a Kubernetes Controller to deal with these resources in the future, or it makes transformations easier to deal with Kubernetes tools, for example using mutation web hooks.

The CRD approach definitely helps in this regard. As an example, this would make it easy to extend the Sugar controller to do some automation around Trigger/Broker creation on behalf of functions -> driven by labeling/annotations.

salaboy commented 2 years ago

@evankanderson @lkingland @lance We can talk about this in today's meeting. I feel confident to maybe go the next step and think through all these comments and suggestions and comeback with a proposal that we can link here.. but first let's catch up later today in the weekly meeting.

salaboy commented 2 years ago

For my own records and so I don't forget:

As mentioned in the func WG meeting, a good starting point might be to create a command for func-cli to transform the func.yaml to a CRD like resource (some kind of export functionality). This would let us show the value and also test some assumptions.

Having the use case of transforming a CRD Like resources using a mutation webhook for a supply-chain tool to build and deploy our project can be the main reason for this new format. Which also will keep things more composable.

zroubalik commented 2 years ago

I really like the structured func.yaml - it is much more clear from UX. I like it.

If we are going to do this kind of big change, then I think that we should add one thing that would improve func.yaml a lot - support for comments in the file. This would probably require some custom marshaller, but I think that we should do this.

salaboy commented 2 years ago

Document for proposal: https://docs.google.com/document/d/188xgLIXBTZZFdFL26JpVksPmQ1g-ADyAXe0EZDoQ3W4/edit#heading=h.pvswm4g8giw9 , ping me if you don't have access

salaboy commented 2 years ago

@evankanderson @lance @lkingland @dolfolife @zroubalik @matejvasek It will be great to have some kind of voting between the different proposals for me to proceed.

I would love to hear concerns about having multiple resources with more targeted behaviours in contrast with keeping things as they are today.. a single file for build and runtime, which might complicate things even when we think about local vs on cluster build (you can only have one configuration today)

evankanderson commented 2 years ago

I added a proposal #3 (and maybe 3.5) which defines the FunctionSpec as the union of FunctionBuildSpec and either ConfigurationSpec or RevisionSpec.

I think there are some advantages of storing the spec in a shape which aligns with the underlying resource definitions on Kubernetes; the chief one is that it provides an on-ramp for users to learn about the underlying resources and resource model.

salaboy commented 2 years ago

I will try to get a draft PR with Proposal #3 so we can start reviewing the details

zroubalik commented 2 years ago

salaboy commented 2 years ago

I've created a very draft PR with the new structs as defined in proposal 3, https://github.com/knative-sandbox/kn-plugin-func/pull/898, I've added into the PR description the two formats and how the export command is working.

I do see 2 big challenges to move this forward:

Move marshallers to use: "sigs.k8s.io/yaml"
Use ConfigurationSpec from v1 "knative.dev/serving/pkg/apis/serving/v1" to define the runtime configurations of the functions. This is great because we reuse the same template, but we need to stop defining our own types.. for example for Env variables, Volumes, Time, etc and use Knative or core Kubernetes types.

Even with these two big challenges, it feel that this is the right step forward, as aligning with Knative types and using the same marshallers will reduce the mismatch between projects. Comments and feedback are more than welcome.

zroubalik commented 2 years ago

Use ConfigurationSpec from v1 "knative.dev/serving/pkg/apis/serving/v1" to define the runtime configurations of the functions. This is great because we reuse the same template, but we need to stop defining our own types.. for example for Env variables, Volumes, Time, etc and use Knative or core Kubernetes types.

Isn't this a step back? If I recall correctly, that's exactly something we were trying to avoid, to hide the complexity and YAML editing from users. I agree that sharing specs is in general the right approach, I am not sure if it applies to this as well.

salaboy commented 2 years ago

@zroubalik maybe I am missing some context here about previous conversations. If you check the PR https://github.com/knative-sandbox/kn-plugin-func/pull/898 you can see both files side by side. I do agree, the second YAML file has more indentation, as the structs are more complex. Something that was discussed and is specified in the document is that at least the YAML files and conventions are closer to Kubernetes, hence for someone who is used to working with Kubernetes and for Kubernetes tooling this new format is more natural.

I do see a lot of advantages in using something like ConfigurationSpec, as it is basically allowing us to configure all the runtime configurations for a function, without pushing us to define the structure to do that in a different way, which we will need to maintain forever. If we support ConfigurationSpec there are no more things to add. Another important advantage is that empty fields don't need to be marshalled and can be omitted to allow us to have very concise configuration files. So maybe, to simplify the YAML what we need is to concentrate on sensible defaults. I don't see the structure being really complicated for this simple example, I might be too biased, because I've been using Kubernetes for too long now:

apiVersion: func.knative.dev/v1alpha1
kind: Functions
metadata:
  annotations:
    invocation: cloudevent
    runtime: go
    version: 0.19.0
  name: func-test
spec:
  build:
    build: local
    builder: gcr.io/paketo-buildpacks/builder:base
    git: {}
  template:
    spec:
      containers:
      - image: docker.io/salaboy/func-test:latest

evankanderson commented 2 years ago

Use ConfigurationSpec from v1 "knative.dev/serving/pkg/apis/serving/v1" to define the runtime configurations of the functions. This is great because we reuse the same template, but we need to stop defining our own types.. for example for Env variables, Volumes, Time, etc and use Knative or core Kubernetes types.

Isn't this a step back? If I recall correctly, that's exactly something we were trying to avoid, to hide the complexity and YAML editing from users. I agree that sharing specs is in general the right approach, I am not sure if it applies to this as well.

I don't think we should re-use the Service CR. I do think that re-using one of the existing types will make it easier for users to transition between Knative YAML resources and Func YAML (in both directions), which makes it an attractive on-ramp.

My suggestion would be to use one of RevisionTemplateSpec (the sole parameter in Configuration or RevisionSpec (and have a different way of supplying the correct object metadata, at least in terms of annotations and labels)

evankanderson commented 2 years ago

... that empty fields don't need to be marshalled and can be omitted to allow us to have very concise configuration files. So maybe, to simplify the YAML what we need is to concentrate on sensible defaults.

This has been very useful for Knative; as Matt Moore points out, a valid Knative spec fits in a single tweet:

apiVersion: http://serving.knative.dev/v1
kind: Service
metadata:
  name: foo
spec:
  template:
    spec:
      containers:
      - image: ko://github.com/mattmoor/grpc-example/cmd/server

apiVersion: func.knative.dev/v1alpha1
kind: Functions
metadata:
  annotations:
    invocation: cloudevent
    runtime: go
    version: 0.19.0
  name: func-test
spec:
  build:
    build: local
    builder: gcr.io/paketo-buildpacks/builder:base
    git: {}
  template:
    spec:
      containers:
      - image: docker.io/salaboy/func-test:latest

I'm assuming that we have the image parameter here to give an idea of where the container image should end up, but if we could generate that automatically, we could cut off the template block altogether for simple cases.

I'm also not 100% on using annotations to indicate invocation style and so forth; where would we see something like the autoscaling.knative.dev/min-scale annotation represented on the function?

lance commented 2 years ago

I'm assuming that we have the image parameter here to give an idea of where the container image should end up, but if we could generate that automatically, we could cut off the template block altogether for simple cases.

We do already derive the full image name, however it's typically only done once when the function is built the first time. https://github.com/knative-sandbox/kn-plugin-func/blob/00d5a8272284ea40ebeefa4f22f12c2d375aadae/function.go#L280-L293

We would still need to store some information such as the registry, and func.yaml has of course been the place where we do that.

I'm also not 100% on using annotations to indicate invocation style and so forth; where would we see something like the autoscaling.knative.dev/min-scale annotation represented on the function?

As it is today, there is an options field that enables this, but that's not really addressed in this proposal. I suppose it could also be an annotation.

https://github.com/knative-sandbox/kn-plugin-func/blob/main/docs/guides/func_yaml.md#options

zroubalik commented 2 years ago

TLDR: I do like the proposed solution in this issue description, I don't like the approach with ConfigurationSpec.

The "problem" I have with this proposal is, that we are basically going the opposite direction than we originally wanted to go -> We wanted to create a solution for developer and users, who would like to just implement a business logic and those who are not interested in Kubernetes and Docker (Containers). We wanted to "hide" all the yaml engineering, k8s concepts etc, that users need to know, all the machinery needed for the creations of containers and what not.

We shouldn't forget that there's is still a huge number of devs, who are not familiar with all these concepts as we do and who are not interested in learning those at all. They just want to implement a microservice in a Java/Javascript/foo_language in a framework that they are familiar and comfortable with.

To give you a specific example:

setting env variables, mounting Secrets/ConfigMap -> we introduced a very simple and easy way to do that: https://github.com/knative-sandbox/kn-plugin-func/blob/main/docs/guides/func_yaml.md#envs, the k8s approach requires more yaml editing
knative autoscaling options: again, we have a one place where you need to set those: https://github.com/knative-sandbox/kn-plugin-func/blob/main/docs/guides/func_yaml.md#options
etc

The func.yaml file was introduced as a place for users to specify needed options for a function and to host some function related metadata that we need to store. IMHO it shouldn't invent yet another CRD with all the stuff that's really not interesting for users.

I absolutely support the intention to refactor the func.yaml file and give it some structure. But we shouldn't overengineer this and produce big long CRD.

evankanderson commented 2 years ago

We would still need to store some information such as the registry, and func.yaml has of course been the place where we do that.

I'm wondering whether the registry should be part of the function definition; it seems plausible that you might build and push a function to different registry locations for local-ish development and production deployment.

It seems like we might want a func.yaml for invariants (like build pack or resource requirements) and an env.yaml or the like for per-environment settings. This would allow checking in and sharing the func.yaml file across team members and deployments, while keeping per-environment settings in a separate file either outside source control or as a set of source-controlled files.

evankanderson commented 2 years ago

To be clear, I think the ideal func.yaml is zero lines, and I don't want to make users enter a lot of boilerplate or even have to read it. However, if users do have to interact with it, I'd love to get a little bit of reuse and familiarity with Kubernetes rather then reinventing the wheel.

I'm guessing for min/max scale, the objection was that the default Knative annotations are long and namespace, which might be a valid complaint -- if so, let's see if there are good ways to tackle that while still giving power users the ability to integrate their function with other Kubernetes tools that might be leveraging annotations.

salaboy commented 2 years ago

@evankanderson to your point if we can aim to have a minimal func.yaml file following the Kubernetes Resource-style, we can always use the CLI to modify it, also following other Kuberrnetes CLI, but this instead of changing remote resources, will change the local func.yaml file.

So for a minimal resource like this one:

apiVersion: func.knative.dev/v1alpha1
kind: Functions
metadata:
  name: func-test
spec:
  build:
    git: http://github.com/salaboy/func-test
  template:
    spec:
      containers:
      - image: docker.io/salaboy/func-test:latest

We can default all the other values, and auto-detect the kind of build based on the set properties.

For scaling and complex annotations, we can always use something like:

func config set-min-scale 3 // We will need to define the right way to do this

The same can be achieved with IDE tooling and because we will have a schema for the file in VSCode and other IDEs you can have autocompletion.

zroubalik commented 2 years ago

template: spec: containers:

image: docker.io/salaboy/func-test:latest

^^ this specific part is the example of what I don't like about this approach. Why do we need to have there that complex structure? (ie template.spec.container.image). I understand the reason from k8s-world pov, but we should really try to look at it from Users POV.

To give you an example: Let's say I am a Java developer and I would like to specify some property (eg. foo) in the config, and I think that is should be somewhere in this part of config, but where should I put it? To template.foo or to template.spec.foo or to template.spec.containers.foo or to template.spec.bar.foo etc. 🤷‍♂️ So eventually I will have to refer to the documentation, is it not obvious. This is the struggle that users often have with long complex YAMLs, imho.

IMHO, let's design the stuff so it is pleasant for users in the first place, the main design decisions shouldn't be based on the internal technical aspect or some "technical purity".

CC @nainaz @lance @lkingland @rhuss

salaboy commented 2 years ago

@zroubalik I totally get your point on complexity, but I believe that the core of the issue is that we are using a YAML file and configuring a Knative Service. I am trying to find the right balance and minimum overhead to allow users to provide configurations for the resulting service. If we have our own structs to make the life of the user simple, we will limit on the configuration side.

By using just a RevisionTemplate, instead of ConfigurationSpec, for example we get something like:

apiVersion: func.knative.dev/v1alpha1
kind: Functions
metadata:
  name: level-1
spec:
  build:
    buildEnvs:
    - Name: BP_NATIVE_IMAGE
      Value: "false"
    git: http://github.com/salaboy/func-test
  runtime: 
    containers:
    - image: docker.io/salaboy/level-1:latest

I like the spec.build and spec.runtime separation, but I have conflicting thoughts about confusing developers. I do believe that having a similarity with a Kubernetes resource will help in this scenario.

For the options documented here: https://github.com/knative-sandbox/kn-plugin-func/blob/main/docs/guides/func_yaml.md#options

I think that we can provide different alternatives for example: 1) Simplified annotations (metadata.annotations), following what is documented https://github.com/knative-sandbox/kn-plugin-func/blob/main/docs/guides/func_yaml.md#options, We can translate these simplified annotations into the right object fields when we read the func.yaml file 2) Allow the trained user to enter a ConfigurationSpec to configure the Knative Service

salaboy commented 2 years ago

@zroubalik I am trying not to go for the tecnical purity approach for sure.. That's why the original proposal was all about exporting the existing format into a new format that is closer to K8s. If we have import, export we can fine tune the more advanced configurations in a more complex format with an opt-in approach, instead of forcing the user to go with the complex approach by default.

If we have the user in mind, we also need to try to make sure that we don't limit the possible configurations just for the sake of simplicity. If the user wants to configure something that can be configured in a Knative Service and it can't he/she will get frustrated.

I don't know if we can come up with a format that is simple, but can allow users to provide complex configurations without maintaining our own internal format. Hence the export/import might be the first step to move the conversation forward, if we get enough feedback of people using the export because X or Y, then we can make it the default format.

Don't take me wrong, I love the healthy discussion here, as finding the right approach on my own is not going to happen.

zroubalik commented 2 years ago

Yeah, absolutely agree that it is good that we have this discussion :)

If we have the user in mind, we also need to try to make sure that we don't limit the possible configurations just for the sake of simplicity. If the user wants to configure something that can be configured in a Knative Service and it can't he/she will get frustrated.

Do you have any specific example?

salaboy commented 2 years ago

As discussed in yesterday meeting, the group agreed on two things (please correct me if I am wrong):

Refactoring of the func.yaml file into a simple but ordered structure (this should look like a CRD, but without reusing complex structures for now).. the effort from the beggining was to keep this file simple building a higher-level abstraction away from Kubernetes and containers. We are not looking for feature parity here.
A second "feature" to add is exporting the Knative Service that is created by func into a file which can be used to provide more advanced configurations at the Knative Service level. If the file is present in the func project, then the file takes precedence over the configurations specified in the func.yaml file (this is related to the deploy operation)

salaboy commented 2 years ago

@lkingland @lance @zroubalik @dolfolife after a couple of iterations looking at how to do the refactoring I've arrived at the following structure:

version: X.X.X
name: myfunc
invocation: 
  format: http
build: 
  // all build related parameters here
run: 
  // all run parameters here

Nothing surprising here, but even trying to apply this "small" change has a large impact on the CLI. A more complete example:

version: 0.0.0
name: testfunc
created: 2022-05-25T22:44:47.36886+09:00
invocation:
  format: http
build: 
  build: local
  git: {}
  builder: ""
  buildpacks:
  - paketo-buildpacks/go-dist
  - ghcr.io/boson-project/go-function-buildpack:tip
  buildEnvs: []
run: 
  namespace: ""
  runtime: go
  registry: ""
  image: ""
  imageDigest: ""
  volumes: []
  envs: []
  annotations: {}
  options: {}
  labels: []
  healthEndpoints:
    liveness: /health/liveness
    readiness: /health/readiness

Before moving forward with all the changes I would like to try to answer some questions to make sure that I will not spend too much time going in the wrong direction:

We are doing this to make sure that moving forward 1.0 we have a more clear structure for the func.yaml file. Are we all in agreement with this change? Is there any other low-hanging fruit that we might include in this change?
I've checked the functions_migrations_unit_test.go and I started wondering if we want to keep that for 1.0 or not? Since we are changing the file structure we have a bunch of options:
- Create a completely separate structure to support migrations (this means that we will have a new Function struct for the new structure 😅 )
- Make the changes in the Function struct and maintain compatibility moving forwards only. This means keeping the migration mechanisms for future changes but stopping supporting the old structure (we can create a separate migration for 0.x to 1.0 format, I would do it in a separate place and step from this refactoring)
- Have a hybrid struct that supports both formats (my brain stops when I think about this option, but it is still an option)

Wdyt folks? Comments/feedback appreciated :)

zroubalik commented 2 years ago

+1, really looking forward to have more streamlined function config.

Generaly looking good, my only concern is this section:

  runtime: go
  registry: ""
  image: ""
  imageDigest: ""

Shouldn't it be part of build rather than run?

As for the migration, the ideal approach would be to migrate any old func config version to this new version with any func command. I wouldn't keep support for both formats.

salaboy commented 2 years ago

As per @lkingland's suggestion, things that don't belong to "build" or "run" shouldn't be forced to any, hence these properties that might be used by both build and run will be kept at the top level, as shown in the example below:

version: 0.0.0
name: testfunc
created: 2022-05-25T22:44:47.36886+09:00
runtime: go
invocation:
  format: http
registry: ""
image: ""
imageDigest: ""
build: 
  type: local
  git: {}
  builder: ""
  buildpacks:
  - paketo-buildpacks/go-dist
  - ghcr.io/boson-project/go-function-buildpack:tip
  buildEnvs: []
run: 
  namespace: ""
  volumes: []
  envs: []
  annotations: {}
  options: {}
  labels: []
  healthEndpoints:
    liveness: /health/liveness
    readiness: /health/readiness

Notice also, that it makes a lot of sense to rename build.build to build.type as the go property was already called BuildType but it was serialized as build since it was a top-level field.

This new structure takes us to a place where if we ever want to build a Kubernetes Operator that looks into one or more CRD to build and run functions we can take specific sections of the func.yaml file and make the transformation in a more streamlined way.

I do see the parallelism between the run section to a Kubernetes Resource that is running in the cluster, to keep it simple Run maps to an instance of a Knative Service. The build section maps to a PipelineRun instance and this got me thinking about if we really want to push the idea for func.yaml to be the description of a running (or a built) instance. Because if we do there is much more that we can do on that front.

I will keep working on a draft PR (now I am solving issues with tests) but hopefully we can jump into reviewing changes soon.

salaboy commented 2 years ago

On a completely different note.. and I am guessing that this conversation happed before.. but context will be appreciated: Do we need to render empty properties? My guess. is that we are rendering these properties:

 volumes: []
  envs: []
  annotations: {}
  options: {}
  labels: []

To help the user know what is available, but my instinct tells me that for newcomers having a shorter file will provide a better and less confusing experience.

version: 0.0.0
name: testfunc
created: 2022-05-25T22:44:47.36886+09:00
runtime: go
invocation:
  format: http
build: 
  type: local
run: 
  healthEndpoints:
    liveness: /health/liveness
    readiness: /health/readiness

Any opinions about this? I will be more than happy with someone telling me.. we already discussed that and a decision was made to render all empty properties anyways.

zroubalik commented 2 years ago

I can see the reasoning behind the decisions you have made on the fields related to image/registry, though I am still not happy with them being in the root of the config 😄

If we want to capture a resource running in the cluster, can we do it something like have a section for specifying a build, then configuration of the function and finallly the running state/deployment?:

TLDR:

rename run -> `config
introduce run or deploy sections and put image/digest related stuff there

I am not sure about the naming of those sections though.

version: 0.0.0
name: testfunc
created: 2022-05-25T22:44:47.36886+09:00
runtime: go
invocation:
  format: http
build: 
  type: local
  git: {}
  builder: ""
  buildpacks:
  - paketo-buildpacks/go-dist
  - ghcr.io/boson-project/go-function-buildpack:tip
  buildEnvs: []
config: 
  namespace: ""    /// <-  `run/deploy` might be a better place for this?
  volumes: []
  envs: []
  annotations: {}
  options: {}
  labels: []
  healthEndpoints:
    liveness: /health/liveness
    readiness: /health/readiness
run/deploy:
  registry: ""
  image: ""
  imageDigest: ""

salaboy commented 2 years ago

@zroubalik I get your point; apologies for the delay in answering this. I was trying to understand all the pieces involved in a change like this before focusing on the structure.

After checking with @lkingland, I do understand now the need for mapping the new sections (build, run, deploy) to the func commands. By doing this, we are linking the commands that can be executed with the parameters that the user can tune for the specific lifecycle phase of the function.

If we follow that approach of having sections in the func.yaml file associated with the commands, the structure that makes more sense to me is something like this:

version: 0.0.0
name: testfunc
created: 2022-05-25T22:44:47.36886+09:00
runtime: go
invocation:
  format: http
buildSpec: // we cannot use `build`, as this will break the unmarshalling 
  type: local
  git: {}
  builder: ""
  buildpacks:
  - paketo-buildpacks/go-dist
  - ghcr.io/boson-project/go-function-buildpack:tip
  buildEnvs: []
run:
  registry: ""
  image: ""
  imageDigest: ""
  volumes: []
  envs: []
deploy:
  namespace: ""  
  annotations: {}
  options: {}
  labels: []
  healthEndpoints:
    liveness: /health/liveness
    readiness: /health/readiness

If we consider that there is a logical relationship between build -> run -> deploy I find it ok to for example, read from the run section when doing a deploy (for example to grab the environment variables to be applied on deploy). It will be also ok, IMHO, to have two envs[] sections.. one in run and one in deploy.

@lance @zroubalik @dolfolife @lkingland does this make sense to you all? I will add this topic to our next meeting but feedback in advance will be appreciated.

PS: something that I've discovered while writing the migrations to change the structure is that if we want to change the format of the build field (meaning replacing a string for a struct), we will be breaking the unmarshalling of the function struct. That's why in the example, the field is called buildSpec. I am not entirely sure if we can escape that with the current migration method (that runs after the unmarshalling happens). I'm trying to figure out if a pre/post-migration can help with that, to avoid breaking backward compatibility. Feedback/help is highly appreciated.

salaboy commented 2 years ago

@zroubalik I wanted to make sure that I've covered your suggestion, let me know if I did or if I missed something. At this point I am convinced that mapping to the func commands makes a lot of sense from a functions developer perspective. Hence including run makes a lot of sense, but I do see labels, annotations, etc all related to just the deploy phase, hence having a config section doesn't make much sense to me.

salaboy commented 2 years ago

I would like to ask for feedback about: registry and image.

When I think of these properties from the point of view of where are they first used, I tend to think that this is build related. The build command will use the func name to create the image, and if it has push enabled, it will need the registry property. I would leave the imageDigest in the run section, which can be reused by deploy.

But based on @zroubalik comments and others, this doesn't seem to be a common point of view, and usually, these two properties are more related to run and deploy. While I understand that my point of view is not the most common one, I would love to get some feedback about why these properties should go in run or deploy so we start building a common understanding of our decisions.

daisy-ycguo commented 2 years ago

Hi. I just wonder if we should have a common way to config the handler function name, e.g., <module>.<function>. Because when I want to use a customized handler function name, I notice different runtime templates use different way to config and some templates even don't allow that.

Please correct me if I understand wrong, I didn't find these instructions in the documentation anyway.

lance commented 2 years ago

Any opinions about this? I will be more than happy with someone telling me.. we already discussed that and a decision was made to render all empty properties anyways.

I much prefer to keep empty properties out of func.yaml where at all possible.

It will be also ok, IMHO, to have two envs[] sections.. one in run and one in deploy.

My instinct is to disagree with this and to support the idea that deploy is OK to use the values from run. I can see how a user might want different values for the same environment variable depending if the function is running locally or not, however. What about having support for envs[] in both sections, inheriting everything from run in deploy, and allowing deploy name/values to override run name values.

I am convinced that mapping to the func commands makes a lot of sense from a functions developer perspective. Hence including run makes a lot of sense, but I do see labels, annotations, etc all related to just the deploy phase, hence having a config section doesn't make much sense to me.

I tend to agree with this. Except for (see below).

While I understand that my point of view is not the most common one, I would love to get some feedback about why these properties (image and registry) should go in run or deploy so we start building a common understanding of our decisions.

I think, as cross cutting values, these could be in a more general config section -- or perhaps even an image section.

image:
  name: <same as function name, typically, unless overriden by user>
  registry: <registry/user e.g. docker.io/lanceball>
  digest: <sha>

Overall this proposal looks good to me and I'm not rigid on most of these opinions. I would love to see this moving forward. Thanks for taking the time with it, @salaboy and I appreciate all of the good feedback here from everyone. 💯

lance commented 2 years ago

Hi. I just wonder if we should have a common way to config the handler function name, e.g., <module>.<function>. Because when I want to use a customized handler function name, I notice different runtime templates use different way to config and some templates even don't allow that.

Please correct me if I understand wrong, I didn't find these instructions in the documentation anyway.

Hi @daisy-ycguo you are correct that the handler function name is typically not configurable. I like the suggestion, however. While it is relevant to the func.yaml properties overall, I would like to keep this issue focused on how to move the existing structure to a more CRD-like format. Please feel free to open another issue requesting this feature, so that we can track it as its own capability. 👍

salaboy commented 2 years ago

@lance I kinda like this proposal:

image:
  name: <same as function name, typically, unless overriden by user>
  registry: <registry/user e.g. docker.io/lanceball>
  digest: <sha>

Let me think about that.. Tomorrow I will jump on this first thing in the morning..

salaboy commented 2 years ago

Team, after rebasing my old PR and doing some more work with migrations (which requires some discussions), here is the final structure you can find in the PR. This structure reflects the main lifecycle stages of func and their related properties. I liked @lance suggestion for the image field refactoring, and I believe this change enables us to implement that in a subsequent PR. I've also left for another PR the idea of supporting different environment variables (for example) for run and deploy. I believe that the changes introduced in the linked PR enable us to implement that functionality, but it can be done separately.

specVersion: 0.0.0
name: testfunc
created: 2022-05-25T22:44:47.36886+09:00
runtime: go
invocation:
  format: http
registry: ""
image: ""
imageDigest: ""
template: ""
build: 
  type: local
  git: 
    url: ""
    revision: ""
    contextDir: ""
  builderImages: []
  builder: ""
  buildpacks:
  - paketo-buildpacks/go-dist
  - ghcr.io/boson-project/go-function-buildpack:tip
  buildEnvs: []
run:
  volumes: []
  envs: []
deploy:
  namespace: ""  
  annotations: {}
  options: {}
  labels: []
  healthEndpoints:
    liveness: /health/liveness
    readiness: /health/readiness

I believe that with this new file format and the internal structs (BuildSpec, RunSpec, and DeploySpec), we can make integrating with other tools more accessible and more straightforward. As a user, you now have a hint of where to go and look (which section in the file) depending on which operation you are running.

Feedback is highly appreciated on the PR, so we can land this sooner rather than later.

rhuss commented 2 years ago

Sorry for being late to the party, just started to catch up and make my way through the comments. Before I continue (I have to jump off right now), one general comment: If you want to share certain attributes for different use cases (like env for deploy: and run:), it feels more natural to have deploy: and run: subsection of a top-level element, with the shared elements on the same level as deploy: and run:. That way it's clear that those sections can share data, otherwise it is non-intuitive, magic knowledge.

i mean something like

run:
  env:
     foo: bar
  cluster:
     # Cluster deploy-only stuff
  local:
     # Local only stuff

(also would love to have run as an umbrella for both, the cluster and local parts)

knative / func

Refactoring `func.yaml` file to follow a CRD format #817