Create initial workload resources that include changes from trait and scope processing

kminder commented 3 years ago

We are experiencing timing issues with the application of changes resulting from trait and scope processing to application resources. In particular we have an example of workload type of which known customer applications can take several minutes to start or restart. The implementation model suggested by the oam-kubernetes-runtime is to update the application’s workload resource or its eventual children resources after they are created. This is causing the application to be restarted since the operator typically begins processing the resources when they are first created and may then need to restart them when a change is later detected. In addition we cannot guarantee that a user’s application will not lose data if it is restarted unexpectedly.

As a result, having an OAM standard way to apply changes for traits and scopes to workloads before the workload resources are actually created will be important for us. How can this be done?

For example consider a custom workload type similar to Deployment that declares Pods as children in the workload definition. Then imagine a custom scope (e.g. logging) that would like to ensure that all Pods are created with a specific side car. In order to avoid the creation of Pods without the side car the workload itself needs to be created correctly so that the custom workload operator initially creates the Pods correctly.

ryanzhang-oss commented 3 years ago

Thanks for the question. I think this is more of an implementation detail than a spec level issue. In our specific Kubernetes based implementation, we use a special trait called "patchTrait" that does exactly what you described. This type of trait takes effect before the application controller emits the corresponding workload to the runtime. For example, it can insert a logging sidecar to the workload podTemplate. This avoids the case that the workload spec is modified after it is created which normally leads to a restart of the application.

@wonderflow can point you to our specific issue/solutions. However, please keep in mind that it's a very runtime dependant solution.

wonderflow commented 3 years ago

Hi @kminder thanks for the question! I think this issue could be solved in the upcoming Application abstraction like below.

apiVersion: core.oam.dev/v1alpha2
kind: Application
metadata:
  name: application-sample
spec:
  components:
    - name: myweb
      type: worker
      settings:
        image: "busybox"
        cmd:
        - sleep
        - "1000"
      traits:
        - name: sidercar
          properties:
            name: "sidecar-test"
            image: "nginx"

It will gather cue template in WorkloadDefinition and TraitDefinition like below:

apiVersion: core.oam.dev/v1alpha2
kind: WorkloadDefinition
metadata:
  name: worker
  annotations:
    definition.oam.dev/description: "Long-running scalable backend worker without network endpoint"
spec:
  definitionRef:
    name: deployments.apps
  extension:
    template: |
      output: {
        apiVersion: "apps/v1"
        kind:       "Deployment"
        spec: {
            selector: matchLabels: {
                "app.oam.dev/component": context.name
            }
            template: {
                metadata: labels: {
                    "app.oam.dev/component": context.name
                }
                spec: {
                    containers: [{
                        name:  context.name
                        image: parameter.image

                        if parameter["cmd"] != _|_ {
                            command: parameter.cmd
                        }
                    }]
                }
            }
        }
      }
      parameter: {
        // +usage=Which image would you like to use for your service
        // +short=i
        image: string
        cmd?: [...string]
      }

apiVersion: core.oam.dev/v1alpha2
kind: TraitDefinition
metadata:
  annotations:
    definition.oam.dev/description: "add sidecar to the app"
  name: sidecar
spec:
  appliesToWorkloads:
    - webservice
    - worke
  extension:
    template: |-
      patch: {
         // +patchKey=name
         spec: template: spec: containers: [parameter]
      }
      parameter: {
         name: string
         image: string
         command?: [...string]
      }

It can easily handle the "patch" style trait.

kminder commented 3 years ago

I've previously raised my objection to suggestions that new non-spec user facing artifacts (e.g. Application example above) are an appropriate way to address OAM spec limitations. We consider the OAM constructs to be the user facing contract.

Also WRT to the statement about this issue being "an implementation detail" I at least partially disagree. A solution here seems like it will require updates to how traits and scope are defined. This is illustrated by what KubeVela has done in their TraitDefinition example above. The use of spec.extension seems to violate the statement "The implementation should NOT abuse this field for functional purpose." in the spec.

kminder commented 3 years ago

Also @wonderflow does KubeVela have a patch scope and have patch traits been implemented in oam-kubernetes-runtime?

resouer commented 3 years ago

@kminder Yes I believe we have reached a concrete agreement on there's will be NO non-spec user facing artifacts. For the upcoming v0.2.2 release, please check its transition from v1alpha1 to v1alpha2 (release v0.2.2) as below (it's still in draft phase so feel free to propose if you have any idea/opinion):

In detail:

ComponentSchematic is replaced by WorkloadDefinition, and all schema related features will be moved to definition object (i.e. parameters and template).
Similarly, Traits is renamed to TraitsDefinitions. ALL the definition objects are sharable and reusable across apps.
ApplicationConfiguration is renamed to Application, and it still represents the whole instance of the app.
- The .workloadSettings is moved to this object (done in v0.2.1).
- The .parameterValues will be temporarily deprecated and re-introduced as another object in user scenarios like cross-environments deploy (non-goal in v0.2.2)

Please also check https://github.com/oam-dev/spec/issues/429#issuecomment-761190917 for full background of this proposal and happy to discuss it at any time.

Also WRT to the statement about this issue being "an implementation detail" I at least partially disagree

I can understand your point but the spec alone can't empower us to do patch (or anything else). So in the impl, we started several approaches to solve this in parallel (e.g. patch by trait implementation), and finally patch by templating/DCL won out by its simplicity and efficiency. CUE is the first attempt and Helm templating is coming after.

The argument will be what's the spec of definition objects then. The full spec is not ready yet. Note that spec.extension is marked as EXPERIMENTAL , it's intended to be deprecated and replaced by template, parameters etc (see: https://github.com/oam-dev/kubevela/issues/976). My proposal is we will need to finalize the end user interface (Application) first and iterate platform builder interface (Definition) driven by real world implementation (should be mostly finalized after Helm support is done).

Also @wonderflow does KubeVela have a patch scope and have patch traits been implemented in oam-kubernetes-runtime?

I am personally not sure about the future of scope, from most user cases (and your user case), it should be more like a trait that attached to multiple components? (then we should reuse existing trait impl). The capability of patch has already shipped last week (https://kubevela.io/#/en/cue/trait?id=patch-trait), the oam-kubernetes-runtime, by design, is the dependent lib to make this happen in KubeVela.

kminder commented 3 years ago

@resouer Thanks for the detailed information. Very helpful.

However, I'm still unclear as to where this new capability is implemented. I need to know if it is currently implemented in https://github.com/crossplane/oam-kubernetes-runtime or only in https://github.com/oam-dev/kubevela. I looked through the commit log comments of oam-kubernetes-runtime and didn't see anything that obviously implemented patch traits. Can you please clarify?

resouer commented 3 years ago

@kminder Hi Kevin, note that ALL features including patch trait are implemented in KubeVela. Modification made to oam-kubernetes-runtime during developing KubeVela will be ported back to that repo.

Why that?

oam-kubernetes-runtime is by design a lib (please check its README and description) which can not even be deployed independently (it doesn't have main function for long time until we recently added one for testing purpose). It provides very limited but essential plumbing functionalities such as resource composition (enables us to build OAM based abstraction), revisioning (enables workload versioning for rollout) and orchestration (enables dependency, topology mgmt and data input/output).
After the lib is released, we move a step further to create a full OAM platform named rudr2 with it, while Crossplane maintainers have concern that such app layer system is out of their scope (they mainly focus on infra layer). Hence, rudr2 is created in OAM org with new name KubeVela instead of in Crossplane org.

I will expect oam-kubernetes-runtime lib be merged into KubeVela (probably after KubeVela 1.0), but since some community users still consume the lib directly (i.e. they built their own "KubeVela") , a transition time is needed for them to migrate to KubeVela runtime, we are actively tracking this progress.

kminder commented 3 years ago

@resouer This is concerning to us and our use case. We currently consume oam-kubernetes-runtime directly as part of our platform. We also have no interest in much of what KubeVela is becoming. For example the CLI and UI are of no interest. On the surface it appears that oam-kubernetes-runtime and KubeVela are being merged out of convenience to the KubeVela project because contributing back to oam-kubernetes-runtime is extra work.

If oam-kubernetes-runtime and KubeVela are merged into a single repo it is important to us that pieces be individually consumable (e.g. library, operator, UI/CLI, etc.)

resouer commented 3 years ago

@kminder Sorry but it seems there's a misunderstanding on what KubeVela is and where it's going to. KubeVela is by design not a half-baked lib or a UI/CLI, it's a k8s controller to provide full OAM features (i.e. an unified app delivery system). If you are familiar with what oam runtime is, there should be nothing different except supporting latest OAM spec and full feature set (e.g. patch trait). The reason it's not continued developing in oam runtime repo has been clarified in other comments.

kminder commented 3 years ago

@resouer | @ryanzhang-oss Will the examples above https://github.com/oam-dev/spec/issues/425#issuecomment-761991987 work with KubeVela 0.3.3 or 0.4.0-rc-master ? I assume the example isn't actually tested since appliesToWorkloads: - worke has a typo.

Noticing this raises an important question. How can a TraitDefinition's appliesToWorkloads ever be complete if WorkloadDefinitions are expected to be added frequently? If WorkloadDefinition versions must be named differently this problem is even worse.

resouer commented 3 years ago

Will the examples above

425 (comment)

work with KubeVela 0.3.3 or 0.4.0-rc-master ?

@kminder Please kindly refer to the documentation (https://kubevela.io/#/en/cue/trait) for supported features in latest release of vela.

Noticing this raises an important question.

Actually this is not a problem at all. appliesToWorkloads is by design work with workload characteristic, not specific component descriptor object (WorkloadDefinition). This is documented in spec (https://github.com/oam-dev/spec/blob/master/4.workload_types.md#labels) and has a tracking issue (https://github.com/oam-dev/spec/issues/392). However, it's not fully implemented since there's no enough resource to tackle it.

oam-dev / spec

Create initial workload resources that include changes from trait and scope processing #425

425 (comment)