Workflow Template CRD - Githubissues

0xdevalias commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

So I was looking for any issues related to 'composing' argo templates (or reusable snippets), decided to make a new one for this. For reference, the others I found:

https://github.com/argoproj/argo/issues/799: 'Macro Runner'
https://github.com/argoproj/argo/issues/795: 'HTTP Template Type'
https://github.com/argoproj/argo/issues/732: Compose Workflows
https://github.com/argoproj/argo/issues/576: Compose Workflows
https://github.com/argoproj/argo/issues/625 (?): DAG

Quite likely this relates to what is already discussed in the above with regards to ksonnet/etc:

I've been working in AWS-land recently, specifically CloudFormation/SAM, and came across this snippet today regarding transforms:

AWS CloudFormation transforms help simplify template authoring by condensing the expression of AWS infrastructure as code and enabling reuse of template components. For example, you can condense a multiple-line resource declaration into a single line in your template.

AWS CloudFormation supports AWS::Serverless and AWS::Include transform types:

In particular, the AWS::Include transform would be a really useful concept to have in Argo. That way we could 'decompose' our templates into more manageable chunks and have them 'stitched back together'.

Following this example, but simplifying, at the top of an argo workflow yaml you could add a new entry such as to 'enable' the include transform support:

Transforms:
  - 'Argo::Include'

By doing this, no old workflows will be affected, and templates have to 'opt in' for support.

Then, in the template you use something like the following:

Foo:
  Bar:
    SomeNestedKeyspace:
      'Fn::Transform':
        Name: 'Argo::Include'
        Parameters:
          Location: "./foo/bar/baz"

Location could be a local file, S3/minio/etc bucket, etc. Potentially could be any 'artifact' source argo supports maybe?

I see this as being a 'pre transform' step, before things actually get submitted to run in k8s itself. Something like:

Get workflow yaml
Does it have transforms?
If yes, for each transform type we know about: run the associated transform code
Remove the 'transforms' key from our output
Ensure the transformed workflow is valid
If valid, submit/run/etc

For the include transform:

Find any YAML keys called Fn::Transform
If transform name is Argo::Include
Load the file from the Location parameter
Make sure it is yaml/etc
Replace this Fn::Transform key + all children with the root key of the included yaml
Continue processing additional Argo::Include transforms in this file

If you wanted to get fancy, you could then handle 'nested' transforms, eg. if I include a file that also has include transforms. This could be handled by either:

At the point of 'include', process the 'template snippet' for transforms, then include that transformed snippet (need to watch out for depth issues/circular dependencies maybe?)
After including the untransformed yaml, do a 'new pass' over this workflow, finding the new transforms, handling as appropriate. I see this as basically being a loop: find first unprocessed transform, process it, loop.

I think the basic implementation of this shouldn't actually be too hard. I may end up hacking together some code for one of our projects that does basically this, so if/when I get to that, happy to share.

I also feel like this could be a nice way to allow people to 'extend' or prototype new argo features without having to change too much in the 'core', and without adding too much 'bloat' if a user doesn't actually use the feature. Might allow the 'core spec' to remain relatively simple?

paveq commented 5 years ago

Has it been thought if template steps inclusion on run-time from another K8S resource would be useful? Instead of defining templates from single workflow.yaml file, you would submit another workflow resource and include steps from there to current workflow?

paveq commented 5 years ago

Rough idea of this, where static analysis step would be managed from diffent resource:

  templates:
  - name: main
    steps:
    - - name: checkout
        template: checkout
    - - name: build
        template: build
        arguments:
          artifacts:
          - name: source
            from: "{{steps.checkout.outputs.artifacts.source}}"
    - - name: static-analysis
        template: static-analysis
          valueFrom:
            templateRef:
               name: namespace/workflow-name
               template: static-analysis
        arguments:
          artifacts:
          - name: source
            from: "{{steps.checkout.outputs.artifacts.source}}"

Ark-kun commented 5 years ago

That YAML does not seem correct.

xlucas commented 5 years ago

So is there any new stuff about this? I was looking at #1233 also. We would really see the possibility to split a workflow into different files as a very good thing for maintainability and reusability, as it gets cumbersome with the current one-file-per-workflow approach.

dtaniwaki commented 5 years ago

If no one has any objections about the direction, I'll try to implement new CRD Template.

The brief summary of the CRD functions is:

Allow nested template references
Support variables mapped in importing templates
Allow overwriting and merging values of imported templates in the importee.
Allow referencing Templates of other namespaces.
Hold no status like configmap

e.g.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: foo
spec:
  entrypoint: bar
  templates:
  - templateRef:
      name: tmpl1
      template: something
      vars: # bind variables to template
        ver: 9.4
    name: bar # overwrites name
  - templateRef:
       name: tmpl1
       template: foobar
---
apiVersion: argoproj.io/v1alpha1
kind: Template
metadata:
  name: tmpl1
spec:
   templates:
  - name: something
    container:
      image: debian:{{ver}}
      command: [sh, -c]
      args: ["echo debian"]
  - name: foobar
    steps:
    - - templateRef:
           name: tmpl2
           template: foo
    - - templateRef:
           name: tmpl2
           template: bar
---
apiVersion: argoproj.io/v1alpha1
kind: Template
metadata:
  name: tmpl2
spec:
  templates:
  - name: foo
    container:
      image: ubuntu:16.04
      command: [sh, -c]
      args: ["echo foo"]
  - name: bar
    container:
      image: ubuntu:16.04
      command: [sh, -c]
      args: ["echo bar"]

xlucas commented 5 years ago

@dtaniwaki This looks really good 👍. With this definition, would it be possible to interpolate parameters when passing them in the vars section? For instance:

vars:
  ver: {{inputs.parameters.version}}

dtaniwaki commented 5 years ago

@xlucas I have a prototype working well in my local and will create a pull request soon.

As for your question, I changed the format a little bit. I decided to utilize arguments and the arguments are resolved per reference.

e.g.

apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-whalesay-template
spec:
  templates:
  - name: whalesay-template
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-nested-template
spec:
  templates:
  - name: whalesay-inner-template
    templateRef:
      name: workflow-template-hello-world-template
      template: whalesay-template
    inputs:
      parameters:
      - name: message
    arguments:
      parameters:
      - name: message
        value: "{{inputs.parameters.message}} hello"
  - name: whalesay-template
    template: whalesay-inner-template
    inputs:
      parameters:
      - name: message
    arguments:
      parameters:
      - name: message
        value: "{{inputs.parameters.message}} hello"
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-template-nested-
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    templateRef:
      name: workflow-template-nested-template
      template: whalesay-template
    arguments:
      parameters:
      - name: message
        value: "hello"

This workflow is supposed to print hello hello hello.

I think it handles your case. What do you think?

xlucas commented 5 years ago

Well, that's even better! Using arguments is indeed more idiomatic so there's no discrepancy with the current template spec. I feel like it would fit really nicely with the current workflow structure, from a user standpoint.

However, shouldn't

spec:
  templates:
  - name: whalesay-inner-template
    templateRef:
      name: workflow-template-hello-world-template
      template: whalesay-template

Be

spec:
  templates:
  - name: whalesay-inner-template
    templateRef:
      name: workflow-template-whalesay-template
      template: whalesay-template

Instead?

dtaniwaki commented 5 years ago

Good catch! I fixed it in the PR. Thanks!

dtaniwaki commented 5 years ago

I finished the implementation in #1312. Please review it and give me your feedback!

xlucas commented 5 years ago

Amazing! I will take a 👀 at it. Thanks!

dtaniwaki commented 5 years ago

I think we can close this issue.

argoproj / argo-workflows

Workflow Template CRD #927