argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0
15.08k stars 3.2k forks source link

proposal: change expression syntax to not overlap with Helm #13471

Open isubasinghe opened 3 months ago

isubasinghe commented 3 months ago

Summary

We currently support multiple templating/expression engines without any strong reason. This issue attempts to formalize what the new expression syntax should look like.

Suggestion

We support expr-lang as is, we feed an environment mapping containing all variables needed when compiling/running said expression. The values in this map can be accessed by using the $ prefix.

This will obviously be a breaking change, but hopefully will simplify a giant part of the expression/templating logic that exists in argo workflows and hopefully reduce user confusion in the process.

We can deprecate existing solutions in the next release, then remove these dependencies entirely in the next release + 1.

Before

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-parameters-
spec:
  entrypoint: print-message
  arguments:
    parameters:
      - name: message
        value: hello world
  templates:
    - name: print-message
      inputs:
        parameters:
          - name: message
      container:
        image: busybox
        command: [ echo ]
        args: [ "{{inputs.parameters.message}}" ]

After

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-parameters-
spec:
  entrypoint: print-message
  arguments:
    parameters:
      - name: message
        value: hello world
  templates:
    - name: print-message
      inputs:
        parameters:
          - name: message
      container:
        image: busybox
        command: [ echo ]
        args: [ "$inputs.parameters.message" ]
isubasinghe commented 3 months ago

Copying @agilgur5 's comment here

I've written about my plans to standardize on expr in a few other places (#7576 (comment), #12694, #12693 (comment), etc), figured this issue should really be the central source of truth for that and so will document here.

  1. Close all related/duplicate issues (did this some months ago)

  2. "Soft deprecate" govaluate by removing it from the docs and examples and replacing it with expr everywhere, per https://github.com/argoproj/argo-workflows/issues/7576#issuecomment-1728725234

  3. "Hard" deprecate govaluate in 3.7 by detecting its usage (or non-expr usage) and logging out a deprecation warning saying to switch to expr

    • 3.7 is an example matching the current status: 3.6 is the next minor, so if the docs are updated in 3.5's time, give it one more minor to settle in
  4. Fully remove govaluate in 3.8 and error out when its usage is detected, saying to switch to expr

    • 3.8 is an example, should be at least one minor after the previous step

Per #7576, as when already supports expr (albeit with some confusing error messages), we can do this without a breaking spec change to the CRs

This covers govaluate but not quite fasttemplate (#7810) or text/template -- the same process could be followed for those. govaluate is arguably the most confusing one (based on all the issues around it etc) to focus initial efforts on. The other, non-expression templating systems are also documented in the same place as expr, so govaluate is kind of the outlier in that sense as it is only used in when.

agilgur5 commented 3 months ago

We currently support multiple templating/expression engines without any strong reason.

Legacy and backward-compat is the main reason at this point. Each templating engine was superior to the prior historically (govaluate became unmaintained)

We can deprecate existing solutions in the next release, then remove these dependencies entirely in the next release + 1.

I would strictly define "next release" as 3.7+ here, IMO. With the OTel change (#13265), I think we have enough breakage in 3.6, which is also overdue due to 3.5 instabilities.

        args: [ "$inputs.parameters.message" ]

This syntax seems overly simplistic -- complex expressions need to be bounded on both edges somehow. Unless you're proposing that all strings become expressions? I feel like that too could be problematic The $ also has some existing overlap with expr's $env. We should probably coordinate with @antonmedv on standardizing that. Honestly it might even be worth collaborating on a YAML wrapper "interpreter" library with built-in expression support on strings or something, since that would be more generic and could be used outside of Argo. Changing the syntax from {{= and }} also seems like adding another confusing syntax for users. I would prefer to split that out as its own breaking change, once expr is already standardized.

I'm also not sure what the purpose of another duplicate of #9529 is, especially if you're verbatim copying text from it and referencing it? I did take a decent amount of time in finding and closing all duplicates and linking related issues, which making more of kind of defeats the purpose of

isubasinghe commented 3 months ago

This syntax seems overly simplistic -- complex expressions need to be bounded on both edges somehow. Unless you're proposing that all strings become expressions?

Yeah I was proposing this, I don't see a huge issue here If I am being honest, simplicity is why I proposed this. Users can directly rely on expr-lang's syntax & semantics without having to care about anything else. One negative is perhaps that strings need to now be quoted.

Changing the syntax from {{= and }} also seems like adding another confusing syntax for users. I would prefer to split that out as its own breaking change, once expr is already standardized.

Yeah that is fair enough.

I'm also not sure what the purpose of another duplicate of https://github.com/argoproj/argo-workflows/issues/9529 is, especially if you're verbatim copying text from it and referencing it? I did take a decent amount of time in finding and closing all duplicates and linking related issues, which making more of kind of defeats the purpose of

The purpose was effectively a direct suggestion about what to do about #9529, but perhaps I should have opened a PR with design documentation referencing #9529. I can close this and open that PR if you'd like?

agilgur5 commented 3 months ago

I don't see a huge issue here If I am being honest, simplicity is why I proposed this

Completely off the top of my head, escaping sounds like a substantial issue here. An example off the top of my head, + is an operator in expr but not in regular YAML strings.

The purpose was effectively a direct suggestion about what to do about #9529, but perhaps I should have opened a PR with design documentation referencing #9529. I can close this and open that PR if you'd like?

If there's an existing issue already, I'd rather just re-use it, as I did with sharding, for instance. And I gave concrete steps for most of it in https://github.com/argoproj/argo-workflows/issues/9529#issuecomment-1980011539 already. I wouldn't even consider the part I covered a proposal per se, but an action item of things we already need to do.

Otherwise, I did like Alan's recent use of "proposal:" issues, and did the same with #12694 (also including the existing term "RFC"). I was thinking of formalizing that as an issue template.

Yeah that is fair enough.

In this case, I would probably suggest leaving #9529 as-is and making this proposal specific to future syntax replacements for less overlap with tools like Helm etc, which my action items did not cover and would probably require more feedback as well (vs the specific items I laid out I think there is past and existing consensus or precedent on; the "deprecation" did not have consensus, and also had an older proposal in #7831 before expr was usable in when, but that window in general does have precedent, such as with the Emissary Executor)

Joibel commented 3 months ago

Otherwise, I did like Alan's recent use of "proposal:" issues, and did the same with #12694 (also including the existing term "RFC"). I was thinking of formalizing that as an issue template.

I agree with an issue template for this. docs/proposals sort of appears how we do proposals at first glance, but I think doing them as issues is how we prefer to work and is overall a better process.

antonmedv commented 3 months ago

The $ also has some existing overlap with expr's $env. We should probably coordinate with @antonmedv on standardizing that.

Yes, I was looking for some "global" variable name which should be more unique. I was considering env and global before. Is $env interferers somehow with Argo defined variables?

Legacy and backward-compat is the main reason at this point.

I'm going to create a parser for govaluate for ease of migration to expr. Probably this parser can be used to simplify migration to expr.

agilgur5 commented 3 months ago

Is $env interferers somehow with Argo defined variables?

No, not currently, but Isitha was suggesting here to change Argo's {{= }} to use $ instead, which could create overlap. If we change Argo's delimiter syntax, it would be good to coordinate with you on that.

Honestly it might even be worth collaborating on a YAML wrapper "interpreter" library with built-in expression support on strings or something, since that would be more generic and could be used outside of Argo.

If we create a shared "YAML with expr" library that would force coordination too 😉

I'm going to create a parser for govaluate for ease of migration to expr. Probably this parser can be used to simplify migration to expr.

Interesting. If it's just for Argo's case, I don't think it'd be worthwhile since only 1 field uses govaluate (when). The rest uses either expr or plain substitution with fasttemplate. But if you want to support other users of govaluate (idk how many remain at this point) that could certainly be useful.

antonmedv commented 3 months ago

If we create a shared "YAML with expr" library that would force coordination too 😉

I like this idea. 🔥 will try to research into this.

But if you want to support other users of govaluate (idk how many remain at this point) that could certainly be useful.

I think somebody still uses it.

agilgur5 commented 2 months ago

I discussed this with Isitha in the Aug 20th Contributor Meeting and pointed to my above comment on how the $ syntax would not suffice.

Isitha also agreed that we can limit this issue as a proposal for syntax replacements, while keeping #9529 as-is (that does mention overlap with Helm, but no proposal/solution to it, whereas it does directly state to replace other templating libraries with expr and my further comment outlines how).

As such, I'll rename this issue to be specific to syntax replacements. Given the problems with $ though, this would need to go back to the drawing board

MasonM commented 1 month ago

Another problem with $ is it conflicts with shell scripts. It's common to use templating with shell code, e.g.: https://github.com/argoproj/argo-workflows/blob/ad114b0472c0079cb7a982f7f577bc1e965b310b/examples/scripts-bash.yaml#L37-L40

Ideally, the new delimiter wouldn't conflict with languages commonly used in workflows. I did a little research into what other projects are doing:

Also, it'd be great if the new syntax would be stricter and wouldn't silently failing on errors, which has been the biggest source of confusion for me, and is mentioned in the comments: https://github.com/argoproj/argo-workflows/blob/ad114b0472c0079cb7a982f7f577bc1e965b310b/util/template/expression_template.go#L55-L60

We aren't going to have a chance to make that sort of change in the future because it would be too big of a BC break.

agilgur5 commented 1 month ago

Another problem with $ is it conflicts with shell scripts. It's common to use templating with shell code, e.g.:

Well we should probably disable templating inside of scripts in any case and force use of env vars to prevent injection attacks per https://github.com/argoproj/argo-workflows/issues/5114

Also, it'd be great if the new syntax would be stricter and wouldn't silently failing on errors, which has been the biggest source of confusion for me, and is mentioned in the comments

Yea I think most people agree that it is very confusing, I said as much in https://github.com/argoproj/argo-workflows/pull/13680#issuecomment-2381792128 too

We did partly workaround that with the approach I outlined in https://github.com/argoproj/argo-workflows/pull/13680#issuecomment-2381800890, although it is not entirely possible to fix due to layering/nesting. But we could statically validate and error out on a good number of templates that have less dynamism

I did a little research into what other projects are doing:

Thanks for listing this out. I wouldn't necessarily recommend this, but one option is to allow delimiters to be specified in the spec itself. That becomes pretty complicated with nesting etc, but it makes any syntax possible and the Workflow spec still portable (unlike a Controller configuration or similar).