argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0
15.12k stars 3.21k forks source link

Allow waiting for more complex conditions on resource templates #2288

Open opyh opened 4 years ago

opyh commented 4 years ago

Summary

From what I see, Argo's successCondition and failCondition matching is not powerful enough to wait for a pod to be ready. I can wait for it to have a status.phase = Running condition, but not check if it's actually ready. I can only check for values that are not inside an array and have a key patch matching /([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]'/).

There are many more cases like this - It seems there is no clean way to wait for a value of any field wrapped inside an unordered array, for example - which is bad considering that k8s own resources and CRDs often have unordered arrays wrapping status information inside status, for example like this in Pod (note that this pod has status.phase = Running but is not actually ready to be used by following workflow steps):

status:
  phase: Running
  conditions:
    - type: Ready
      status: 'False'
    ...

Motivation

Intuitively I expected this feature to work with k8s default resources at least, which it only does partially. It looks like the k8s label selector implementation was not actually meant to be used for matching anything outside labels – which are only one level deep and namespaced using DNS syntax.

Proposal

While a workaround seems to be to replicate Argo's wait condition functionality with a custom template that has more complex matching, this feels dirty :)

Internally it seems Argo is already using gjson to create a set of paths to match against label selectors, but only a small subset of GJSON Path Syntax can actually be used as of now.

I'd propose a condition string to allow a GJSON selector that has to evaluate to true to fulfill the condition. This would allow for more powerful checks like status.conditions.#(type=="Ready").status == "True" on a Pod.

To stay compatible with existing code like successCondition: status.phase = Running and allow future extensions, successCondition could allow something like this:

successCondition:
  gsonPath: status.conditions.#(type=="Ready").status == "True"

jq and other expression support would be cool, too :)


Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

simster7 commented 4 years ago

Good suggestion. Would you be interested in contributing a fix?

iven commented 4 years ago

I tried to dig into this issue, here is some notes:

This is why Argo fails to parse complex conditions like status.conditions.#(type=="Ready").status.

From Argo's own side, it uses gjson to query json values, and already supports the above syntax.

To fix this issue, we should either change apimachinery's code, or implement a different one. I don't know which is better, and I hope these notes help others who want to contribute to this issue.