argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0

should emit realtime metrics while node not fulfilled #13440

Open fyp711 opened 3 months ago

fyp711 commented 3 months ago

What happened? What did you expect to happen?

What happened? When I use realtime metrics at the template level, they are not emitted on every template execution.

Here is a part of my workflow template

spec:
  templates:
    - name: gen-random-int
      retryStrategy:
        limit: 11
        retryPolicy: Always
        backoff:
          duration: '20'
          factor: 2
          maxDuration: 20m
        affinity:
          nodeAntiAffinity: {}
      metrics:
        prometheus:
          - name: node_timeout_metrics
            help: Duration gauge by name
            when: '{{duration}} > 100'
            gauge:
              value: '{{duration}}'
              realtime: true

What did you expect to happen?

I think realtime metrics should be emitted as promptly as possible. With the metrics configuration above, I need the metric to be emitted as soon as {{duration}} > 100. Currently, however, it is only emitted after the node is Fulfilled.

Version(s)

main

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

templates:
    - name: gen-random-int
      retryStrategy:
        limit: 11
        retryPolicy: Always
        backoff:
          duration: '20'
          factor: 2
          maxDuration: 20m
        affinity:
          nodeAntiAffinity: {}
      metrics:
        prometheus:
          - name: node_timeout_metrics
            help: Duration gauge by name
            when: '{{duration}} > 100'
            gauge:
              value: '{{duration}}'
              realtime: true

Logs from the workflow controller

null

Logs from your workflow's wait container

null

Joibel commented 3 months ago

I do not think we should implement this.

It is a bit of an anti-pattern in metrics, and if you really feel you should conditionally store metrics you can perform that filtering at the collection stage rather than during emission.

Maybe I misunderstand the request. It looks like a feature request to add when clauses to realtime metrics.
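One possible reading of the "filter at the collection stage" suggestion, sketched under assumptions: the template emits the realtime gauge unconditionally (no when clause), and the threshold is applied downstream in Prometheus. The rule group name, alert name, and severity label below are hypothetical, and the argo_workflows_ prefix on the metric name assumes the default naming of custom Argo metrics.

```yaml
# Template-level metric from the issue, with the when clause removed,
# so the gauge is emitted regardless of the current duration.
metrics:
  prometheus:
    - name: node_timeout_metrics
      help: Duration gauge by name
      gauge:
        value: '{{duration}}'
        realtime: true
```

```yaml
# Prometheus rule file (hypothetical names): the "> 100" condition is
# applied when the metric is evaluated instead of when it is emitted.
groups:
  - name: argo-node-duration            # hypothetical group name
    rules:
      - alert: ArgoNodeRunningLong      # hypothetical alert name
        # Assumes the custom metric is exposed with the argo_workflows_ prefix.
        expr: argo_workflows_node_timeout_metrics > 100
        labels:
          severity: warning             # hypothetical label
        annotations:
          summary: Node duration has exceeded 100 seconds
```

This keeps the emitted series continuous and moves the threshold into a place where it can be changed without editing the workflow. It does not address whether the controller updates the gauge before the node is fulfilled, which is the behavior the issue is about.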

fyp711 commented 3 months ago

I will submit a fix soon

Joibel commented 3 months ago

A fix to do what?

fyp711 commented 3 months ago

@Joibel See this link https://github.com/argoproj/argo-workflows/pull/13441

fyp711 commented 3 months ago

"A fix to do what?"

@Joibel While the node is not yet fulfilled, realtime metrics should still be computed and emitted.