argoproj / argo-rollouts

Progressive Delivery for Kubernetes
https://argo-rollouts.readthedocs.io/
Apache License 2.0
2.8k stars 876 forks source link

Broken Datadog apiVersion v2 analyses not specifying aggregator in v1.7.1 #3707

Open aguiraorodriguez opened 4 months ago

aguiraorodriguez commented 4 months ago

Checklist:

Describe the bug

Datadog analyses using apiVersion: v2 that don't specify an aggregator fail. The error message is:

"Error Message: received non 2xx response code: 400 {"errors":["Value of parameter 'aggregator' should be any of these [None, 'avg', 'min', 'max', 'sum', 'last', 'mean', 'area', 'l2norm', 'percentile', 'stddev']"]}"

Probably this is a side effect of https://github.com/argoproj/argo-rollouts/pull/3643 when trying to fix the analyses of v1.

To Reproduce

Use a Datadog analysis with apiVersion: v2 and don't specify aggregator:

    - name: latency-slo-v2
      interval: 60s
      count: 3
      successCondition: result >= 0.95
      failureLimit: 2
      provider:
        datadog:
          apiVersion: v2
          interval: 60s
          queries:
            a: |
              sum:http_request_duration_seconds.bucket{
                upper_bound:0.5,
                short_image:sre-podinfo,
                kube_namespace:canary-analysis,
                env:{{args.environment}},
                kube_cluster_name:{{args.clusterName}},
                kube_service:{{args.service-name}}
              }.as_rate()
            b: |
              sum:http_request_duration_seconds.count{
                short_image:sre-podinfo,
                kube_namespace:canary-analysis,
                env:{{args.environment}},
                kube_cluster_name:{{args.clusterName}},
                kube_service:{{args.service-name}}
              }.as_rate()
          formula: "a / b"

Expected behavior

Screenshots

Version

v1.7.1

Logs

time="2024-07-05T11:25:18Z" level=warning msg="Measurement had error: received non 2xx response code: 400 {\"errors\":[\"Value of parameter 'aggregator' should be any of these [None, 'avg', 'min', 'max', 'sum', 'last', 'mean', 'area', 'l2norm', 'percentile', 'stddev']\"]}" analysisrun=podinfo-78544d7dd7-285-4 metric=latency-slo-v2 namespace=canary-analysis

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

rvega-clearroute commented 4 months ago

try adding aggregator: sum under datadog (same level. as apiVersion)

aguiraorodriguez commented 4 months ago

try adding aggregator: sum under datadog (same level. as apiVersion)

But the docs state that is optional, and indeed the DD API response should be any of these [None, 'avg', 'min', 'max', 'sum', 'last', 'mean', 'area', 'l2norm', 'percentile', 'stddev'] allows it, so I think it should be possible not to specify one.

rvega-clearroute commented 4 months ago

they have removed the default value in the version 1.7.1 https://github.com/argoproj/argo-rollouts/compare/v1.7.0...v1.7.1

aguiraorodriguez commented 4 months ago

they have removed the default value in the version 1.7.1 https://github.com/argoproj/argo-rollouts/compare/v1.7.0...v1.7.1

Correct but they did it because they think it would work without a value with Datadog API v2, which seems not to be the case. The fix is thus not complete and this is broken.