argoproj / argo-rollouts

Progressive Delivery for Kubernetes
https://argo-rollouts.readthedocs.io/
Apache License 2.0

Add SLO support to the Datadog MetricProvider #3668

Open philippeVV opened 3 months ago

philippeVV commented 3 months ago

Use Cases

We wish to use the difference in SLO burn rate between the stable and the canary release during analysis.

Why it doesn’t work

SLOs are not supported by the /api/v2/query/scalar endpoint.

Expected behaviour

Query the Datadog /api/v2/query/timeseries endpoint to retrieve the SLO burn rate for the slo_id defined in the CR, then aggregate the returned time series into a single value that can be compared during analysis.
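The aggregation step could look roughly like the sketch below. It is only an illustration: the issue does not say which aggregation to use, so the mean default and the `reducer` parameter are assumptions, and `aggregate_series` is a hypothetical helper name. The response layout is not shown in this issue, so the function takes a plain list of points (with `None` for missing samples) rather than parsing a Datadog response.

```python
from statistics import fmean
from typing import Callable, Optional, Sequence


def aggregate_series(
    values: Sequence[Optional[float]],
    reducer: Callable[[Sequence[float]], float] = fmean,
) -> Optional[float]:
    """Collapse one time series into a single value, skipping missing points.

    `reducer` is an assumption: mean by default, but max, min, or a
    "last point" function could be passed instead.
    """
    points = [v for v in values if v is not None]
    if not points:
        # No data in the window; the caller decides how to treat this.
        return None
    return reducer(points)


# Hypothetical burn-rate series for the canary and the stable release.
canary = aggregate_series([2.0, None, 4.0])
stable = aggregate_series([1.0, 1.0])

# The analysis would then compare the two aggregated values.
if canary is not None and stable is not None:
    burn_rate_delta = canary - stable
```

Returning `None` for an empty series keeps "no data" distinct from a burn rate of zero, so the analysis can decide whether that case is inconclusive or a failure.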

The /api/v2/query/timeseries endpoint is currently in beta. Although SLO queries are not listed as supported in the documentation, Datadog support confirmed they are available.

Request format:

{
  "data": [
    {
      "type": "timeseries_request",
      "attributes": {
        "queries": [
          {
            "name": "query1",
            "data_source": "slo",
            "slo_id": "<paste slo uuid here>",
            "measure": "burn_rate",
            "group_mode": "overall",
            "slo_query_type": "metric"
          }
        ],
        "from": 1706810112753,
        "to": 1707414912753,
        "formulas": [
          {
            "formula": "query1"
          }
        ]
      }
    }
  ]
}
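For illustration, the payload above could be built programmatically along these lines. This is a minimal sketch, not the provider's implementation: `build_slo_burn_rate_request` and `window_ms` are hypothetical helper names, and the structure simply mirrors the request shown in this issue (including `data` as a list). The sample `from` and `to` values above are 604800000 ms apart, i.e. a 7-day window.

```python
import json
import time
from typing import Optional, Tuple


def window_ms(duration_ms: int, now_ms: Optional[int] = None) -> Tuple[int, int]:
    """Return (from, to) in epoch milliseconds for a window ending now."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - duration_ms, now_ms


def build_slo_burn_rate_request(
    slo_id: str, from_ms: int, to_ms: int, query_name: str = "query1"
) -> dict:
    """Build the timeseries request body for one SLO burn-rate query.

    Field names and values are copied from the request in this issue.
    """
    return {
        "data": [
            {
                "type": "timeseries_request",
                "attributes": {
                    "queries": [
                        {
                            "name": query_name,
                            "data_source": "slo",
                            "slo_id": slo_id,
                            "measure": "burn_rate",
                            "group_mode": "overall",
                            "slo_query_type": "metric",
                        }
                    ],
                    "from": from_ms,
                    "to": to_ms,
                    # The formula refers to the query by its name.
                    "formulas": [{"formula": query_name}],
                },
            }
        ]
    }


# Seven-day window ending at the sample "to" timestamp from the request above.
seven_days_ms = 7 * 24 * 60 * 60 * 1000
start, end = window_ms(seven_days_ms, now_ms=1707414912753)
body = json.dumps(build_slo_burn_rate_request("<paste slo uuid here>", start, end))
```

Serializing with `json.dumps` avoids hand-written JSON mistakes such as trailing commas.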

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

philippeVV commented 3 months ago

We are interested in contributing if there is interest in the feature and agreement on using the /api/v2/query/timeseries beta endpoint. As far as I know, this endpoint is currently the only way to query SLO metrics.

When the endpoint is queried for SLO data, the following warning accompanies the response: WARNING: Using unstable operation 'v2.QueryScalarData'