GoogleCloudPlatform / prometheus-engine

Google Cloud Managed Service for Prometheus libraries and manifests.
https://g.co/cloud/managedprometheus
Apache License 2.0

Feature request: GMP query API works with "pint" tool (support the ?stats=1 query parameter) #1054

Open faevourite opened 4 months ago

faevourite commented 4 months ago

Prometheus supports the stats=1 query parameter on the /api/v1/query and /api/v1/query_range endpoints. When it is set, the response includes an additional JSON field called "stats" with execution timings and sample counts. For /query:

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": ["..."],
    "stats": {
      "timings": {
        "evalTotalTime": 0.0000875,
        "resultSortTime": 0,
        "queryPreparationTime": 0.000051542,
        "innerEvalTime": 0.000029166,
        "execQueueTime": 0.00004725,
        "execTotalTime": 0.000141459
      },
      "samples": {
        "totalQueryableSamples": 1,
        "peakSamples": 1
      }
    }
  }
}
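As a rough illustration of how a client might request and read this field, here is a minimal Python sketch. The helper names `build_query_url` and `extract_stats` and the base URL are assumptions made for the example; only the `stats=1` parameter and the response shape come from the output above.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def build_query_url(base_url: str, query: str, with_stats: bool = True) -> str:
    """Build an instant-query URL, optionally adding stats=1 (hypothetical helper)."""
    params = {"query": query}
    if with_stats:
        params["stats"] = "1"
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


def extract_stats(body: str) -> Optional[dict]:
    """Return data.stats from a query response body, or None if it is absent."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        return None
    return payload.get("data", {}).get("stats")
```

A caller would fetch the URL with any HTTP client and pass the response body to `extract_stats`; a `None` result means the backend did not return stats.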

Right now this parameter causes gmp-frontend to return an error:

{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unknown name \"stats\": Cannot bind query parameter. Field 'stats' could not be found in request message.",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.BadRequest",
        "fieldViolations": [
          {
            "description": "Invalid JSON payload received. Unknown name \"stats\": Cannot bind query parameter. Field 'stats' could not be found in request message."
          }
        ]
      }
    ]
  }
}
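For tooling that wants to recognize this failure and, for example, retry without the parameter, a small Python sketch could look like the following. The function name `is_unknown_param_error` is hypothetical, and the matching is based only on the error body shown above, so it may not cover other error variants.

```python
import json


def is_unknown_param_error(body: str, param: str) -> bool:
    """Return True if body looks like the INVALID_ARGUMENT error for an unknown query parameter."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return False
    message = str(error.get("message", ""))
    # Assumption: the message format matches the example in this issue.
    return error.get("status") == "INVALID_ARGUMENT" and f'Unknown name "{param}"' in message
```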

This means we can't use excellent Prometheus tooling such as pint to lint for expensive queries, because it relies on the stats=1 parameter. This affects not just that one check, but every pint check that calls the Prometheus API. Is it possible to increase compatibility with Prometheus and start supporting this parameter?
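To show why the stats matter for this kind of linting, here is a simplified Python sketch of a cost check built on the fields shown earlier. It is not pint's implementation; the function name `check_query_cost` and both thresholds are hypothetical.

```python
from typing import List


def check_query_cost(stats: dict, max_eval_seconds: float, max_samples: int) -> List[str]:
    """Compare query stats against caller-supplied limits and return any problems found."""
    problems: List[str] = []
    eval_time = stats.get("timings", {}).get("evalTotalTime", 0.0)
    total_samples = stats.get("samples", {}).get("totalQueryableSamples", 0)
    if eval_time > max_eval_seconds:
        problems.append(f"evalTotalTime {eval_time}s exceeds limit {max_eval_seconds}s")
    if total_samples > max_samples:
        problems.append(f"totalQueryableSamples {total_samples} exceeds limit {max_samples}")
    return problems
```

Without a "stats" field in the response, a check like this has nothing to evaluate, which is why the 400 error blocks it entirely.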

bwplotka commented 4 months ago

Thanks! Great request. 💪🏽

The main challenge is mapping the internal workings of the Google Cloud Monitoring PromQL engine to those stats, which assume Prometheus' TSDB PromQL engine.

We could make it work at a minimum level, so that tools like [pint](https://cloudflare.github.io/pint/checks/query/cost.html) could work, OR we could contribute to pint to make it more generic. Anyway, here are the acceptance criteria:

Acceptance Criteria