michelin / snowflake-grafana-datasource

Snowflake grafana datasource plugin allows Snowflake data to be visually represented in Grafana dashboards.
Apache License 2.0
66 stars 33 forks source link

Metric Request Error on simple recursive query #29

Open merc1031 opened 1 year ago

merc1031 commented 1 year ago

Grafana Version: v7.5.5 (b5190ee547) Plugin Version: 1.2.0

I have the following query

WITH RECURSIVE recurseruns AS (
 SELECT p.id as pr, p.created_at as pr_create, runs.upstream_run_id, runs.run_id, runs.job_name, runs.started, runs.completed
 FROM "FIVETRAN_DATABASE"."FB_CI_METRICS_PUBLIC".runs
 JOIN "FIVETRAN_DATABASE"."FB_CI_METRICS_PUBLIC".premerge_stats p
 ON p.run_id = runs.run_id
  WHERE p.run_id IS NOT NULL
  AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) < '2022-10-11T17:18:15.49Z' AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) > '2022-10-11T11:18:15.49Z'
  UNION ALL
    SELECT rt.pr as pr, rt.pr_create as pr_create, r.upstream_run_id, r.run_id, r.job_name, r.started, r.completed
    FROM "FIVETRAN_DATABASE"."FB_CI_METRICS_PUBLIC".runs r
    JOIN recurseruns rt ON rt.run_id = r.upstream_run_id
  )
SELECT
  rr.pr_create as "time",
  SUM(CASE WHEN (rr.job_name LIKE 's3_%') THEN TIMEDIFF('seconds', rr.started, rr.completed) END) as s3_machine_time
FROM recurseruns rr
group by (rr.pr, "time")
order by "time";

The error


{
  "request": {
    "url": "api/ds/query",
    "method": "POST",
    "data": {
      "queries": [
        {
          "refId": "A",
          "key": "Q-56419b51-e0d2-40db-8641-8828c213e003-0",
          "queryText": "WITH RECURSIVE recurseruns AS (\n SELECT p.id as pr, p.created_at as pr_create, runs.upstream_run_id, runs.run_id, runs.job_name, runs.started, runs.completed\n FROM \"FIVETRAN_DATABASE\".\"FB_CI_METRICS_PUBLIC\".runs\n JOIN \"FIVETRAN_DATABASE\".\"FB_CI_METRICS_PUBLIC\".premerge_stats p\n ON p.run_id = runs.run_id\n  WHERE p.run_id IS NOT NULL\n  AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) < '2022-10-11T17:18:15.49Z' AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) > '2022-10-11T11:18:15.49Z'\n  UNION ALL\n    SELECT rt.pr as pr, rt.pr_create as pr_create, r.upstream_run_id, r.run_id, r.job_name, r.started, r.completed\n    FROM \"FIVETRAN_DATABASE\".\"FB_CI_METRICS_PUBLIC\".runs r\n    JOIN recurseruns rt ON rt.run_id = r.upstream_run_id\n  )\nSELECT\n  rr.pr_create as \"time\",\n  SUM(CASE WHEN (rr.job_name LIKE 's3_%') THEN TIMEDIFF('seconds', rr.started, rr.completed) END) as s3_machine_time\nFROM recurseruns rr\ngroup by (rr.pr, \"time\")\norder by \"time\";",
          "queryType": "table",
          "timeColumns": [
            "time"
          ],
          "datasourceId": 78,
          "intervalMs": 1000,
          "maxDataPoints": 3778
        }
      ],
      "range": {
        "from": "2022-10-11T18:27:59.275Z",
        "to": "2022-10-11T19:27:59.275Z",
        "raw": {
          "from": "now-1h",
          "to": "now"
        }
      },
      "from": "1665512879275",
      "to": "1665516479275"
    },
    "hideFromInspector": false
  },
  "response": {
    "message": "Metric request error"
  }
}```
devnied commented 1 year ago

Can you please add the result/status of the query from the Snowflake query history?

merc1031 commented 1 year ago

Ran again

WITH RECURSIVE recurseruns AS (
  -- Grab the data
 SELECT p.id as pr, p.created_at as pr_create, runs.upstream_run_id, runs.run_id, runs.job_name, runs.started, runs.completed
 FROM "FIVETRAN_DATABASE"."FB_CI_METRICS_PUBLIC".runs
 JOIN "FIVETRAN_DATABASE"."FB_CI_METRICS_PUBLIC".premerge_stats p
 ON p.run_id = runs.run_id
  WHERE p.run_id IS NOT NULL
  AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) < '2022-10-11T17:18:15.50Z' AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) > '2022-10-11T11:18:15.50Z'
  UNION ALL
    SELECT rt.pr as pr, rt.pr_create as pr_create, r.upstream_run_id, r.run_id, r.job_name, r.started, r.completed
    FROM "FIVETRAN_DATABASE"."FB_CI_METRICS_PUBLIC".runs r
    JOIN recurseruns rt ON rt.run_id = r.upstream_run_id
  )
SELECT
  rr.pr_create as "time",
  SUM(CASE WHEN (rr.job_name LIKE 's3_%') THEN TIMEDIFF('seconds', rr.started, rr.completed) END) as s3_machine_time
FROM recurseruns rr
group by (rr.pr, "time")
order by "time";
{
  "request": {
    "url": "api/ds/query",
    "method": "POST",
    "data": {
      "queries": [
        {
          "refId": "A",
          "key": "Q-8c5076ea-ad81-45d6-bef3-64a28e66e477-0",
          "queryText": "WITH RECURSIVE recurseruns AS (\n  -- Grab the data\n SELECT p.id as pr, p.created_at as pr_create, runs.upstream_run_id, runs.run_id, runs.job_name, runs.started, runs.completed\n FROM \"FIVETRAN_DATABASE\".\"FB_CI_METRICS_PUBLIC\".runs\n JOIN \"FIVETRAN_DATABASE\".\"FB_CI_METRICS_PUBLIC\".premerge_stats p\n ON p.run_id = runs.run_id\n  WHERE p.run_id IS NOT NULL\n  AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) < '2022-10-11T17:18:15.50Z' AND CONVERT_TIMEZONE('UTC', 'UTC', p.created_at::TIMESTAMP_NTZ) > '2022-10-11T11:18:15.50Z'\n  UNION ALL\n    SELECT rt.pr as pr, rt.pr_create as pr_create, r.upstream_run_id, r.run_id, r.job_name, r.started, r.completed\n    FROM \"FIVETRAN_DATABASE\".\"FB_CI_METRICS_PUBLIC\".runs r\n    JOIN recurseruns rt ON rt.run_id = r.upstream_run_id\n  )\nSELECT\n  rr.pr_create as \"time\",\n  SUM(CASE WHEN (rr.job_name LIKE 's3_%') THEN TIMEDIFF('seconds', rr.started, rr.completed) END) as s3_machine_time\nFROM recurseruns rr\ngroup by (rr.pr, \"time\")\norder by \"time\";",
          "queryType": "table",
          "timeColumns": [
            "time"
          ],
          "datasourceId": 78,
          "intervalMs": 1000,
          "maxDataPoints": 3778
        }
      ],
      "range": {
        "from": "2022-10-11T20:48:23.377Z",
        "to": "2022-10-11T21:48:23.378Z",
        "raw": {
          "from": "now-1h",
          "to": "now"
        }
      },
      "from": "1665521303377",
      "to": "1665524903378"
    },
    "hideFromInspector": false
  },
  "response": {
    "message": "Metric request error"
  }
}

image

image

merc1031 commented 1 year ago

I ran the query from above in query inspector again today. In explore mode, Once in table mode, then switched to timeseries mode. I got this in query history image

Then i ran it with run_query and got this image Then switching to table mode once more image

It looks like in timeseries mode , on the snowflake side, the history looks a bit different?

Profile for the fast history queries image Profile for the slower ones image

It looks like the queries are succeeding in snowflake?

merc1031 commented 1 year ago

Some more details with a similar query.

Grafana log shows the following

logger=cleanup t=2023-04-27T03:14:08.063059559Z level=info msg="Completed cleanup jobs" duration=34.019807ms
logger=context userId=11 orgId=1 uname="***********" t=2023-04-27T03:14:15.540213049Z level=error msg="Internal server error" error="[plugin.downstreamError] failed to query data: Failed to query data: rpc error: code = Unavailable desc = error reading from server: EOF" remote_addr=10.231.221.166 traceID=

on the snowflake gui, we see image with interstingly NO results image

However if i follow the query history and have it pop a new tab with the exact query the plugin sent, but in a worksheet, then the query returns results. image