grafana-toolbox / grafana-wtf

Grep through all Grafana entities in the spirit of git-wtf.
GNU Affero General Public License v3.0
143 stars 13 forks source link

Datasources Not in Use #70

Open meyerder opened 1 year ago

meyerder commented 1 year ago

I might suggest this tweak to finding unused Data sources..

grafana-wtf/grafana-wtf-venv/lib/python3.7/site-packages/grafana_wtf/core.py

if datasource_item.name in ["-- Grafana --", "-- Mixed --"] or datasource_item.type == "grafana" or datasource_item.uid in ["-- Grafana --", "-- Mixed --", "grafana", "-- Dashboard --"] :

Reason:
Some older dashboards that are created are being triggered. The Name aspect has moved to uid and needs to account for both..

Request: I have been trying to play with the jq syntax to extract the SQL statements of the data sources and not been able to do so yet. Do you have any suggestions?

Also a little note for those not familiar with python that much.. the version > .13 require pyton 3.7 or greater to run.

Thanks

amotl commented 1 year ago

Dear @meyerder,

thank you for your suggestions, and apologies for the late reply.

I will try to consider them on the next development iteration. If you think you could submit one or another patch to improve the situation on a few of the details you are referring to, it will be much appreciated!

With kind regards, Andreas.

amotl commented 11 months ago

Dear @meyerder,

GH-78 implements your suggestion. It has been included in release 0.15.2.

With kind regards, Andreas.

amotl commented 10 months ago

Hi again,

did you have a chance to verify if the corresponding improvement works well for you?

With kind regards, Andreas.

meyerder commented 10 months ago

I did not test... I actually accomplished what I was looking for by doing this via the Database


{
  "__inputs": [
    {
      "name": "DS_MYSQL",
      "label": "MySQL",
      "description": "",
      "type": "datasource",
      "pluginId": "mysql",
      "pluginName": "MySQL"
    }
  ],
  "__elements": {},
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "9.5.7"
    },
    {
      "type": "datasource",
      "id": "mysql",
      "name": "MySQL",
      "version": "1.0.0"
    },
    {
      "type": "panel",
      "id": "table",
      "name": "Table",
      "version": ""
    }
  ],
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "grafana",
          "uid": "-- Grafana --"
        },
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "description": "All Dashboard Panel Querys to Datasources",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "datasource": {
        "type": "mysql",
        "uid": "${DS_MYSQL}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "custom": {
            "align": "auto",
            "cellOptions": {
              "type": "auto"
            },
            "filterable": true,
            "inspect": false
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 30,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 1,
      "options": {
        "cellHeight": "sm",
        "footer": {
          "countRows": false,
          "fields": "",
          "reducer": [
            "sum"
          ],
          "show": false
        },
        "showHeader": true,
        "sortBy": []
      },
      "pluginVersion": "9.5.7",
      "targets": [
        {
          "datasource": {
            "type": "mysql",
            "uid": "${DS_MYSQL}"
          },
          "editorMode": "code",
          "format": "table",
          "rawQuery": true,
          "rawSql": "SELECT\r\n  d.slug,\r\n  d.title,\r\n  panel_datasource.datasource_type,\r\n  panel_datasource.datasource_uid,\r\n  COALESCE(target_data_raw.query, target_data_expr.query, target_data_jql.query, target_data.query) AS QUERY\r\nFROM\r\n  (\r\n    SELECT\r\n      dashboard_id,\r\n      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.datasource.type')) AS datasource_type,\r\n      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.datasource.uid')) AS datasource_uid,\r\n      panel_index\r\n    FROM\r\n      (\r\n        SELECT\r\n          d.id AS dashboard_id,\r\n          panel_data,\r\n          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index\r\n        FROM\r\n          dashboard AS d,\r\n          JSON_TABLE(\r\n            d.data,\r\n            \"$.panels[*]\" COLUMNS (\r\n              panel_data JSON PATH \"$\",\r\n              panel_index FOR ORDINALITY\r\n            )\r\n          ) AS jt\r\n        WHERE\r\n          d.org_id in ($ORG)\r\n      ) AS panel_data_with_index\r\n  ) AS panel_datasource\r\nLEFT JOIN\r\n  (\r\n    SELECT\r\n      dashboard_id,\r\n      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.rawSql')) AS QUERY,\r\n      panel_index\r\n    FROM\r\n      (\r\n        SELECT\r\n          d.id AS dashboard_id,\r\n          panel_data,\r\n          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index\r\n        FROM\r\n          dashboard AS d,\r\n          JSON_TABLE(\r\n            d.data,\r\n            \"$.panels[*].targets[*]\" COLUMNS (\r\n              panel_data JSON PATH \"$\",\r\n              panel_index FOR ORDINALITY\r\n            )\r\n          ) AS jt\r\n        WHERE\r\n             d.org_id in ($ORG)\r\n      ) AS target_data_with_index\r\n  ) AS target_data_raw ON panel_datasource.dashboard_id = target_data_raw.dashboard_id AND panel_datasource.panel_index = target_data_raw.panel_index\r\nLEFT JOIN\r\n  (\r\n    SELECT\r\n      dashboard_id,\r\n      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.expr')) AS QUERY,\r\n      panel_index\r\n    FROM\r\n      (\r\n        SELECT\r\n          d.id AS dashboard_id,\r\n          panel_data,\r\n          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index\r\n        FROM\r\n          dashboard AS d,\r\n          JSON_TABLE(\r\n            d.data,\r\n            \"$.panels[*].targets[*]\" COLUMNS (\r\n              panel_data JSON PATH \"$\",\r\n              panel_index FOR ORDINALITY\r\n            )\r\n          ) AS jt\r\n        WHERE\r\n             d.org_id in ($ORG)\r\n      ) AS target_data_with_index\r\n  ) AS target_data_expr ON panel_datasource.dashboard_id = target_data_expr.dashboard_id AND panel_datasource.panel_index = target_data_expr.panel_index\r\nLEFT JOIN\r\n  (\r\n    SELECT\r\n      dashboard_id,\r\n      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.jql')) AS QUERY,\r\n      panel_index\r\n    FROM\r\n      (\r\n        SELECT\r\n          d.id AS dashboard_id,\r\n          panel_data,\r\n          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index\r\n        FROM\r\n          dashboard AS d,\r\n          JSON_TABLE(\r\n            d.data,\r\n            \"$.panels[*].targets[*]\" COLUMNS (\r\n              panel_data JSON PATH \"$\",\r\n              panel_index FOR ORDINALITY\r\n            )\r\n          ) AS jt\r\n        WHERE\r\n             d.org_id in ($ORG)\r\n      ) AS target_data_with_index\r\n  ) AS target_data_jql ON panel_datasource.dashboard_id = target_data_jql.dashboard_id AND panel_datasource.panel_index = target_data_jql.panel_index\r\nLEFT JOIN\r\n  (\r\n    SELECT\r\n      dashboard_id,\r\n      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.query')) AS QUERY,\r\n      panel_index\r\n    FROM\r\n      (\r\n        SELECT\r\n          d.id AS dashboard_id,\r\n          panel_data,\r\n          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index\r\n        FROM\r\n          dashboard AS d,\r\n          JSON_TABLE(\r\n            d.data,\r\n            \"$.panels[*].targets[*]\" COLUMNS (\r\n              panel_data JSON PATH \"$\",\r\n              panel_index FOR ORDINALITY\r\n            )\r\n          ) AS jt\r\n        WHERE\r\n          d.org_id in ($ORG)\r\n      ) AS target_data_with_index\r\n  ) AS target_data ON panel_datasource.dashboard_id = target_data.dashboard_id AND panel_datasource.panel_index = target_data.panel_index\r\nJOIN dashboard AS d ON panel_datasource.dashboard_id = d.id\r\nWHERE\r\n  (target_data_raw.query IS NOT NULL AND target_data_raw.query <> '') OR\r\n  (target_data_expr.query IS NOT NULL AND target_data_expr.query <> '') OR\r\n  (target_data_jql.query IS NOT NULL AND target_data_jql.query <> '') OR\r\n  (target_data.query IS NOT NULL AND target_data.query <> '');\r\n",
          "refId": "A",
          "sql": {
            "columns": [
              {
                "parameters": [],
                "type": "function"
              }
            ],
            "groupBy": [
              {
                "property": {
                  "type": "string"
                },
                "type": "groupBy"
              }
            ],
            "limit": 50
          }
        },
        {
          "datasource": {
            "type": "mysql",
            "uid": "${DS_MYSQL}"
          },
          "editorMode": "code",
          "format": "table",
          "hide": false,
          "rawQuery": true,
          "rawSql": "select name,uid as datasource_uid from data_source ",
          "refId": "B",
          "sql": {
            "columns": [
              {
                "parameters": [],
                "type": "function"
              }
            ],
            "groupBy": [
              {
                "property": {
                  "type": "string"
                },
                "type": "groupBy"
              }
            ],
            "limit": 50
          }
        }
      ],
      "title": "Panel Title",
      "transformations": [
        {
          "id": "merge",
          "options": {}
        }
      ],
      "type": "table"
    }
  ],
  "refresh": "",
  "schemaVersion": 38,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": [
      {
        "current": {},
        "datasource": {
          "type": "mysql",
          "uid": "${DS_MYSQL}"
        },
        "definition": "select distinct org_id from dashboard",
        "description": "Org ID number",
        "hide": 0,
        "includeAll": true,
        "label": "Org ID number",
        "multi": true,
        "name": "ORG",
        "options": [],
        "query": "select distinct org_id from dashboard",
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 3,
        "type": "query"
      }
    ]
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "All Dashboard Panel Querys",
  "uid": "d297b9f7-2cad-4f57-9772-64ce6866f7d2",
  "version": 5,
  "weekStart": ""
}
meyerder commented 10 months ago

The Key part of the above that is needed is this.. The only issue is that some of the older panels that may have been created in grafana 5,6,7,8 may not show properly as they migrated at one time from the datasource Name to a datasource_uid if the panels have not been migrated the UID and Datasource_Type are empty as well as the datasource Name.. That is part of what the above was attempting to figure out.. I leveraged what I found out by using your program to implement the below as trying to do it with JQ was just a PITA for me.

SELECT
  d.slug,
  d.title,
  panel_datasource.datasource_type,
  panel_datasource.datasource_uid,
  COALESCE(target_data_raw.query, target_data_expr.query, target_data_jql.query, target_data.query) AS QUERY
FROM
  (
    SELECT
      dashboard_id,
      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.datasource.type')) AS datasource_type,
      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.datasource.uid')) AS datasource_uid,
      panel_index
    FROM
      (
        SELECT
          d.id AS dashboard_id,
          panel_data,
          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index
        FROM
          dashboard AS d,
          JSON_TABLE(
            d.data,
            "$.panels[*]" COLUMNS (
              panel_data JSON PATH "$",
              panel_index FOR ORDINALITY
            )
          ) AS jt
        WHERE
          d.org_id in ($ORG)
      ) AS panel_data_with_index
  ) AS panel_datasource
LEFT JOIN
  (
    SELECT
      dashboard_id,
      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.rawSql')) AS QUERY,
      panel_index
    FROM
      (
        SELECT
          d.id AS dashboard_id,
          panel_data,
          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index
        FROM
          dashboard AS d,
          JSON_TABLE(
            d.data,
            "$.panels[*].targets[*]" COLUMNS (
              panel_data JSON PATH "$",
              panel_index FOR ORDINALITY
            )
          ) AS jt
        WHERE
             d.org_id in ($ORG)
      ) AS target_data_with_index
  ) AS target_data_raw ON panel_datasource.dashboard_id = target_data_raw.dashboard_id AND panel_datasource.panel_index = target_data_raw.panel_index
LEFT JOIN
  (
    SELECT
      dashboard_id,
      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.expr')) AS QUERY,
      panel_index
    FROM
      (
        SELECT
          d.id AS dashboard_id,
          panel_data,
          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index
        FROM
          dashboard AS d,
          JSON_TABLE(
            d.data,
            "$.panels[*].targets[*]" COLUMNS (
              panel_data JSON PATH "$",
              panel_index FOR ORDINALITY
            )
          ) AS jt
        WHERE
             d.org_id in ($ORG)
      ) AS target_data_with_index
  ) AS target_data_expr ON panel_datasource.dashboard_id = target_data_expr.dashboard_id AND panel_datasource.panel_index = target_data_expr.panel_index
LEFT JOIN
  (
    SELECT
      dashboard_id,
      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.jql')) AS QUERY,
      panel_index
    FROM
      (
        SELECT
          d.id AS dashboard_id,
          panel_data,
          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index
        FROM
          dashboard AS d,
          JSON_TABLE(
            d.data,
            "$.panels[*].targets[*]" COLUMNS (
              panel_data JSON PATH "$",
              panel_index FOR ORDINALITY
            )
          ) AS jt
        WHERE
             d.org_id in ($ORG)
      ) AS target_data_with_index
  ) AS target_data_jql ON panel_datasource.dashboard_id = target_data_jql.dashboard_id AND panel_datasource.panel_index = target_data_jql.panel_index
LEFT JOIN
  (
    SELECT
      dashboard_id,
      JSON_UNQUOTE(JSON_EXTRACT(panel_data, '$.query')) AS QUERY,
      panel_index
    FROM
      (
        SELECT
          d.id AS dashboard_id,
          panel_data,
          ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY jt.panel_index) AS panel_index
        FROM
          dashboard AS d,
          JSON_TABLE(
            d.data,
            "$.panels[*].targets[*]" COLUMNS (
              panel_data JSON PATH "$",
              panel_index FOR ORDINALITY
            )
          ) AS jt
        WHERE
          d.org_id in ($ORG)
      ) AS target_data_with_index
  ) AS target_data ON panel_datasource.dashboard_id = target_data.dashboard_id AND panel_datasource.panel_index = target_data.panel_index
JOIN dashboard AS d ON panel_datasource.dashboard_id = d.id
WHERE
  (target_data_raw.query IS NOT NULL AND target_data_raw.query <> '') OR
  (target_data_expr.query IS NOT NULL AND target_data_expr.query <> '') OR
  (target_data_jql.query IS NOT NULL AND target_data_jql.query <> '') OR
  (target_data.query IS NOT NULL AND target_data.query <> '');
amotl commented 10 months ago

Hi again,

wow, thank you for sharing your solution, good that it works for you. As I recognize that you are storing your Grafana database within MySQL/MariaDB, you are leveraging the possibility to query it for the question you have, right?

This is smart, but unfortunately it would be too specific to carry over to grafana-wtf. However, I would still like to improve it into the direction you are looking at, without needing you to resort to a jq statement.

On this matter, I think I missed your second request within your original post:

Request: I have been trying to play with the jq syntax to extract the SQL statements of the data sources and not been able to do so yet. Do you have any suggestions?

I will look into it, thanks.

With kind regards, Andreas.

amotl commented 10 months ago

Hi again,

GH-89 may have a few improvements in this area, trying to generalize your use case and solution.

Admittedly, I did not analyze your SQL statement too much [^1], so there is plenty of room I got it wrong or missed important details. When this is the case, let me know if you think we can improve from there.

With kind regards, Andreas.

[^1]: The reason mostly was because it wasn't formatted properly. That has been fixed now, see below, so I am open to align the implementation to your needs, as your selection of attributes feels sensible.

amotl commented 10 months ago

Now, after fixing your posts to make the syntax highlighter work, I can see the main attributes you are interested in for the report you are looking at.

SELECT
  d.slug,
  d.title,
  panel_datasource.datasource_type,
  panel_datasource.datasource_uid,
  COALESCE(target_data_raw.query, target_data_expr.query, target_data_jql.query, target_data.query) AS QUERY

To make the improved reporting mechanics of GH-89 behave like that, maybe without further ado, without needing any jq-based postprocessing in the future, will probably need another iteration.

amotl commented 9 months ago

Hi again,

we just released grafana-wtf 0.16.0, including the improvements from GH-89, see https://github.com/panodata/grafana-wtf/releases/tag/0.16.0.

When combining the program with jq in this way, you can generate a flat list of all data queries used within all dashboards:

grafana-wtf explore dashboards --data-details --format=json | \
    jq -r '.[].details | values[] | .[].query // "null"'

To make it work without needing any jq-based postprocessing in the future will need another iteration.

Please let us know if that works for you already, and whether it yields the correct results. If it works well, we may bring in corresponding filtering mechanisms into the code base itself.

With kind regards, Andreas.

amotl commented 9 months ago

Hi again,

after analyzing your SQL statement further, we discovered you are scanning all of expr, jql, query and rawSql attributes for query expression statements. So, we improved the implementation a bit, and released grafana-wtf 0.17.0.

With kind regards, Andreas.

Display information about data queries and their contexts within dashboards

grafana-wtf explore dashboards --data-details --queries-only --format=json

Do you think the output is reasonable?

List queries used in all dashboards

An extreme variant, filtering the output to display query expressions only.

grafana-wtf explore dashboards --data-details --queries-only --format=json | \
    jq '.[].details | values[] | .[] | .expr,.jql,.query,.rawSql | select( . != null and . != "" )'