[Refactor][Grafana] DORA dashboard improvement

Devlake version: v1.0.1-beta5

Performance issue

DORA dashboard. Panels Overall DORA Metrics and Change Failure Rate, SQL queries take too long time to execute.

SQL query part:

SELECT
    CASE
      WHEN COUNT(i.id) = 0
      AND COUNT(cdc.id) = 0 THEN 'No All'
      WHEN COUNT(i.id) = 0 THEN 'No Incidents'
      WHEN COUNT(cdc.id) = 0 THEN 'No Deployments'
    END AS is_collected
  FROM
    (
      SELECT
        1
    ) AS dummy
    LEFT JOIN incidents i ON 1 = 1
    LEFT JOIN cicd_deployment_commits cdc ON 1 = 1;

+--------------+
| is_collected |
+--------------+
| NULL         |
+--------------+
1 row in set (31.70 sec)

Investigation

select count(*) from cicd_deployment_commits;
+----------+
| count(*) |
+----------+
|   177462 |
+----------+
1 row in set (0.03 sec)

select count(*) from incidents;
+----------+
| count(*) |
+----------+
|     3193 |
+----------+
1 row in set (0.00 sec)

explain analyze SELECT
    CASE
      WHEN COUNT(i.id) = 0
      AND COUNT(cdc.id) = 0 THEN 'No All'
      WHEN COUNT(i.id) = 0 THEN 'No Incidents'
      WHEN COUNT(cdc.id) = 0 THEN 'No Deployments'
    END AS is_collected
  FROM
    (
      SELECT
        1
    ) AS dummy
    LEFT JOIN incidents i ON 1 = 1
    LEFT JOIN cicd_deployment_commits cdc ON 1 = 1;

 -> Aggregate: count(cdc.id), count(i.id), count(cdc.id), count(i.id)  (cost=113e+6 rows=1) (actual time=55353..55353 rows=1 loops=1)
    -> Left hash join (no condition)  (cost=56.5e+6 rows=565e+6) (actual time=53.4..24112 rows=567e+6 loops=1)
        -> Nested loop left join  (cost=412 rows=3260) (actual time=0.0327..3.23 rows=3193 loops=1)
            -> Rows fetched before execution  (cost=0..0 rows=1) (actual time=71e-6..111e-6 rows=1 loops=1)
            -> Covering index scan on i using incidents_resolution_date_idx  (cost=412 rows=3260) (actual time=0.0312..2.82 rows=3193 loops=1)
        -> Hash
            -> Covering index scan on cdc using cicd_deployment_commits_finished_date_idx  (cost=6.9 rows=173184) (actual time=0.0239..29.4 rows=177462 loops=1)

Actual result: cost=113e+6

Solution

Avoid to join full tables

SELECT
    CASE
      WHEN i.cnt = 0
      AND cdc.cnt = 0 THEN 'No All'
      WHEN i.cnt = 0 THEN 'No Incidents'
      WHEN cdc.cnt = 0 THEN 'No Deployments'
    END AS is_collected
  FROM
    (
      select count(*) as cnt from incidents
    ) AS i
    LEFT JOIN (select count(*) as cnt from cicd_deployment_commits) as cdc ON 1 = 1;

+--------------+
| is_collected |
+--------------+
| NULL         |
+--------------+
1 row in set (0.02 sec)

But I think that we should filter counters by projects and dates and the best solution should looks like:

SELECT
    CASE
      WHEN i.cnt = 0
      AND cdc.cnt = 0 THEN 'No All'
      WHEN i.cnt = 0 THEN 'No Incidents'
      WHEN cdc.cnt = 0 THEN 'No Deployments'
    END AS is_collected
  FROM
    (
      select count(*) as cnt from incidents i
      join project_mapping pm on i.scope_id = pm.row_id and pm.`table` = i.`table`
      WHERE pm.project_name in (${project}) and $__timeFilter(i.created_date)
    ) AS i
    LEFT JOIN (
      select count(*) as cnt from cicd_deployment_commits cdc
      JOIN project_mapping pm on cdc.cicd_scope_id = pm.row_id and pm.`table` = 'cicd_scopes'
      WHERE pm.project_name in (${project}) and $__timeFilter(cdc.finished_date)
    ) as cdc ON 1 = 1

Hey there, @amartynov! I'm here to help you with any bugs, questions, or contributions you have. Let's squash those bugs together! 🤖

To improve the performance of the DORA dashboard in Grafana, you can optimize the SQL queries by filtering counters by projects and dates. The solution you proposed is on the right track. By joining the incidents and cicd_deployment_commits tables with the project_mapping table and applying the __timeFilter macro, you can efficiently filter the data based on the selected projects and date range. This approach avoids joining full tables and should significantly reduce the query execution time.

Here's the optimized query you suggested:

SELECT
    CASE
      WHEN i.cnt = 0
      AND cdc.cnt = 0 THEN 'No All'
      WHEN i.cnt = 0 THEN 'No Incidents'
      WHEN cdc.cnt = 0 THEN 'No Deployments'
    END AS is_collected
  FROM
    (
      select count(*) as cnt from incidents i
      join project_mapping pm on i.scope_id = pm.row_id and pm.`table` = i.`table`
      WHERE pm.project_name in (${project}) and $__timeFilter(i.created_date)
    ) AS i
    LEFT JOIN (
      select count(*) as cnt from cicd_deployment_commits cdc
      JOIN project_mapping pm on cdc.cicd_scope_id = pm.row_id and pm.`table` = 'cicd_scopes'
      WHERE pm.project_name in (${project}) and $__timeFilter(cdc.finished_date)
    ) as cdc ON 1 = 1

This query should help improve the performance of the DORA dashboard by ensuring that only relevant data is processed.

To continue talking to Dosu, mention @dosu.

apache / incubator-devlake