Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0
2.03k stars 578 forks source link

Slow servicegroups page in icingaweb2 #9017

Open Yoda-BZH opened 3 years ago

Yoda-BZH commented 3 years ago

Describe the bug

The page /monitoring/list/servicegroups can be very slow, to the point of not responding, due to missing indexes A clear and concise description of what the bug is.

To Reproduce

Using:

I narrowed the slowness in the following sub query :

SELECT sg.alias COLLATE latin1_general_ci AS servicegroup_alias,
       sgo.name1                          AS servicegroup_name,
       CASE
         WHEN ( ss.problem_has_been_acknowledged
                + ss.scheduled_downtime_depth
                + COALESCE(hs.current_state, 0) ) > 0 THEN 1
         ELSE 0
       end                                AS service_handled,
       CASE
         WHEN ss.current_state = 0 THEN
         CASE WHEN ss.has_been_checked = 0 OR ss.has_been_checked IS
         NULL THEN
         16 ELSE
         0
         end + CASE WHEN ss.problem_has_been_acknowledged = 1 THEN 2
         ELSE CASE
         WHEN
         ss.scheduled_downtime_depth > 0
         THEN 1 ELSE 4 end end
         ELSE CASE WHEN ss.has_been_checked = 0 OR ss.has_been_checked
              IS NULL
              THEN 16
              WHEN
              ss.current_state = 1 THEN 32 WHEN ss.current_state = 2
              THEN 128
              WHEN
              ss.current_state =
              3 THEN 64 ELSE 256 end + CASE WHEN hs.current_state > 0
              THEN 1024
              ELSE
              CASE WHEN ss.problem_has_been_acknowledged = 1 THEN 512
              ELSE CASE
              WHEN
              ss.scheduled_downtime_depth > 0 THEN 256 ELSE 2048 end end
              end
       end                                AS service_severity,
       CASE
         WHEN ss.has_been_checked = 0
               OR ss.has_been_checked IS NULL THEN 99
         ELSE ss.current_state
       end                                AS service_state
FROM   icinga_objects AS sgo
       INNER JOIN icinga_servicegroups AS sg
               ON sg.servicegroup_object_id = sgo.object_id
                  AND sgo.objecttype_id = 4
                  AND sgo.is_active = 1
       INNER JOIN icinga_servicegroup_members AS sgm
               ON sgm.servicegroup_id = sg.servicegroup_id
       INNER JOIN icinga_objects AS so
               ON so.object_id = sgm.service_object_id
                  AND so.objecttype_id = 2
                  AND so.is_active = 1
       INNER JOIN icinga_services AS s
               ON s.service_object_id = so.object_id
       INNER JOIN icinga_hoststatus AS hs
               ON hs.host_object_id = s.host_object_id
       INNER JOIN icinga_servicestatus AS ss
               ON ss.service_object_id = so.object_id
GROUP  BY sgo.object_id,
          sg.servicegroup_id,
          ss.servicestatus_id,
          hs.hoststatus_id
UNION ALL
SELECT sg.alias COLLATE latin1_general_ci AS servicegroup_alias,
       sgo.name1                          AS servicegroup_name,
       NULL                               AS service_handled,
       0                                  AS service_severity,
       NULL                               AS service_state
FROM   icinga_objects AS sgo
       INNER JOIN icinga_servicegroups AS sg
               ON sg.servicegroup_object_id = sgo.object_id
                  AND sgo.objecttype_id = 4
                  AND sgo.is_active = 1
       LEFT JOIN icinga_servicegroup_members AS esgm
              ON esgm.servicegroup_id = sg.servicegroup_id
GROUP  BY sgo.object_id,
          sg.servicegroup_id
HAVING ( COUNT(esgm.servicegroup_member_id) = 0 )

An explain of the query before:

+------+-------------+-------+--------+----------------------------------------------------------+------------------------------+---------+---------------------------------------------------------+------+----------+-------------------------------------------------+
| id   | select_type | table | type   | possible_keys                                            | key                          | key_len | ref                                                     | rows | filtered | Extra                                           |
+------+-------------+-------+--------+----------------------------------------------------------+------------------------------+---------+---------------------------------------------------------+------+----------+-------------------------------------------------+
|    1 | PRIMARY     | hs    | ALL    | object_id                                                | NULL                         | NULL    | NULL                                                    | 3319 |   100.00 | Using where; Using temporary; Using filesort    |
|    1 | PRIMARY     | sg    | ALL    | PRIMARY                                                  | NULL                         | NULL    | NULL                                                    | 32   |   100.00 | Using where; Using join buffer (flat, BNL join) |
|    1 | PRIMARY     | sgo   | eq_ref | PRIMARY,objecttype_id,sla_idx_obj                        | PRIMARY                      | 8       | icinga2.sg.servicegroup_object_id                       | 1    |     0.04 | Using where                                     |
|    1 | PRIMARY     | sgm   | ref    | sgmbrs_sgid_soid,idx_servicegroup_members_object_id      | sgmbrs_sgid_soid             | 9       | icinga2.sg.servicegroup_id                              | 374  |   100.00 | Using where; Using index                        |
|    1 | PRIMARY     | so    | eq_ref | PRIMARY,objecttype_id,sla_idx_obj                        | PRIMARY                      | 8       | icinga2.sgm.service_object_id                           | 1    |    50.00 | Using where                                     |
|    1 | PRIMARY     | ss    | eq_ref | object_id                                                | object_id                    | 9       | icinga2.sgm.service_object_id                           | 1    |   100.00 |                                                 |
|    1 | PRIMARY     | s     | ref    | services_host_object_id_idx,services_combined_object_idx | services_combined_object_idx | 18      | icinga2.sgm.service_object_id,icinga2.hs.host_object_id | 1    |   100.00 | Using index                                     |
|    2 | UNION       | sgo   | ref    | PRIMARY,objecttype_id,sla_idx_obj                        | sla_idx_obj                  | 12      | const,const                                             | 21   |   100.00 | Using index; Using temporary; Using filesort    |
|    2 | UNION       | sg    | ALL    | NULL                                                     | NULL                         | NULL    | NULL                                                    | 32   |   100.00 | Using where; Using join buffer (flat, BNL join) |
|    2 | UNION       | esgm  | ref    | sgmbrs_sgid_soid                                         | sgmbrs_sgid_soid             | 9       | icinga2.sg.servicegroup_id                              | 374  |   100.00 | Using index                                     |
+------+-------------+-------+--------+----------------------------------------------------------+------------------------------+---------+---------------------------------------------------------+------+----------+-------------------------------------------------+

Expected behavior

The query should be quick.

Screenshots

MariaDB [icinga2]> SELECT sg.alias COLLATE latin1_general_ci AS servicegroup_alias,                sgo.name1                          AS servicegroup_name,                CASE                  WHEN ( ss.problem_has_been_acknowledged                         + ss.scheduled_downtime_depth
        + COALESCE(hs.current_state, 0) ) > 0 THEN 1                  ELSE 0                end                                AS service_handled,                CASE                  WHEN ss.current_state = 0 THEN                  CASE WHEN ss.has_been_checked = 0 OR ss.has_been_checked IS
      NULL THEN                  16 ELSE                  0                  end + CASE WHEN ss.problem_has_been_acknowledged = 1 THEN 2                  ELSE CASE                  WHEN                  ss.scheduled_downtime_depth > 0                  THEN 1 ELSE 4 end end                  ELSE CASE WHEN ss.has_been_checked = 0 OR ss.has_been_checked                       IS NULL                       THEN 16                       WHEN                       ss.current_state = 1 THEN 32 WHEN ss.current_state = 2                       THEN 128                       WHEN                       ss.current_state =                       3 THEN 64 ELSE 256 end + CASE WHEN hs.current_state > 0                       THEN 1024                       ELSE                       CASE WHEN ss.problem_has_been_acknowledged = 1 THEN 512                       ELSE CASE                       WHEN
   ss.scheduled_downtime_depth > 0 THEN 256 ELSE 2048 end end                       end                end                                AS service_severity,                CASE                  WHEN ss.has_been_checked = 0                        OR ss.has_been_checked IS NULL THEN 99
 ELSE ss.current_state                end                                AS service_state         FROM   icinga_objects AS sgo                INNER JOIN icinga_servicegroups AS sg                        ON sg.servicegroup_object_id = sgo.object_id                           AND sgo.objecttype_id = 4
                       AND sgo.is_active = 1                INNER JOIN icinga_servicegroup_members AS sgm                        ON sgm.servicegroup_id = sg.servicegroup_id                INNER JOIN icinga_objects AS so                        ON so.object_id = sgm.service_object_id
      AND so.objecttype_id = 2                           AND so.is_active = 1                INNER JOIN icinga_services AS s                        ON s.service_object_id = so.object_id                INNER JOIN icinga_hoststatus AS hs                        ON hs.host_object_id = s.host_object_id
           INNER JOIN icinga_servicestatus AS ss                        ON ss.service_object_id = so.object_id         GROUP  BY sgo.object_id,                   sg.servicegroup_id,                   ss.servicestatus_id,                   hs.hoststatus_id         UNION ALL         SELECT sg.alias COLLATE latin1_general_ci AS servicegroup_alias,                sgo.name1                          AS servicegroup_name,                NULL                               AS service_handled,                0                                  AS service_severity,                NULL
    AS service_state         FROM   icinga_objects AS sgo                INNER JOIN icinga_servicegroups AS sg                        ON sg.servicegroup_object_id = sgo.object_id                           AND sgo.objecttype_id = 4                           AND sgo.is_active = 1                LEFT JOIN icinga_servicegroup_members AS esgm                       ON esgm.servicegroup_id = sg.servicegroup_id         GROUP  BY sgo.object_id,                   sg.servicegroup_id         HAVING ( COUNT(esgm.servicegroup_member_id) = 0 );

^CCtrl-C -- query killed. Continuing normally.
ERROR 1317 (70100): Query execution was interrupted
MariaDB [icinga2]> 

The solution

I added an index on icinga_servicegroups using the command:

MariaDB [icinga2]> CREATE INDEX idx_servicegroup_object_id ON icinga_servicegroups (servicegroup_object_id);

Query OK, 0 rows affected (0.008 sec)
Records: 0  Duplicates: 0  Warnings: 0

The explain extended of the query with the index:

+------+-------------+-------+--------+----------------------------------------------------------+------------------------------+---------+-------------------------------+------+----------+----------------------------------------------+
| id   | select_type | table | type   | possible_keys                                            | key                          | key_len | ref                           | rows | filtered | Extra                                        |
+------+-------------+-------+--------+----------------------------------------------------------+------------------------------+---------+-------------------------------+------+----------+----------------------------------------------+
|    1 | PRIMARY     | sgo   | ref    | PRIMARY,objecttype_id,sla_idx_obj                        | sla_idx_obj                  | 12      | const,const                   | 21   |   100.00 | Using index; Using temporary; Using filesort |
|    1 | PRIMARY     | sg    | ref    | PRIMARY,idx_servicegroup_object_id                       | idx_servicegroup_object_id   | 9       | icinga2.sgo.object_id         | 1    |   100.00 |                                              |
|    1 | PRIMARY     | sgm   | ref    | sgmbrs_sgid_soid,idx_servicegroup_members_object_id      | sgmbrs_sgid_soid             | 9       | icinga2.sg.servicegroup_id    | 374  |   100.00 | Using where; Using index                     |
|    1 | PRIMARY     | so    | eq_ref | PRIMARY,objecttype_id,sla_idx_obj                        | PRIMARY                      | 8       | icinga2.sgm.service_object_id | 1    |    50.00 | Using where                                  |
|    1 | PRIMARY     | s     | ref    | services_host_object_id_idx,services_combined_object_idx | services_combined_object_idx | 9       | icinga2.sgm.service_object_id | 1    |   100.00 | Using where; Using index                     |
|    1 | PRIMARY     | hs    | eq_ref | object_id                                                | object_id                    | 9       | icinga2.s.host_object_id      | 1    |   100.00 |                                              |
|    1 | PRIMARY     | ss    | eq_ref | object_id                                                | object_id                    | 9       | icinga2.sgm.service_object_id | 1    |   100.00 |                                              |
|    2 | UNION       | sgo   | ref    | PRIMARY,objecttype_id,sla_idx_obj                        | sla_idx_obj                  | 12      | const,const                   | 21   |   100.00 | Using index; Using temporary; Using filesort |
|    2 | UNION       | sg    | ref    | idx_servicegroup_object_id                               | idx_servicegroup_object_id   | 9       | icinga2.sgo.object_id         | 1    |   100.00 |                                              |
|    2 | UNION       | esgm  | ref    | sgmbrs_sgid_soid                                         | sgmbrs_sgid_soid             | 9       | icinga2.sg.servicegroup_id    | 374  |   100.00 | Using index                                  |
+------+-------------+-------+--------+----------------------------------------------------------+------------------------------+---------+-------------------------------+------+----------+----------------------------------------------+

The query computes now in :

12196 rows in set (0.125 sec)

Your Environment

Include as many relevant details about the environment you experienced the problem in

Additional context

Add any other context about the problem here.

sklaes commented 3 years ago

Hey @Yoda-BZH, please close this issue and reopen it here: https://github.com/Icinga/icingaweb2/

Regarding to your problem this could also help: https://github.com/Icinga/icingaweb2/pull/4324

But please open that icingaweb2 issue in any case.

Yoda-BZH commented 3 years ago

Hello,

The schema to be modified is in this repository (https://github.com/Icinga/icinga2/blob/master/lib/db_ido_mysql/schema/mysql.sql), not in icingaweb2.

sklaes commented 3 years ago

yes the topic can also be seen in this way, but the problem occurs in icingaweb2 and it can (also) be addressed in icingaweb2 – that’s why I recommended to open an icingaweb2 issue.

Al2Klimov commented 3 years ago

@nilmerg Please transfer.

nilmerg commented 3 years ago

The solution is a new index, I cannot see how a similar effect can be achieved with Icinga Web 2. Besides, the same change is done as part of https://github.com/Icinga/icingaweb2/pull/4231.

Al2Klimov commented 2 years ago

CREATE INDEX idx_servicegroup_object_id ON icinga_servicegroups (servicegroup_object_id);

@Yoda-BZH Feel free to open a PR.