archlinux / archweb

Arch Linux website code
https://archlinux.org
GNU General Public License v2.0
314 stars 128 forks source link

/mirror/status/ is slow #517

Open jelly opened 1 month ago

jelly commented 1 month ago

Loading /mirrors/status/ takes ~ 2 seconds on our production env. this needs some optimisation.

christian-heusel commented 1 month ago

Can we just precompute it via cronjob every idk 20 min and serve the result? 😆

jelly commented 1 month ago

So some analysis, locally nothing is shown because I need to tweak the DEFAULT_CUTOFF to DEFAULT_CUTOFF = timedelta(hours=100) in mirrors/utils.py`

SELECT "mirrors_mirrorlog"."url_id",
       "mirrors_mirrorlog"."error",
       COUNT("mirrors_mirrorlog"."error") AS "error_count",
       MAX("mirrors_mirrorlog"."check_time") AS "last_occurred"
  FROM "mirrors_mirrorlog"
 INNER JOIN "mirrors_mirrorurl"
    ON ("mirrors_mirrorlog"."url_id" = "mirrors_mirrorurl"."id")
 INNER JOIN "mirrors_mirror"
    ON ("mirrors_mirrorurl"."mirror_id" = "mirrors_mirror"."id")
 WHERE ("mirrors_mirrorlog"."check_time" >= '2024-07-18T14:12:56.565672+00:00'::timestamptz AND NOT "mirrors_mirrorlog"."is_success" AND "mirrors_mirror"."public" AND "mirrors_mirrorurl"."active" AND "mirrors_mirror"."active" AND "mirrors_mirror"."public")
 GROUP BY "mirrors_mirrorlog"."url_id",
          "mirrors_mirrorlog"."error"
 ORDER BY 4 DESC,
          3 DESC

This query locally takes 500 ms(!), on our live env the page consistenly loads in ~ 2 seconds.

                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=525187.86..525234.16 rows=18518 width=31)
   Sort Key: (max(mirrors_mirrorlog.check_time)) DESC, (count(mirrors_mirrorlog.error)) DESC
   ->  GroupAggregate  (cost=413.78..523875.25 rows=18518 width=31)
         Group Key: mirrors_mirrorlog.url_id, mirrors_mirrorlog.error
         ->  Incremental Sort  (cost=413.78..523504.89 rows=18518 width=23)
               Sort Key: mirrors_mirrorlog.url_id, mirrors_mirrorlog.error
               Presorted Key: mirrors_mirrorlog.url_id
               ->  Nested Loop  (cost=1.13..522889.88 rows=18518 width=23)
                     ->  Nested Loop  (cost=0.56..889.93 rows=1027 width=4)
                           ->  Index Scan using mirrors_mirrorurl_pkey on mirrors_mirrorurl  (cost=0.28..310.97 rows=1685 width=8)
                                 Filter: active
                           ->  Memoize  (cost=0.29..0.75 rows=1 width=4)
                                 Cache Key: mirrors_mirrorurl.mirror_id
                                 Cache Mode: logical
                                 ->  Index Scan using mirrors_mirror_pkey on mirrors_mirror  (cost=0.28..0.74 rows=1 width=4)
                                       Index Cond: (id = mirrors_mirrorurl.mirror_id)
                                       Filter: (public AND active AND public)
                     ->  Index Scan using mirrors_mirrorlog_url_id_check_time on mirrors_mirrorlog  (cost=0.56..508.02 rows=26 width=23)
                           Index Cond: ((url_id = mirrors_mirrorurl.id) AND (check_time >= '2024-07-18 16:12:56.565672+02'::timestamp with time zone))
                           Filter: (NOT is_success)
 JIT:
   Functions: 24
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(23 rows)

This is get_mirror_errors which is not cached by us and thus explains why the page loads slowly.

get_mirror_statuses is also slow ~ 300 ms locally:

archweb=> explain SELECT l.url_id,
       u.mirror_id,
       COUNT(l.id) AS check_count,
       COUNT(l.last_sync) AS success_count,
       MAX(l.last_sync) AS last_sync,
       MAX(l.check_time) AS last_check,
       AVG(l.duration) AS duration_avg,
       STDDEV(l.duration) AS duration_stddev,
       AVG(check_time - last_sync) AS delay
  FROM mirrors_mirrorlog l
  JOIN mirrors_mirrorurl u
    ON u.id = l.url_id
 WHERE l.check_time >= '2024-07-18T14:12:56.193089+00:00'::timestamptz
 GROUP BY l.url_id,
          u.mirror_id;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=617361.15..623734.47 rows=192402 width=72)
   Group Key: l.url_id, u.mirror_id
   Planned Partitions: 8
   ->  Nested Loop  (cost=0.85..597158.94 rows=192402 width=36)
         ->  Index Scan using mirrors_mirrorlog_check_time on mirrors_mirrorlog l  (cost=0.56..591390.08 rows=192402 width=32)
               Index Cond: (check_time >= '2024-07-18 16:12:56.193089+02'::timestamp with time zone)
         ->  Memoize  (cost=0.29..0.77 rows=1 width=8)
               Cache Key: l.url_id
               Cache Mode: logical
               ->  Index Scan using mirrors_mirrorurl_pkey on mirrors_mirrorurl u  (cost=0.28..0.76 rows=1 width=8)
                     Index Cond: (id = l.url_id)
 JIT:
   Functions: 14
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(14 rows)
jelly commented 1 month ago

Applied a workaround for now in 6edcaaa474f26db79b7434c3d0ccce461b0bd89a

HBwilliam commented 1 month ago

have you tried

SELECT "mirrors_mirrorlog"."url_id",
       "mirrors_mirrorlog"."error",
       COUNT("mirrors_mirrorlog"."error") AS "error_count",
       MAX("mirrors_mirrorlog"."check_time") AS "last_occurred"
  FROM "mirrors_mirrorlog"
 INNER JOIN "mirrors_mirrorurl"
    ON ("mirrors_mirrorlog"."url_id" = "mirrors_mirrorurl"."id" AND "mirrors_mirrorurl"."active")
 INNER JOIN "mirrors_mirror"
    ON ("mirrors_mirrorurl"."mirror_id" = "mirrors_mirror"."id" AND "mirrors_mirror"."public" AND "mirrors_mirror"."active")
 WHERE "mirrors_mirrorlog"."check_time" >= '2024-07-18T14:12:56.565672+00:00'::timestamptz
       AND NOT "mirrors_mirrorlog"."is_success"
 GROUP BY "mirrors_mirrorlog"."url_id",
          "mirrors_mirrorlog"."error"
 ORDER BY 4 DESC,
          3 DESC

filtering public and active earlier should help, fingers crossed