29th / personnel-v3

Personnel management system version 3
https://www.29th.org
2 stars 6 forks source link

Optimize unit attendance queries #303

Closed wilson29thid closed 3 months ago

wilson29thid commented 3 months ago

Unit attendance pages take a very long time to load. Need to run EXPLAIN ANALYZE on the query and dig in to why.

wilson29thid commented 3 months ago

query

WITH `attendance_stats` AS
  (SELECT `attendance`.`event_id`,
          sum(attended = TRUE) AS attended_count,
          count(id) AS expected_count,
          sum(attended = FALSE
              AND excused = FALSE) AS absent_count
   FROM `attendance`
   GROUP BY `attendance`.`event_id`)
SELECT *,
       `attendance_stats`.`attended_count`,
       `attendance_stats`.`expected_count`,
       `attendance_stats`.`absent_count`
FROM `events`
LEFT OUTER JOIN `attendance_stats` ON `attendance_stats`.`event_id` = `events`.`id`
WHERE `events`.`unit_id` IN
    (SELECT `units`.`id`
     FROM `units`
     WHERE (`units`.`ancestry` LIKE '1631/%'
            OR `units`.`ancestry` = '1631'
            OR `units`.`id` = 1631))
  AND `events`.`starts_at` <= '2024-05-01'
ORDER BY `events`.`starts_at` DESC
LIMIT 25
OFFSET 0

explain

id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY units index PRIMARY,index_units_on_ancestry index_units_on_ancestry 768 1797 20.04 Using where; Using index; Using temporary; Using filesort
1 PRIMARY events ref Unit ID,index_events_on_starts_at Unit ID 4 personnel_v2.units.id 17 50.00 Using where
1 PRIMARY ref 3 personnel_v2.events.id 20 100.00
2 DERIVED attendance index event and member,Event ID Event ID 3 608556 100.00

explain format=tree

-> Limit: 25 row(s)
    -> Sort: `events`.starts_at DESC, limit input to 25 row(s) per chunk
        -> Stream results  (cost=191083120.71 rows=1910650016)
            -> Nested loop left join  (cost=191083120.71 rows=1910650016)
                -> Nested loop inner join  (cost=2381.70 rows=3140)
                    -> Filter: ((units.ancestry like '1631/%') or (units.ancestry = '1631') or (units.id = 1631))  (cost=183.95 rows=360)
                        -> Index scan on units using index_units_on_ancestry  (cost=183.95 rows=1797)
                    -> Filter: (`events`.starts_at <= TIMESTAMP'2024-05-01 00:00:00')  (cost=4.36 rows=9)
                        -> Index lookup on events using Unit ID (unit_id=units.id)  (cost=4.36 rows=17)
                -> Index lookup on attendance_stats using <auto_key0> (event_id=`events`.id)
                    -> Materialize CTE attendance_stats  (cost=182879.55..182879.55 rows=608556)
                        -> Group aggregate: sum((attendance.attended = true)), count(attendance.id), sum(((attendance.attended = false) and (attendance.excused = false)))  (cost=122023.95 rows=608556)
                            -> Index scan on attendance using Event ID  (cost=61168.35 rows=608556)

explain analyze

-> Limit: 25 row(s)  (actual time=1005.935..1005.941 rows=25 loops=1)
    -> Sort: `events`.starts_at DESC, limit input to 25 row(s) per chunk  (actual time=1005.934..1005.939 rows=25 loops=1)
        -> Stream results  (cost=191087830.18 rows=1910650016) (actual time=743.997..990.440 rows=33976 loops=1)
            -> Nested loop left join  (cost=191087830.18 rows=1910650016) (actual time=743.979..966.377 rows=33976 loops=1)
                -> Nested loop inner join  (cost=7091.17 rows=3140) (actual time=1.510..195.187 rows=33976 loops=1)
                    -> Filter: ((units.ancestry like '1631/%') or (units.ancestry = '1631') or (units.id = 1631))  (cost=183.95 rows=360) (actual time=0.060..0.784 rows=1782 loops=1)
                        -> Covering index scan on units using index_units_on_ancestry  (cost=183.95 rows=1797) (actual time=0.054..0.421 rows=1797 loops=1)
                    -> Filter: (`events`.starts_at <= TIMESTAMP'2024-05-01 00:00:00')  (cost=17.44 rows=9) (actual time=0.007..0.108 rows=19 loops=1782)
                        -> Index lookup on events using Unit ID (unit_id=units.id)  (cost=17.44 rows=17) (actual time=0.007..0.107 rows=19 loops=1782)
                -> Index lookup on attendance_stats using <auto_key0> (event_id=`events`.id)  (actual time=0.000..0.001 rows=1 loops=33976)
                    -> Materialize CTE attendance_stats  (cost=183817.80..183817.80 rows=608556) (actual time=762.199..766.164 rows=29939 loops=1)
                        -> Group aggregate: sum((attendance.attended = true)), count(attendance.id), sum(((attendance.attended = false) and (attendance.excused = false)))  (cost=122962.20 rows=608556) (actual time=3.361..724.048 rows=29939 loops=1)
                            -> Index scan on attendance using Event ID  (cost=62106.60 rows=608556) (actual time=3.339..658.367 rows=609414 loops=1)
wilson29thid commented 3 months ago

@swomma had the good idea to cache the counts in the events table. We could use database triggers or the counter_culture gem for that.