Investigate postgres performance again

This comment illustrates an example of an exceptionally slow-running query.

Another extremely slow operation is

CREATE INDEX vw__idx_practice_by_chem ON vw__chemical_summary_by_practice USING btree (chemical_id varchar_pattern_ops, practice_id)

At times this is CPU-bound; when not, it's doing reading/writing at 100Mb/S.

Similary, NewRelic shows us that the slowest running query of late is:



SELECT
  pr.practice_id AS row_id,
  pc.name AS row_name,
  pc.setting AS setting,
  pc.ccg_id AS ccg,
  pr.processing_date AS date,
  SUM(pr.cost) AS actual_cost,
  SUM(pr.items) AS items,
  SUM(pr.quantity) AS quantity
FROM
  vw__chemical_summary_by_practice pr
JOIN
  frontend_practice pc
ON
  pr.practice_id=pc.code
WHERE
  (pr.chemical_id=%s )
  AND (pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s
    OR pr.practice_id=%s )
GROUP BY
  pr.practice_id,
  pc.code,
  date
ORDER BY
  date,
  pr.practice_id```

On the index creation example. This is a 68GB table with 28GB index.

When /mnt/database is being read (100 - 180 MB/s) its utlisation in iostat is at 100% and IO wait is therefore high.

Even though the table is large, and the read is quite slow, that should still take no more than 10 mins to read the entire thing, but the index creation took at least 2 hours (I think 3, in fact)
Autovacuum of the previously-created table happens at the same time
We sometimes see WALWriteLock events in the pg_stat_activity table. Are we seeing some kind of contention with users reading data? Investigate locking documentation. Seems like only a DROP TABLE will block a SELECT, so I don't think this is relevant, unless there is some parallelism I'm missing.
- DROP TABLE creates ACCESS EXCLUSIVE locks which will block everything else
- CREATE INDEX (non concurrently) creates SHARE lock which will block VACUUM, ANALYZE and CREATE INDEX (concurrently)
- COPY INTO creates ROW EXCLUSIVE locks which will block CREATE INDEX

Another data point: during the following operation, IO was at 100% (all read ops) and it took forever (i.e. well over 30 mins) - no index, so was doing a full table scan:

DELETE FROM vw__chemical_summary_by_practice WHERE processing_date = '2011-11-01';

Responding to a specific query from @richiecroker.

Seems that the following query is timing out:

/api/1.0/spending_by_ccg/?format=json&code=0408010AEBE,0408010AEBF,0408010AEBD,0408010AEBC

The pathological behaviour (resorting to a full table scan) appears related to the GROUP BY date clause.

This results in a sequence scan of the table:

EXPLAIN SELECT
  pr.pct_id AS pct_id,
  pr.processing_date AS date
FROM
  vw__presentation_summary_by_ccg pr
WHERE
  (pr.presentation_code LIKE '0408010AEBE%' OR
  pr.presentation_code LIKE '0408010AEBF%' OR
  pr.presentation_code LIKE '0408010AEBF%' 
)
GROUP BY
  pr.pct_id,
  date
LIMIT 1;

                                                                                           QUERY PLAN                                                                                            
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2555605.18..2555605.19 rows=1 width=8)
   ->  Group  (cost=2555605.18..2555758.32 rows=14741 width=8)
         Group Key: pct_id, processing_date
         ->  Sort  (cost=2555605.18..2555656.23 rows=20419 width=8)
               Sort Key: pct_id, processing_date
               ->  Seq Scan on vw__presentation_summary_by_ccg pr  (cost=0.00..2554143.42 rows=20419 width=8)
                     Filter: (((presentation_code)::text ~~ '0408010AEBE%'::text) OR ((presentation_code)::text ~~ '0408010AEBF%'::text) OR ((presentation_code)::text ~~ '0408010AEBF%'::text))

But if we remove the GROUP BY ... date it uses an index scan:

EXPLAIN SELECT
  pr.pct_id AS pct_id,
  MAX(pr.processing_date) AS date
FROM
  vw__presentation_summary_by_ccg pr
WHERE
  (pr.presentation_code LIKE '0408010AEBE%' OR
  pr.presentation_code LIKE '0408010AEBF%' OR
  pr.presentation_code LIKE '0408010AEBF%' 
)
GROUP BY
  pr.pct_id
LIMIT 1;

                                                                                        QUERY PLAN                                                                                         
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..6729.87 rows=1 width=8)
   ->  GroupAggregate  (cost=0.57..3256983.04 rows=484 width=8)
         Group Key: pct_id
         ->  Index Scan using idx_vw__presentation_summary_by_ccg_on_code_and_pct on vw__presentation_summary_by_ccg pr  (cost=0.57..3256876.11 rows=20419 width=8)
               Filter: (((presentation_code)::text ~~ '0408010AEBE%'::text) OR ((presentation_code)::text ~~ '0408010AEBF%'::text) OR ((presentation_code)::text ~~ '0408010AEBF%'::text))

That was the plan; the actual is:

                                                                                                      QUERY PLAN                                                                                                       
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..6729.87 rows=1 width=8) (actual time=73279.001..73279.002 rows=1 loops=1)
   ->  GroupAggregate  (cost=0.57..3256983.04 rows=484 width=8) (actual time=73279.000..73279.000 rows=1 loops=1)
         Group Key: pct_id
         ->  Index Scan using idx_vw__presentation_summary_by_ccg_on_code_and_pct on vw__presentation_summary_by_ccg pr  (cost=0.57..3256876.11 rows=20419 width=8) (actual time=68634.368..73278.979 rows=17 loops=1)
               Filter: (((presentation_code)::text ~~ '0408010AEBE%'::text) OR ((presentation_code)::text ~~ '0408010AEBF%'::text) OR ((presentation_code)::text ~~ '0408010AEBF%'::text))
               Rows Removed by Filter: 1671264
 Planning time: 1.436 ms
 Execution time: 73279.107 ms

It is also interesting to compare the above two variations but with fewer conditions.

Grouping by date, with a single condition:

EXPLAIN ANALYZE SELECT
  pr.pct_id AS pct_id,
  pr.processing_date AS date
FROM
  vw__presentation_summary_by_ccg pr
WHERE
  (pr.presentation_code LIKE '0408010AEBE%' 
)
GROUP BY
  pr.pct_id,
  date
LIMIT 1;


                                                                                                        QUERY PLAN                                                                                                         
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1056045.23..1056045.24 rows=1 width=8) (actual time=6212.168..6212.169 rows=1 loops=1)
   ->  Group  (cost=1056045.23..1056096.28 rows=6080 width=8) (actual time=6212.166..6212.166 rows=1 loops=1)
         Group Key: pct_id, processing_date
         ->  Sort  (cost=1056045.23..1056062.25 rows=6807 width=8) (actual time=6212.163..6212.163 rows=1 loops=1)
               Sort Key: pct_id, processing_date
               Sort Method: quicksort  Memory: 388kB
               ->  Index Scan using idx_vw__presentation_summary_by_ccg_on_code_and_pct on vw__presentation_summary_by_ccg pr  (cost=0.57..1055611.87 rows=6807 width=8) (actual time=210.807..6207.190 rows=4161 loops=1)
                     Index Cond: (((presentation_code)::text ~>=~ '0408010AEBE'::text) AND ((presentation_code)::text ~<~ '0408010AEBF'::text))
                     Filter: ((presentation_code)::text ~~ '0408010AEBE%'::text)
 Planning time: 0.421 ms
 Execution time: 6212.292 ms
(11 rows)

And without the GROUP BY:


EXPLAIN ANALYZE SELECT
  pr.pct_id AS pct_id,
  MAX(pr.processing_date) AS date
FROM
  vw__presentation_summary_by_ccg pr
WHERE
  (pr.presentation_code LIKE '0408010AEBE%'
)
GROUP BY
  pr.pct_id
LIMIT 1;

                                                                                                    QUERY PLAN                                                                                                    
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.57..2181.66 rows=1 width=8) (actual time=145.891..145.892 rows=1 loops=1)
   ->  GroupAggregate  (cost=0.57..1055650.74 rows=484 width=8) (actual time=145.889..145.889 rows=1 loops=1)
         Group Key: pct_id
         ->  Index Scan using idx_vw__presentation_summary_by_ccg_on_code_and_pct on vw__presentation_summary_by_ccg pr  (cost=0.57..1055611.87 rows=6807 width=8) (actual time=123.015..145.872 rows=17 loops=1)
               Index Cond: (((presentation_code)::text ~>=~ '0408010AEBE'::text) AND ((presentation_code)::text ~<~ '0408010AEBF'::text))
               Filter: ((presentation_code)::text ~~ '0408010AEBE%'::text)
 Planning time: 0.345 ms
 Execution time: 145.967 ms

I presume the reason it uses a sequence scan is because the date is not in the index.

Next steps is to benchmark our read iops. We should also consider changing the index on that table to cover date, too.

iops

The index being scanned is 342MB; the table size is 10GB.

According to iozone tests, the average read throughput is 1429mB/sec and the average write throughput is 509mB/sec, so that seems fine (note that the "local" disk rather than the "mounted" storage gets approx 200% read and 150% write).

Copying a 20GB file from /mnt/database to /tmp/ took 2m 13s and made iowait go up to 25%; the total copy rate was around 162mB/sec.

This means scanning a 10GB table is usually going to take more than a minute, and disk performance is not really an issue

sequence scan

The fix must be to avoid the sequence scan, somehow .

Separate indexes for chemical, presentation

We generally allow search on section, paragraph, chemical, or product, or "product format" (presentation). In this case, the table in question is selected for queries against product or presentation.

We may find an improvement if we don't use LIKE queries, but instead index on product and presentations separately:

CREATE INDEX  idx_products ON vw__presentation_summary_by_ccg(substring(presentation_code from 1 for 11));

Then:

SELECT * from vw__presentation_summary_by_ccg WHERE substring(presentation_code from 1 for 11) = '0408010AEBF'

We could also try a GIN index:

create index test_trgm_gin on test using gin (some_text gin_trgm_ops);

However, creating a GIN index can be very slow.

This is the clearest, most pertinent documentation I've seen regarding the problem.

First we have to note that an OR query has to involve multiple bitmap index scans. From that it follows that the repeated OR query portion has to be very fast for the overall operation to be performant.

A simple query against the bnf code was very slow, because there was only one compound index defined with the presentation code as the second column. This required an index scan, which is relatively slow.

Presumably the loss of this index has contributed to this problem.

I have added...

CLUSTER vw__presentation_summary_by_ccg USING idx_vw__presentation_summary_by_ccg_on_code_and_pct

...which takes about 19 min to build, and greatly improves performance

ebmdatalab / openprescribing