For most queries, the query engine will add an active nodes filter to
exclude deactivated and expired nodes from the results. Without this
index, that filter will generate a Seq Scan of the certnames table. If
that table is relatively small, and the write-rate of the database is
slow enough, the certnames table is almost entirely in the buffer cache
and the filter is quite quick. But when testing with a large number of
nodes, which creates a high write rate, the updating of the
latest_report_id dirties the certnames buffer cache pages quickly. If
queries are issued infrequently, the filter clause can create a lot of
reads and writes, which slow it down dramatically. This creates highly
variable query performance, in testing on 100,000 nodes it was as fast
as 80ms on repeated queries, and as slow as 16 seconds when issues
infrequently.
-> Seq Scan on certnames (cost=0.00..82597.64 rows=1 width=10) (actual time=16869.675..16869.676 rows=0 loops=1)
Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
Rows Removed by Filter: 100003
Buffers: shared hit=427 read=80322 dirtied=10 written=45739
Add an index to sidestep the certnames table entirely.
-> Index Only Scan using certnames_not_active_idx on certnames (cost=0.12..8.04 rows=1 width=10) (actual time=0.002..0.002 rows=0 loops=1)
Heap Fetches: 0
Buffers: shared hit=1
Not only did this stabilize the amount of time to query for active
nodes, it reduced the overall query time in a test install of 100,000
nodes from the at best 80ms above, to 0.4 to 1.6ms.
For most queries, the query engine will add an active nodes filter to exclude deactivated and expired nodes from the results. Without this index, that filter will generate a Seq Scan of the certnames table. If that table is relatively small, and the write-rate of the database is slow enough, the certnames table is almost entirely in the buffer cache and the filter is quite quick. But when testing with a large number of nodes, which creates a high write rate, the updating of the latest_report_id dirties the certnames buffer cache pages quickly. If queries are issued infrequently, the filter clause can create a lot of reads and writes, which slow it down dramatically. This creates highly variable query performance, in testing on 100,000 nodes it was as fast as 80ms on repeated queries, and as slow as 16 seconds when issues infrequently.
Add an index to sidestep the certnames table entirely.
Not only did this stabilize the amount of time to query for active nodes, it reduced the overall query time in a test install of 100,000 nodes from the at best 80ms above, to 0.4 to 1.6ms.