endlessm / azafea

Service to track device activations and usage metrics
Mozilla Public License 2.0
10 stars 2 forks source link

Consider BRIN indexes on created_at columns #174

Open wjt opened 2 years ago

wjt commented 2 years ago

We have many very large tables with a creation date column. The obvious example is ping_v1, which has 42 million rows at the time of writing and a created_at column recording the time the ping was received. There is, of course, an index on created_at. Empirically, however, postgresql only uses it for ranges below about 4½ months; beyond that it prefers to perform a seq scan of the whole table, which takes almost a minute.

During a random walk through the PostgreSQL documentation (I need better hobbies) I stumbled upon BRIN indexes.

BRIN stands for Block Range Index. BRIN is designed for handling very large tables in which certain columns have some natural correlation with their physical location within the table. […] For example, a table storing a store's sale orders might have a date column on which each order was placed, and most of the time the entries for earlier orders will appear earlier in the table as well[.]

BRIN indexes can satisfy queries via regular bitmap index scans, and will return all tuples in all pages within each range if the summary info stored by the index is consistent with the query conditions. The query executor is in charge of rechecking these tuples and discarding those that do not match the query conditions — in other words, these indexes are lossy. Because a BRIN index is very small, scanning the index adds little overhead compared to a sequential scan, but may avoid scanning large parts of the table that are known not to contain matching tuples.

created_at actually comes from the timestamp at which eos-activation-server received the ping, and rows of this table may be written by the 2 instances of the azafea daemon pulling from redis in parallel, so it is not guaranteed that rows in physical order are also in created_at order. However they are very strongly correlated. Assuming that the id field correlates exactly with physical order, of the 28771 rows in the past hour at the time of writing, only 63 are not in ascending order when sorted by created_at. Across the entire table of 42 million rows, 91 304 are out of order, by a total of offset 72.

So I believe that switching the created_at index to be a BRIN index would allow it to be used in more cases. It's worth a try at least.

According to the documentation some special maintenance is needed for BRIN indexes though.