[x] further analyses of the dreamFilters query, and any necessary indexes.
Notes for the future:
Multilingual support?
To make the best use of our indexes, and to be explicit, we're using english as the text search configuration and maintaining an index on title || ' ' || description in English. Per the PostgreSQL documentation, we should be able to support multiple languages:
It is possible to set up more complex expression indexes wherein the configuration name is specified by another column, e.g.:
CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector(config_name, body));
That would mean storing the language preference with the UserAccount entity and propagating it to each Dream row. The text search functions would then take this value as a parameter. This would be cool for our international users!
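A rough sketch of what that could look like, following the documentation's pattern. The column name search_config and the index name are hypothetical, and this assumes the configuration is stored per row as a regconfig:

```sql
-- Hypothetical column: per-row text search configuration, defaulting to English.
ALTER TABLE dream
  ADD COLUMN search_config regconfig NOT NULL DEFAULT 'english';

-- Expression index whose configuration comes from that column.
CREATE INDEX dream_search_multilang_idx
  ON dream
  USING GIN (to_tsvector(search_config, title || ' ' || description));

-- The query has to repeat the indexed expression for the planner to consider the index.
-- 'flying over water' is just a placeholder search string.
SELECT title
FROM dream
WHERE to_tsvector(search_config, title || ' ' || description)
      @@ plainto_tsquery(search_config, 'flying over water');
```

The existing english-only index would presumably stay in place until queries are switched over to the column-driven expression.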
More indexes?
Currently, dream.user_id isn't indexed. Should we add an index on it? With my small test data, the planner doesn't seem to use one (user size is ~3% of dream size) -- but it should speed up lookups by user id, probably?
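One way to check would be to create the index and look at the plan. This is only a sketch: the index name is made up, and the literal 1 assumes user_id is an integer key.

```sql
-- Hypothetical index name; assumes dream.user_id references the user table.
CREATE INDEX dream_user_id_idx ON dream (user_id);

-- Refresh planner statistics, then see whether the index is chosen.
ANALYZE dream;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM dream WHERE user_id = 1;  -- 1 is a placeholder id
```

On a small table the planner may still prefer a sequential scan, so the result on test data won't necessarily match production.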
Do we need a partial index on is_private? Not sure what the distribution would be, see https://hakibenita.com/sql-tricks-application-dba -- if there are fewer private dreams than public ones, maybe it's helpful?
A query like this would show the distribution:
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'dream' AND attname='is_private';
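If the stats show that private dreams are a small minority, a partial index might look roughly like this. It assumes is_private is a boolean column; the index name and the choice of user_id as the indexed column are guesses, not something decided in this PR:

```sql
-- Hypothetical: index only the private rows, keyed by owner.
CREATE INDEX dream_private_user_id_idx
  ON dream (user_id)
  WHERE is_private;

-- A query can use it only if its WHERE clause implies the index predicate.
EXPLAIN
SELECT * FROM dream WHERE user_id = 1 AND is_private;  -- 1 is a placeholder id
```

The index stays small because public rows are left out entirely, which is the main appeal when the minority value is the one being filtered on.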
The linked article also mentions the notion of correlation, and how it affects the choice of an index scan vs. a more expensive operation. A correlation close to 1 (or -1) means the physical row order closely follows the column's order, so fewer pages may be read and an index scan will likely be used. Here's how it looks for my test data (which doesn't reflect real trends!)
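For reference, correlation is exposed in the same pg_stats view, so a query along these lines should show it for the columns discussed above (the column list is just the two mentioned in these notes):

```sql
-- Statistics are only populated after ANALYZE has run on the table.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'dream'
  AND attname IN ('user_id', 'is_private');
```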
Will close #9
See notes at:
https://gist.github.com/lfborjas/2fd2d237d5600b392231ae2c472017bb
References
Appendix: some illustrative query plans