rails / solid_cache

A database-backed ActiveSupport::Cache::Store
MIT License
867 stars 61 forks source link

Suggestion: provide doc about DB config for read performance #195

Open PikachuEXE opened 2 months ago

PikachuEXE commented 2 months ago

If users don't know how to make the read performance faster via config (besides gem config) it's hard to even try it out

I am using PSQL but there is no doc for MySQL too (let's include doc for both?)

andyatkinson commented 1 month ago

@PikachuEXE I'm working on a basic blog post integrating Solid Cache with a Rails + Postgres app. What would you like to see on read performance?

For example, do you mean for solid_cache_entries, identifying any missing indexes? I did a quick check of the schema and indexes, but haven't yet studied the queries much.

Did you have other read operation slowness or optimizations in mind?

PikachuEXE commented 1 month ago

I think this gem is quite new to people and there are mainly 2 cases (at first)

Case (1) could be more difficult to "optimize" (I generally avoid doing that even in redis) so I expect some info for (2) Which would be what kind of DB config (or what should be considered) would help improve read performance (info for write performance is less important but welcome)

The schema and indexes shouldn't be a concern for users (provided by gem and should not be different per user)

I have asked on reddit before and you can check some of the answers for what I expect Though I wouldn't agree with use unlogged tables for solid_cache

AxelTheGerman commented 1 month ago

how to make the read performance faster

Why? Are there any performance issues out of the box? If its really about tweaking and maximizing then it might depend a whole lot on your individual set up, e.g. DB version, CPU/RAM/disk performance, number of instances, usage patterns etc

andyatkinson commented 1 month ago

@AxelTheGerman Agree best to hold off until there's significant data and query volumes for solid cache data, before getting into database tuning. Even then, it will depend on the available system resources and how to allocate them, including memory, CPU etc.

@PikachuEXE With that said above, I've published the post I mentioned earlier with the basics of Solid Cache and PostgreSQL, and did get into some generic server instance tuning guidance, considering read and write optimizations. I'm not intending for this to be directly related to your issue here as documentation for solid cache, in part because it's specific to PostgreSQL, however I wanted to share it for you or anyone else that's interested in exploring these kinds of optimizations to see what's possible. I'm happy to receive any feedback or questions. https://andyatkinson.com/solid-cache-rails-postgresql

I'm eager to see how solid cache is adopted, and to learn about big deployments with large amounts of data and query volume in cache.

PikachuEXE commented 1 month ago

@andyatkinson Great article for beginners (as gem users) :) Though there are some places with seems to be mistakes (but I can be wrong)

As discussed earlier, we could disable write ahead logging (WAL) for the solid_cache_entries table to reduce arguably unnecessary write IO related to cache entries. If Postgres were to restart, this means solid_cache_entries will be truncated, so keep that in mind. Keep the table logged if you need to be guaranteed this data will exist following a restart.

From https://www.postgresql.org/docs/current/sql-createtable.html

If specified, the table is created as an unlogged table. Data written to unlogged tables is not written to the write-ahead log (see Chapter 30), which makes them considerably faster than ordinary tables. However, they are not crash-safe: an unlogged table is automatically truncated after a crash or unclean shutdown. The contents of an unlogged table are also not replicated to standby servers. Any indexes created on an unlogged table are automatically unlogged as well.

I think If Postgres were to restart is not clear enough to represent a crash or unclean shutdown (not normal shutdown)

Keep the table logged if you need to be guaranteed this data will exist following a restart. ALTER TABLE solid_cache_entries SET UNLOGGED;

Do you mean ALTER TABLE solid_cache_entries SET LOGGED; here?

I think there are more possible optimizations like longer checkpoint distance & synchronous_commit = off (from https://www.cybertec-postgresql.com/en/reduce-wal-by-increasing-checkpoint-distance/#increasing-checkpoint-distances) can be mentioned

andyatkinson commented 1 month ago

Hi @PikachuEXE. I rushed this out a bit, so I appreciate you reading it and providing corrections and additions. I've incorporated your feedback and credited you at the end of the post. I'll take one more pass over it, but here's a summary of changes:

Thanks!