rails / solid_cache

A database-backed ActiveSupport::Cache::Store
MIT License

Do you recommend using a separate database from your application database as the cache store? #130

Closed flipsasser closed 7 months ago

flipsasser commented 9 months ago

Hi there! We are toying with putting this on staging and seeing how it works (and thank you to the creators, by the way - this is an excellent cost saver for those of us sick of paying through the nose for huge Redis instances!). Before we try it out, though, I thought it would be a good idea to check on a "recommended" setup.

No warranty implied, obviously; but I'm curious if anyone has real-world experience they could point to? Did anyone find that using their application database as a cache store caused a significant enough increase in load that it caused you issues? Was it something you found you could gradually move to, or is the infrastructure change something you'd recommend right away? Not a huge deal either way, but if any real-world users have any insight, I'd be super grateful!

I realize, of course, that no two production environments are identical. I'm just curious what other people's experience has been.

Thanks again for everything!

djmb commented 9 months ago

On HEY.com, we are using separate databases.

Worry about load was one of the reasons for that, but the cache database load is about 20% of the load on our primary database. So capacity-wise it would have been fine I think.

I don't have similar numbers for Basecamp, but it's likely that it would be a closer call there, as we generate a higher amount of cache traffic per request and the overall traffic levels are higher.

What works will depend a lot on your cache query patterns. You should be able to track your read and write traffic from the Redis INFO command - you'll be adding a SELECT for each read and an INSERT for each write. The cache queries should all be very fast though - there are no index scans and the number of records returned is generally low.
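As a rough way to turn that advice into numbers, the sketch below parses the text returned by Redis `INFO commandstats` and sums the call counts for read and write commands. The sample output, its figures, and the choice of which commands count as reads or writes are assumptions for illustration; check which commands your own cache client actually issues. The counters are cumulative since the server started (or since the stats were last reset), so compare two snapshots to get a rate.

```ruby
# Sample `INFO commandstats` output. The numbers are made up for illustration.
SAMPLE_COMMANDSTATS = <<~INFO
  # Commandstats
  cmdstat_get:calls=9000,usec=18000,usec_per_call=2.00
  cmdstat_mget:calls=500,usec=2500,usec_per_call=5.00
  cmdstat_set:calls=1200,usec=3600,usec_per_call=3.00
  cmdstat_setex:calls=300,usec=900,usec_per_call=3.00
  cmdstat_del:calls=100,usec=200,usec_per_call=2.00
INFO

# Assumption: these are the commands your cache client uses for reads and writes.
READ_COMMANDS = %w[get mget].freeze
WRITE_COMMANDS = %w[set setex psetex].freeze

# Builds a hash of command name => number of calls from the INFO text.
def command_calls(info_text)
  info_text.each_line.with_object({}) do |line, calls|
    match = line.match(/\Acmdstat_(?<command>[^:]+):calls=(?<calls>\d+)/)
    calls[match[:command]] = match[:calls].to_i if match
  end
end

# Treats each read call as roughly one SELECT and each write call as one INSERT.
def estimated_sql_traffic(info_text)
  calls = command_calls(info_text)
  {
    selects: calls.values_at(*READ_COMMANDS).compact.sum,
    inserts: calls.values_at(*WRITE_COMMANDS).compact.sum
  }
end

puts estimated_sql_traffic(SAMPLE_COMMANDSTATS)
```

Comparing these totals against the query volume your primary database already handles gives a first estimate of the extra load, before trying anything on staging.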

A nice thing about separate databases is that you can tune them for reduced resilience and higher performance if you want (assuming that you have access to settings to do that).

If you do use the same database, you might still want to set up a different connection pool pointing to it.
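A minimal sketch of what that could look like, assuming a Rails multi-database `config/database.yml`: a second entry (here called `cache`, a hypothetical name) points at the same database as `primary`, so Rails builds a separate connection pool for it. The adapter, database name, and pool sizes are placeholders, and how you tell Solid Cache to use the `cache` entry depends on the version you run, so check its README. The Ruby wrapper only parses the YAML to show that both entries resolve to the same database.

```ruby
require "yaml"

# Hypothetical database.yml content: two named entries, one underlying database.
DATABASE_YML = <<~YML
  production:
    primary:
      adapter: postgresql
      database: app_production
      pool: 10
    cache:
      adapter: postgresql
      database: app_production
      pool: 5
YML

PRODUCTION_CONFIG = YAML.safe_load(DATABASE_YML).fetch("production")

# Each named entry gets its own pool, even though the database is shared.
PRODUCTION_CONFIG.each do |name, settings|
  puts "#{name}: database=#{settings['database']} pool=#{settings['pool']}"
end
```

Keeping the pools separate means cache queries cannot exhaust the connections your application queries rely on, and it leaves the door open to moving the cache to its own database later by changing only the `cache` entry.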

Finally, if you use replica databases, cache writes will be subject to replication lag, which could lead to surprising results. We avoid this in HEY and Basecamp by using separate cache databases that are not replicated - instead we shard them for resilience. If we lose one database we only lose 25% of the cache.
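To illustrate the arithmetic, the 25% figure is what you would expect if keys are spread evenly over four shards. The sketch below assumes four hypothetical shard names and a plain hash-modulo assignment; it is not Solid Cache's own key distribution, which may use a different scheme.

```ruby
require "digest"

# Hypothetical shard names; four shards is an assumption based on the 25% figure.
SHARDS = %i[cache_shard1 cache_shard2 cache_shard3 cache_shard4].freeze

# Picks a shard from a hash of the key, so the same key always maps to the same shard.
def shard_for(key)
  SHARDS[Digest::SHA256.hexdigest(key)[0, 8].to_i(16) % SHARDS.size]
end

# Fraction of the given keys that lived on the shard that was lost.
def fraction_lost(keys, lost_shard)
  keys.count { |key| shard_for(key) == lost_shard }.fdiv(keys.size)
end

SAMPLE_KEYS = Array.new(10_000) { |i| "views/posts/#{i}" }

puts fraction_lost(SAMPLE_KEYS, :cache_shard1).round(3)
```

With an even spread, the printed fraction should land close to one quarter, and the remaining shards keep serving their share of the cache while the lost one is rebuilt.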

flipsasser commented 9 months ago

Thank you so much for this guidance! I can't tell you how much I appreciate this thoughtful and helpful reply, which you were under no obligation to take the time to write. You're a class act.

AxelTheGerman commented 1 month ago

I wonder if this deserves a small section in the README, or maybe even just a link to the issue comment. Very useful information for anyone setting up solid_cache in a project and having to decide which route to go.