laravel / ideas

Issues board used for Laravel internals discussions.

[Feature] Cache stampede protection #1733

Open ejunker opened 5 years ago

ejunker commented 5 years ago

When a cache key expires, multiple requests for that key can arrive at the same time, and each process then tries to recalculate the cache value. This is commonly referred to as a "stampede". I noticed that other cache implementations have stampede protection, and I think it would be a good addition to Laravel. I was primarily thinking that Cache::remember() could place a lock or mutex while the closure is running, and either respond with stale data if it is available or wait until the closure finishes and has the result. Maybe \Illuminate\Cache\Lock could be used.
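To make the idea concrete, here is a minimal, language-neutral sketch in Python. It is not Laravel's API: the class `StaleWhileLockedCache`, its in-memory dictionary, and the `remember(key, ttl, compute)` signature are all hypothetical stand-ins for a cache store and Cache::remember(). One caller recomputes under a per-key lock, while the others get the stale value if one exists or wait for the result otherwise.

```python
import threading
import time


class StaleWhileLockedCache:
    """In-memory stand-in for a cache store (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}    # key -> (value, expires_at); stale entries are kept
        self._locks = {}   # key -> threading.Lock
        self._guard = threading.Lock()

    def _lock_for(self, key):
        # Create the per-key lock lazily, guarded so two threads share one lock.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def remember(self, key, ttl, compute, now=time.monotonic):
        entry = self._data.get(key)
        if entry is not None and now() < entry[1]:
            return entry[0]                      # fresh hit

        lock = self._lock_for(key)
        if lock.acquire(blocking=False):
            try:
                # Re-check: another caller may have refreshed the key already.
                entry = self._data.get(key)
                if entry is not None and now() < entry[1]:
                    return entry[0]
                value = compute()
                self._data[key] = (value, now() + ttl)
                return value
            finally:
                lock.release()                   # released even if compute() raises

        if entry is not None:
            return entry[0]                      # someone else is refreshing: serve stale

        with lock:                               # no stale value: wait for the refresher
            pass
        entry = self._data.get(key)
        if entry is not None:
            return entry[0]
        return compute()                         # the refresher failed; compute ourselves
```

Releasing the lock in a `finally` block matters: if the closure throws and the lock is never released, every other caller is left waiting on it.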

https://blog.tedivm.com/rants/2014/10/a-walkthrough-of-psr-6-caching/

When implementing stampede protection, for instance, there is often a lock or flag that is set to let other processes know that one is working on the refresh so they don’t also do it and overload the system. In cases where there’s an error and the flag isn’t reset (such as when an exception is thrown by the refresh code) it can have serious repercussions for the system as all of those processes wait for that refreshed value. With the Pool/Item model the developer can put something to clear the lock right in the Item’s destructor so it gets cleared right when the item is out of scope.

https://github.com/matthiasmullie/scrapbook/blob/master/src/Scale/StampedeProtector.php

This class is designed to counteract that: if a value can't be found in cache we'll write something else to cache for a short period of time, to indicate that another process has already requested this same key (and is probably already performing that complex operation that will result in the key being filled)

All of the follow-up requests (that find that the "stampede indicator" has already been set) will just wait (usleep): instead of crippling the servers by all having to execute the same operation, these processes will just idle to give the first process the chance to fill in the cache. Periodically, these processes will poll the cache to see if the value has already been stored in the meantime.

The stampede protection will only be temporary, for $sla milliseconds. We need to limit it because the first process (tasked with filling the cache after executing the expensive operation) may fail/crash/... If the expensive operation fails to conclude in < $sla milliseconds. This class guarantees that the stampede will hold off for $sla amount of time but after that, all follow-up requests will go through without cached values and cause a stampede after all, if the initial process fails to complete within that time.
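The behaviour described in these quoted comments can be sketched roughly as follows. This is a simplified Python illustration, not the scrapbook code: the function name, the `.stampede` marker suffix, and the plain dictionary used as the store are assumptions. Because a dictionary has no TTLs, the marker stores its own deadline instead of expiring by itself.

```python
import time


def get_with_stampede_protection(store, key, compute, sla_ms=1000, poll_ms=10,
                                 clock=time.monotonic, sleep=time.sleep):
    """Return the cached value, letting only the first caller compute it.

    `store` is a plain dict here; a real cache would set the marker with an
    atomic "add if absent" operation and a TTL of `sla_ms`.
    A value of None is treated as "not cached".
    """
    marker = key + ".stampede"          # hypothetical marker key name
    value = store.get(key)
    if value is not None:
        return value

    deadline = store.get(marker)
    if deadline is None or clock() >= deadline:
        # No live marker: we are the first process, so announce it and compute.
        store[marker] = clock() + sla_ms / 1000
        try:
            value = compute()
            store[key] = value
            return value
        finally:
            store.pop(marker, None)     # clear the marker even on failure

    # A marker is live: idle and poll until the value appears or the SLA ends.
    while clock() < deadline:
        sleep(poll_ms / 1000)
        value = store.get(key)
        if value is not None:
            return value

    # The SLA elapsed without a value: protection ends and we compute anyway.
    value = compute()
    store[key] = value
    return value
```

The final branch is the limitation the quote points out: once `sla_ms` has passed, every waiting caller falls through and recomputes, so the stampede is delayed rather than prevented if the first process never finishes.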

tedivm commented 5 years ago

Stash allows developers to handle stampedes in a number of ways:

  1. Ignoring it and letting them happen.
  2. Identifying when values are close to expiring and recalculating before it occurs.
  3. Placing a lock and only returning old data until the lock is released or times out (so one process calculates the new data, and the rest return the old).
  4. Placing a lock and sleeping, periodically waking to check to see if the data is there (for obvious reasons this method is only recommended for CLI applications).

My recommendation is to either expose the options to developers so they can pick, or go with option number two and attempt to recalculate before the expiration time actually hits.

Also, don't underestimate how big of a deal this is. When one of my former clients (one of the top five adult sites in the US) switched to the precompute method, they were able to drop the number of servers their search service was using by 20%. The dogpile/stampede issues can be a huge performance loss.

ejunker commented 5 years ago

Just discovered that Symfony has cache stampede protection https://symfony.com/doc/current/components/cache.html and it looks like it now uses "Probabilistic early expiration" by default.

tedivm commented 5 years ago

They use probabilistic early expiration (option two above), but they also implement locking by default:

The first solution is to use locking: only allow one PHP process (on a per-host basis) to compute a specific key at a time. Locking is built-in by default, so you don't need to do anything beyond leveraging the Cache Contracts.

Their documentation doesn't make it clear how they respond to locks, though, and I imagine that with the precompute option it's not something they run into regularly anyway.
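For readers unfamiliar with probabilistic early expiration, here is a small Python sketch of the commonly described formulation: recompute when `now - delta * beta * ln(rand)` reaches the expiry time, where `delta` is how long the last computation took. This is not Symfony's code. The function names, the tuple layout in the store, and the default `beta=1.0` are assumptions made for illustration.

```python
import math
import random
import time


def should_recompute_early(now, expires_at, compute_seconds, beta=1.0,
                           rng=random.random):
    """Decide whether to refresh before the real expiry.

    The closer `now` is to `expires_at`, and the longer the value takes to
    compute, the more likely this returns True. Higher `beta` refreshes earlier.
    """
    r = 1.0 - rng()                      # uniform in (0, 1], so log(r) is defined
    return now - compute_seconds * beta * math.log(r) >= expires_at


def fetch(store, key, ttl, compute, beta=1.0, clock=time.monotonic,
          rng=random.random):
    """`store` maps key -> (value, compute_seconds, expires_at)."""
    entry = store.get(key)
    now = clock()
    if entry is None or should_recompute_early(now, entry[2], entry[1], beta, rng):
        start = clock()
        value = compute()
        finished = clock()
        # Remember how long the computation took for the next early-expiry check.
        store[key] = (value, finished - start, finished + ttl)
        return value
    return entry[0]
```

Because each request rolls its own random number, refreshes are spread out: usually a single request recomputes slightly early while the others keep reading the still-valid value.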

akalongman commented 5 years ago

They wrote a blog post about it: https://symfony.com/blog/new-in-symfony-4-2-cache-stampede-protection

hubertnnn commented 5 years ago

I wouldn't make this the default behavior, but rather an extra option, maybe as a proxy/decorator driver. Then you could have all of the options above as separate proxies.
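A rough sketch of what such a decorator could look like, written in Python for brevity. `DictStore` and `LockingStoreDecorator` are hypothetical names, not Laravel classes, and TTL handling is left out to keep the wrapping pattern visible. The underlying store stays unchanged; the decorator adds per-key locking around `remember()`.

```python
import threading


class DictStore:
    """Minimal underlying store with no stampede protection."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class LockingStoreDecorator:
    """Wraps any object with get/put and serialises recomputation per key."""

    def __init__(self, inner):
        self._inner = inner
        self._locks = {}
        self._guard = threading.Lock()

    def get(self, key):
        return self._inner.get(key)

    def put(self, key, value):
        self._inner.put(key, value)

    def remember(self, key, compute):
        value = self._inner.get(key)
        if value is not None:
            return value
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check inside the lock: an earlier holder may have filled the key.
            value = self._inner.get(key)
            if value is None:
                value = compute()
                self._inner.put(key, value)
            return value
```

Other strategies (serving stale data, early recomputation) could be written as further decorators around the same store, so an application opts in by choosing which wrapper to use.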

fgilio commented 4 years ago

Would love to have this built into the framework.