Closed rob006 closed 5 years ago
Hi @rob006,
Thank you for taking the time to open a pull request and contribute to this cache drop-in.
Very grateful that you have updated the syntax and removed some code smell too. This was written way back when PHP 5.3 compatibility had to be maintained and used the WP code style from back then. That will explain a lot of the spacing array syntax improvements.
While I understand why you have added a local array cache to speed up repeated requests to the same resource in the same page load, it does introduce a problem elsewhere, that is the reason why I originally wrote this cache adapter to not have a local array cache. I was using back then an APC plugin that had a local array cache and I was running into issues with large sites and memory utilization, which had a knock-on effect with how many concurrent FPM processes we could run without running out server memory.
I run a number of large WordPress installations with close to 1M user accounts in them. When introducing an array cache to a large website where you are reading in a fair amount of data from the cache that can increase the PHP process memory footprint by a fair few MB per process. As the data is stored in an array that cannot be GC'd it utilises that entire memory for the length of the page load. It is not shared across processes so that memory usage is propped up and taken down every page load. When running high traffic load those few extra MBs per page load cause issues.
Another problem with the local array cache in WP land is that it is easy to exhaust the memory limit of a single process. Take this scenario: I want to loop over my users and set some user meta based on some arbitrary metric. To do that I would write something like this:
foreach($users as $user) {
update_user_meta($user->ID, 'some_key', 'some_value');
}
Because WP is trying to optimise its calls to the DB when every setting some user meta it will keep a cache of the WP_User object and a full copy of that user's meta, so the next time you want to call upon that meta it doesn't have to go to the DB and can lookup the cache and mind the meta in the array. As part of updating meta it has to first fetch the meta and if it doesn't exist in a cache, put it there. On a large website, a user object and all of its meta keys data can reach 100KB of data. Not thats not a lot but let's say I loop over 100 users I now have 10MB of additional memory usage in PHP because of the local array cache. Loop over 10K users and now you see the problem. You've now just fatal errored halfway through script execution because of memory exhaustion.
Another problem with the local cache implementation is that it doesn't respect TTL. If I set something to the cache for 5 seconds there is nothing in the local cache that respects that. If you have a long-running script and at the start, something goes in for 5 seconds and then a minute later I read that data back out or need to update it then the local cache is stale. In the real world, this is very unlikely to happen but it has been an issue for me in the past so I have to address it here.
I would love to merge this is but in its current state I think these things need to be addressed. I think a solution would be to have a simple way to switch the local cache on/off. For the majority of users having it on by default would be fine. But for large installations where the local cache can become problematic a way to turn it off would be good.
Also, it would be interesting to see the performance difference between doing 1M fetches from PHP array vs apcu_fetch to see really how well a local cache would improve things. As APCu data is stored in memory it should be just as fast...
@l3rady Which version of PHP you're using? To be honest, these memory problems seem to be pretty exotic for me - usually CPU power is drained before any memory-related issues come up.
In my case, these changes reduced page load time over 30% - this means that I need 30% fewer PHP processes to handle the same load (which means it also needs 30% less memory to handle the same load). I haven't noticed any significant increase in memory consumption. It may be an additional 1-2 MB, but with 20 busy PHP workers, an additional 40 MB of memory would the least of my problems ;).
Also, local cache is default WordPress behavior, so every drawback of local cache already exists in WordPress core and it does not seem to create any significant issues. Iterating over 10k users does not sound like usual task...
As APCu data is stored in memory it should be just as fast...
AFAIK for APCu data needs to be copied from shared memory to process memory. This may be expensive if you're doing this 2k times. For local data (in process scope) PHP may use COW, which is much more efficient, especially with large objects/arrays.
@rob006 in production I am running PHP7.2 and 7.3. While these memory problems may be exotic for you they certainly weren't for me and are very much real issues which lead to me dropping the local cache.
The local cache only speeds things up when WordPress calls the same data multiple times in a single page load. While this does happen it only happens a 1000 times or so (Don't quote me on this number, I'd like to do a test to see what the local cache hit rate is to get a better number). This local cache totally makes sense when talking to a database as you remove the time it takes to go over the network stack to a remote database, run a query and then return the result. The savings can be massive. But doing an apcu_fetch doesn't have the same overhead that a database call does. Putting in a local cache will only optimise those repeated calls and will only get you so many gains.
While it may in your situation given you a 30% load time reduction in my testing of local caches a while back, as the repeated calls got proportionally smaller to the non-repeated calls due to the website size the page load benefits of a local cache was only 5% at the cost of significantly higher memory foot-print and the issues explained earlier with memory exhaustion when doing complex admin tasks. I decided that a local cache caused more problems than it solved.
Just out of curiosity I decided to test your local cache version. I took a copy of one of our larger installations and loaded it locally. Running PHP 7.2 with APCu and OpCache. The database is remote.
I used the following crude code in index.php and tested just the home page as a logged-out user.
$start = microtime(true);
define('WP_USE_THEMES', true);
require( dirname( __FILE__ ) . '/wp/wp-blog-header.php' );
$time_elapsed_secs = microtime(true) - $start;
$data = [
'peak_memory_usage' => memory_get_peak_usage(),
'time_elapsed' => $time_elapsed_secs
];
dd($data);
Between each cache version, I restarted PHP-FPM which would flush out opcache and apcu cache.
So my observations are that your cache version seems to be quicker on the first load. I suspect that is due to the improvements you did with loading data into APCu. From second and third loads your cache provides very similar load times to mine, with repeated tests on average your cache provides a 3.5% reduction in load time but at the cost of 3.5% extra memory usage. I wonder though how much of that load time improvement is from just the code optimisations you've made and that of the local array cache 🤔
In a lightweight installation of WordPress I'm sure your local cache version would be more beneficial to load times but In the installations I'm running the improvements are marginal with some serious consequences in certain situations.
I'd be happy to merge in your improvements if you can add a switch to turn off local cache for where it doesn't improve performance.
I'd be happy to merge in your improvements if you can add a switch to turn off local cache for where it doesn't improve performance.
I'll try to take a look at this, but I can't promise anything. I already fixed this in my installation and this is "case closed" for me, just want to share my work, since other people seems to have similar issues.
But this may be more tricky than it sounds - current implementation is optimized for local cache (for example apcu_fetch()
is used when only apc_exists()
is needed), improving this may require more work than simple if
.
Testing this looks like a good speed improvement. I guess an environment variable or something to disable the array lookup would allow for the edge case mentioned above, with it being enabled by default for everyone else.
Looks like you've reformatted the whole file, which makes it rather difficult to look through the diff. But, on initial look, it appears you check for if array_key_exists
. You could maybe just change these checks to if !disabled && array_key_exists
.
Or, alternatively, just don't write anything into the array, and leave the other checks as is. e.g.
private function _add($key, $var, $ttl) {
if (apcu_add($key, $var, max((int) $ttl, 0))) {
if (!disabled)
$this->local_cache[$key] = is_object($var) ? clone $var : $var;
And then just get disabled from an environment variable or something.
Looks like you've reformatted the whole file, which makes it rather difficult to look through the diff.
While under the files changed tab if you click the gear icon you can set github to ignore whitespace changes. This will make seeing the changes much easier.
@Dreamsorcerer Yes a simple check can be added like you say but would also need adding to _set
, _add
, _incr
, _decr
and _get
I'll give a go shortly to add these checks
fix #6