AOEpeople / Aoe_ClassPathCache

Class path cache for Magento autoloader
http://www.fabrizio-branca.de/magento-class-path-cache.html
Open Software License 3.0

Cannot clear cache behind load balancer #4

Open colinmollenhour opened 11 years ago

colinmollenhour commented 11 years ago

The current method of clearing the APC cache only works well for single-server setups. If the website is run on multiple nodes behind a load balancer then the current method cannot clear the cache on all nodes.

I suggest using a filesystem flag instead. E.g. when loading the APC cache, attempt to delete a file named var/classpathcache.flag, and load from the APC cache only if the delete fails. To prevent another process from storing stale data, overwrite the APC cache with a new one rather than deleting it immediately.

E.g.:

if ( ! @unlink(Mage::getBaseDir('var').DS.'classpathcache.flag')) {
  // load apc_fetch results
}

fbrnc commented 11 years ago

Hi Colin,

I appreciate your feedback. You're perfectly right, clearing the cache won't work if you have more than one server behind a load balancer. But depending on your setup, forcing Aoe_ClassPathCache to use the file system won't work either. A clean multi-server setup shouldn't rely on sharing any files in var/ between the nodes. In this case the same problem would still persist.

On the other hand I fully agree that there should be a way to force Aoe_ClassPathCache to use the file system and not to use apc automagically just because it's there.

In the solution you're suggesting you would manually create var/classpathcache.flag to enable storing the cache in the file system, right? I like the idea of still having APC the default, but controlling this by the fact that a non-existing file cannot be deleted while suppressing errors doesn't look very clean to me.

What do you think about controlling the cache storage using an environment variable (similar to developer mode) or a static method that can be called in index.php? (Yes, I hate modifying core files, but this might still be a better solution than suppressing errors...)

Have a good day,

Fabrizio

colinmollenhour commented 11 years ago

I did not intend that the var directory would be shared between app nodes. The intention behind my suggestion is that some external script would touch the file on each machine. E.g.:

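#!/bin/bash
# Touch the flag on every app node so that its next request refreshes the cache.
# The node list must be unquoted, otherwise the loop runs once with the whole string.
for node in app1 app2 app3; do
  ssh "$node" 'touch /var/www/magento/var/classpathcache.flag'
done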

Also, I don't have an issue with using APC. My suggested approach is to use APC for cache storage and the filesystem for notifying the next request that the cache needs to be cleared. Of course some people might prefer not to use APC at all, but considering the cache record is so small I don't see the harm in it. Perhaps you could just add a method that disables APC for those who are willing to modify the core files, while by default APC stays enabled and no core files are modified.
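
To make the flag semantics concrete, here is a minimal shell sketch of the consuming side. It assumes a hypothetical `consume_flag` helper and a temporary directory standing in for Magento's var/; the real check would live in the PHP autoloader, as in the unlink() snippet earlier in this thread. It only illustrates that a successful delete means "rebuild", a failed delete means "keep using the cache", and the flag is consumed by the first request that sees it.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) only if the flag existed and this call removed it.
consume_flag() {
  rm -- "$1" 2>/dev/null
}

# Decide what the next request should do, based on whether the flag could be deleted.
cache_action() {
  if consume_flag "$1"; then
    echo "rebuild"      # flag was present: overwrite the cached class map
  else
    echo "use-cache"    # no flag: keep using the existing cache
  fi
}

# Demo with a temporary directory standing in for var/ (an assumption for illustration).
var_dir=$(mktemp -d)
flag="$var_dir/classpathcache.flag"
touch "$flag"                      # what the deployment script does on each node
first=$(cache_action "$flag")      # first request after deployment
second=$(cache_action "$flag")     # any later request
echo "$first $second"
rm -rf "$var_dir"
```

Only the first call after the `touch` reports a rebuild, so later requests on the same node are not asked to refresh again.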

danielpoe commented 11 years ago

Files are only changed during deployment, aren't they? So clearing this cache can be part of the deployment process...

Cheers


colinmollenhour commented 11 years ago

Yes, the cache needs to be cleared only during deployment. Not sure how that relates to this ticket though..

fbrnc commented 11 years ago

Hi Colin,

I didn't realize that you wanted to create the flag so that the next request would clear the cache (and also delete the flag, right?). That sounds smart to me. Then we should actually add two flags (one for each environment) to indicate that the cache should be invalidated. Actually I like this approach much more, and I'm thinking of using it instead of the frontend controller and the wrapper method that sends a request to that controller, both of which could then be removed.

But coming back to your use case. You could also do something like this (again, with the frontend controller):

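#!/bin/bash
# Hit the secret URL on every frontend node directly by IP, passing the real hostname as Host header.
# The node list must be unquoted, otherwise the loop runs once with the whole string.
for node in app1_ip app2_ip app3_ip; do
  curl -H "Accept-Encoding: gzip, deflate" -H "Host: YOURACTUALHOSTNAME" -s -X GET -I "http://$node/SECRET_URL_GENERATED_BY_AOE_CLASSPATHCACHE"
done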

This is what we do to warm up multiple varnish servers behind a shared load balancer. Sending the original hostname as a separate header and then addressing the frontend server by IP works fine. Of course this solution requires the script to have direct access to the frontend nodes (which the deployment script usually has...)

colinmollenhour commented 11 years ago

Ahh, I didn't think of using curl with the Host header, that works very well, perhaps better than the filesystem flag.

One issue I noticed with the filesystem mode (haven't used APC mode yet) is the thrashing that occurs when the cache is cleared. I wonder if rather than clearing the cache it could just be revalidated? For example:

  1. Load the existing cache.
  2. Loop through every cached class name and re-discover the path. Update/remove as needed.
  3. Save updated cache (or do nothing if nothing was updated/removed).

The above would help both the filesystem and APC-backed methods equally.
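
The three steps above can be sketched roughly as follows. This is a shell illustration under stated assumptions, not the module's implementation: the cache is assumed to be a plain file with one `ClassName<TAB>path` entry per line, and `discover_path` and `revalidate_cache` are hypothetical helpers that map `Foo_Bar` to `Foo/Bar.php` and probe a list of include roots in order.

```shell
#!/bin/bash
# Hypothetical cache format: one "ClassName<TAB>path" entry per line.

# Re-discover the path for a class by probing each include root in order.
# Prints the first match; returns non-zero if the class file no longer exists.
discover_path() {
  local class=$1; shift
  local rel="${class//_//}.php"   # Foo_Bar_Baz -> Foo/Bar/Baz.php
  local root
  for root in "$@"; do
    if [ -f "$root/$rel" ]; then
      printf '%s\n' "$root/$rel"
      return 0
    fi
  done
  return 1
}

# Revalidate cache_file in place against the given include roots.
# The file is only rewritten when at least one entry was updated or removed.
revalidate_cache() {
  local cache_file=$1; shift
  local tmp changed=0 class old new
  # Temp file sits next to the cache so the final mv is a same-filesystem rename.
  tmp=$(mktemp "$cache_file.XXXXXX") || return 1
  while IFS=$'\t' read -r class old || [ -n "$class" ]; do
    [ -n "$class" ] || continue
    if new=$(discover_path "$class" "$@"); then
      [ "$new" = "$old" ] || changed=1          # path moved: update the entry
      printf '%s\t%s\n' "$class" "$new" >> "$tmp"
    else
      changed=1                                  # class file is gone: drop the entry
    fi
  done < "$cache_file"
  if [ "$changed" -eq 1 ]; then
    mv "$tmp" "$cache_file"
  else
    rm -f "$tmp"                                 # nothing changed: leave the cache untouched
  fi
}

# Demo in a throwaway tree (all paths here are illustrative assumptions).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/local/Foo" "$work/core/Foo"
touch "$work/core/Foo/Bar.php" "$work/local/Foo/Baz.php"
cache="$work/classpath.cache"
# Foo_Bar is still valid, Foo_Baz moved from core to local, Foo_Gone was deleted.
printf 'Foo_Bar\t%s\nFoo_Baz\t%s\nFoo_Gone\t%s\n' \
  "$work/core/Foo/Bar.php" "$work/core/Foo/Baz.php" "$work/core/Foo/Gone.php" > "$cache"
revalidate_cache "$cache" "$work/local" "$work/core"
entries=$(grep -c '' "$cache")
cat "$cache"
```

In the demo, the unchanged entry is kept, the moved class is pointed at its new location, and the deleted class is dropped, so the cache is never empty in between.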

colinmollenhour commented 11 years ago

Just submitted a patch that does the revalidation as described above. (1dad315)

rgoytacaz commented 11 years ago

If using APC, forcing an APC cache clear on deployment would also do the trick right?

colinmollenhour commented 11 years ago

The idea is to prevent the thrashing that occurs when you clear the cache. Process 1 loads a frontend page, process 2 loads a backend page, process 3 serves an API request, process 4 loads a checkout page, and so on, and they all overwrite each other's records until, after many overwrites, you finally end up with a stable cache record that has all of the class names in it. If you delete this record, it starts all over again, so simply revalidating it is a much friendlier approach, especially when using the filesystem. Once you reach a certain level of traffic, clearing every last cache entry is no longer something you can do on a whim.

mklooss commented 10 years ago

I built a simple script to revalidate the cache. Open it on every server:

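<?php

// Only allow requests from the backend host ('backendIP' is a placeholder)
if($_SERVER['REMOTE_ADDR'] !== 'backendIP')
{
    echo "no lookie lookie here";
    exit;
}

require_once 'app/Mage.php';
umask(0);
Mage::app("admin");
// Revalidate the class path cache on this node
Mage::helper('aoe_classpathcache')->revalidateCache();

LeeSaferite commented 9 years ago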

@fbrnc Should this still be open?