pressflow / 6

Each version of Pressflow is API-compatible with the same major Drupal version. For example, Pressflow 6 is compatible with all Drupal 6 modules. Pressflow 6 also integrates the SimpleTest system from Drupal 7 and the CDN support patch.
http://pressflow.org/
GNU General Public License v2.0
234 stars 89 forks source link

Remove recursive locking from variable caching #52

Open mike-potter opened 12 years ago

mike-potter commented 12 years ago

The current Pressflow6 has a recursive call to variables_init trying to obtain a lock. It has a bailout at 50. But when high burst traffic is received by a site, this recursion causes very long response times on sites that do a lot of variable setting. Each recursive call has a 30sec timeout. Adding a watchdog to our site shows recursion levels >30 deep during some bot bursts. Since this is called at shutdown, it causes many worker threads to stay busy and ultimately causing 503 varnish timeout errors on a site.

The attempt of this patch is to add some ideas from the following Drupal.org threads: http://drupal.org/node/973436 , http://drupal.org/node/1595070. These threads have interesting discussion of the problem, but tend to get sidetracked into D8 implementation. Because this is a very serious problem for a couple of our sites, I wanted to bring a more practical solution back to the table for Pressflow users. Porting this patch to Drupal7 should be straight-forward.

My approach is to completely remove the lock recursion. After a variable_set or variable_del, a static variable is marked, and at shutdown it will attempt to acquire a lock to write-thru the Drupal cache. However, if the lock cannot be easily acquired, it simply clears the cache. This potentially adds delay to the next request where variable_init will need to rebuild the cache, but that response time is better capped them any sort of recursion lock delay.

I've done jmeter benchmarks on a site with and without this patch and on normal sites I don't see any noticeable differences. However if I construct a test page that does a lot of variable_sets, I find the performance of that page to be significantly better with this patch (factor of 2 speed boost) and I no longer get any busy/locked worker threads and no longer experience the very long response times that led to 503 errors.

mike-potter commented 12 years ago

Note that this also addresses the issue in the pull request: https://github.com/pressflow/6/pull/37

(Also, sorry for all the whitespace corrections...My editor automatically removes extra whitespace)