mchung / heroku-buildpack-wordpress

Heroku buildpack: Wordpress on Heroku
mchung.github.com/heroku-buildpack-wordpress/
388 stars, 333 forks

Problem with cache and urls with query strings #13

Closed: luisherranz closed this 11 years ago

luisherranz commented 11 years ago

Hi mchung,

I'm doing some tests and it seems to be caching URLs that it shouldn't.

If you load http://www.mydomain.com/?a=1 and then turn off the database, you can still load http://www.mydomain.com/?a=1, http://www.mydomain.com/?a=2, http://www.mydomain.com/?a=3 and so on. That shouldn't happen, because URLs with query strings aren't supposed to be cached.

Can you confirm it?

Best, Luis.

mchung commented 11 years ago

Hi Luis-

Can you explain what's supposed to happen?

luisherranz commented 11 years ago

Sure,

  1. To know if a page is cached or not, I visit it and then turn off the database. If it loads properly, it's cached. If I get an "Error establishing connection with the database" it's not.
  2. Look at these lines from the original wordpress.conf.erb:

         # POST requests and urls with a query string should always go to PHP
         if ($request_method = POST) {
           set $cache_uri 'no cache';
         }
         if ($query_string != "") {
           set $cache_uri 'no cache';
         }

The second if means that nginx shouldn't cache URLs with query strings. This is the default WordPress behavior when you use cache plugins like WP Super Cache or W3 Total Cache.

This is useful when your WordPress site uses query parameters to drive PHP logic, such as an affiliate system that identifies each affiliate with a tag: ?aff_id=X.

If you don't need this, you can always comment out those lines, but right now they aren't working.
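For reference, the intent of those two if blocks can be sketched in Python. This is only a model of the rule, not the buildpack's code. It assumes $cache_uri starts out as the request path (the line that initializes it is not quoted in this thread), and cache_uri is a hypothetical helper name.

```python
from urllib.parse import urlsplit

NO_CACHE = "no cache"


def cache_uri(method: str, url: str) -> str:
    """Return 'no cache' when the quoted nginx rules would bypass the cache.

    Assumption: otherwise the cache key is simply the request path.
    """
    parts = urlsplit(url)
    # Mirrors: if ($request_method = POST) { set $cache_uri 'no cache'; }
    if method == "POST":
        return NO_CACHE
    # Mirrors: if ($query_string != "") { set $cache_uri 'no cache'; }
    if parts.query != "":
        return NO_CACHE
    return parts.path or "/"
```

Under this model, a GET for http://www.mydomain.com/?a=1 maps to 'no cache', while a GET for http://www.mydomain.com/ maps to "/".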

luisherranz commented 11 years ago

Ok, there's a simpler method to check this:

Load www.mydomain.com and look at the source. In the header you should see something like:

    <!-- generated in 1.398 seconds 59036 bytes batcached for 300 seconds -->

If you reload it, you should see something like this:

    <!-- generated 28 seconds ago generated in 1.398 seconds served from batcache in 0.020 seconds expires in 272 seconds -->

Now, if you load something with a query string, like www.mydomain.com/?something=1, it shouldn't generate a cache entry, but it does. You still get:

    <!-- generated in 1.279 seconds 59078 bytes batcached for 300 seconds -->

And if you reload:

    <!-- generated 23 seconds ago generated in 1.279 seconds served from batcache in 0.017 seconds expires in 277 seconds -->

For some reason, it is caching urls with query strings.
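This manual check can also be scripted. Here is a minimal Python sketch, assuming the page source has already been fetched into a string and that the marker phrases match the HTML comments quoted above. batcache_status is a hypothetical helper, not part of Batcache.

```python
import re

# Marker phrases copied from the HTML comments quoted in this thread.
_SERVED = re.compile(r"served from batcache in ([\d.]+) seconds")
_EXPIRES = re.compile(r"expires in (\d+) seconds")
_STORED = re.compile(r"batcached for (\d+) seconds")


def batcache_status(html: str) -> dict:
    """Classify a page as a cache hit, a freshly stored page, or uncached."""
    if _SERVED.search(html):
        expires = _EXPIRES.search(html)
        return {
            "state": "hit",
            "expires_in": int(expires.group(1)) if expires else None,
        }
    stored = _STORED.search(html)
    if stored:
        return {"state": "stored", "ttl": int(stored.group(1))}
    # No batcache marker at all: the page was generated without caching.
    return {"state": "none"}
```

Running it on the source of www.mydomain.com/?something=1 twice in a row would show the problem described here: "stored" on the first load and "hit" on the reload, where "none" is expected both times.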

By default, batcache shouldn't do that. This is from the batcache documentation:

Exemptions

Note that URLs with query strings are automatically exempt from Batcache. This can be undesirable in many cases as popular pages linked to with query strings can significantly reduce the effectiveness of our caching setup and can affect the overall performance of your site.

And with the current wordpress.conf.erb configuration it shouldn't either:

    # POST requests and urls with a query string should always go to PHP
    if ($request_method = POST) {
      set $cache_uri 'no cache';
    }
    if ($query_string != "") {
      set $cache_uri 'no cache';
    }

I've been running tests and changing things, but I can't seem to find the problem.

luisherranz commented 11 years ago

I have removed these lines:

    # Don't use the cache for logged in users or recent commenters
    if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp\-postpass|wordpress_logged_in") {
      set $cache_uri 'no cache';
    }

but even with that, it doesn't use the cache for logged-in users, so this is not working at all.

I think I am completely lost here.
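To see which Cookie headers the removed rule would have matched, the pattern can be tried outside nginx. This is a rough Python sketch: it assumes Python's re module treats this particular pattern the same way nginx's PCRE does, and the cookie values in the usage note are made up for illustration.

```python
import re

# Pattern copied from the nginx rule; nginx's ~* operator is a
# case-insensitive regex match, hence re.IGNORECASE here.
BYPASS_COOKIES = re.compile(
    r"comment_author|wordpress_[a-f0-9]+|wp\-postpass|wordpress_logged_in",
    re.IGNORECASE,
)


def bypasses_cache(cookie_header: str) -> bool:
    """Return True if the Cookie header would set $cache_uri to 'no cache'."""
    return BYPASS_COOKIES.search(cookie_header) is not None
```

For example, a header such as "wordpress_logged_in_abc123=luis" matches, while an empty header or one carrying only an unrelated cookie does not. If logged-in pages still skip the cache after the rule is removed, the bypass is presumably happening somewhere other than this nginx rule.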

luisherranz commented 11 years ago

Ok, so the cache-related lines in wordpress.conf.erb don't have any effect.

I've finally made it work by editing the $args nginx variable and advanced-cache.php. Maybe it's not the best way to deal with this, but it is working fine so far.

I will try to explain it here:

mchung commented 11 years ago

Luis, can you visit the page from a new browser session, say incognito mode (never logged in, no cookies, etc.)? In my early testing, I also came across this weird behavior where the caching only applied to visitors. All the caching would be skipped when the user was logged in to the admin site.

luisherranz commented 11 years ago

mchung, it's working fine here: it never caches when I'm logged in.

Can you look at the page source? If the page is cached you should see something like this:

    <!-- generated 4 seconds ago generated in 1.358 seconds served from batcache in 0.012 seconds expires in 296 seconds -->

mchung commented 11 years ago

Luis- What did you want me to look into?

luisherranz commented 11 years ago

I think I misread your last post, sorry. Everything is working fine here.