apache / incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.
http://modpagespeed.com
Apache License 2.0

LoadFromFile + memcached -> meta-data cache timestamp problems #488

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Turn on ModPagespeedLoadFromFile.
2. Enable memcached (or NFS for cache directory, or anything else that will 
share the meta-data cache between servers).
3. Push a file out to both servers with non-identical timestamp.

What is the expected output? What do you see instead?

mod_pagespeed should serve rewritten versions from both servers continuously, 
checking timestamp each time and realizing that it is still valid.

Instead, it will flap back and forth, with each server noticing that the 
timestamp changed and then clobbering the old meta-data cache entry.
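
The flapping can be reproduced with a toy model. This is only a sketch: `shared_cache`, `serve`, and the server names are hypothetical stand-ins for memcached and two web servers, not mod_pagespeed code, and it assumes the validity check is an exact mtime comparison as described above.

```python
# Hypothetical model of the reported behaviour; not mod_pagespeed code.
shared_cache = {}  # stands in for memcached: url -> (mtime, rewritten body)


def serve(url, local_mtime):
    """Return True if this request forced a rewrite."""
    entry = shared_cache.get(url)
    if entry is not None and entry[0] == local_mtime:
        return False  # exact mtime match: the cached rewrite is reused
    # Mismatch (or miss): rewrite and clobber the shared entry with our mtime.
    shared_cache[url] = (local_mtime, "rewritten")
    return True


# Same file contents on both servers, but the copies landed at different times.
mtimes = {"server_a": 1000, "server_b": 1007}

rewrites = 0
for i in range(6):
    server = "server_a" if i % 2 == 0 else "server_b"
    if serve("/style.css", mtimes[server]):
        rewrites += 1

print(rewrites)
```

With requests alternating between the two servers, every request finds the other server's mtime in the shared entry, so every request rewrites.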

Original issue reported on code.google.com by sligocki@google.com on 29 Aug 2012 at 1:25

GoogleCodeExporter commented 9 years ago
I discussed with Josh and Maks offline and we came up with several possible 
solutions to this:

1) Stop checking for exact mtime match, instead check that the file is not 
newer than the cached mtime. Thus if we had N servers, we would rewrite the 
file at most N times and on average N/2 times. Downside: If you have serious 
server time skew or try to revert a file to an older timestamp, we may serve 
stale content.

2) Configure a range around the stored mtime that should be considered equal, 
say +/- 1 minute. This would account for slight differences from clock skew or 
replication time. Thus the resource would ideally be rewritten only once. 
Downside: You'd have to configure this to fit your situation, how long your 
replication takes / how bad your clock skew is. Otherwise your system would 
thrash just like now.

3) Store list of (machine_identifier, mtime) pairs in cache so that the mtimes 
are only interpreted by the machine that wrote them. The resource would be 
rewritten by each server (we could probably optimize this to be rewritten once). 
Downside: We need to store a bunch more server-specific info in the cache, 
cache update could have weird race-conditions, especially under high load that 
would lead to the resource being rewritten any number of times.

4) Stop storing mtime in the cache. Instead store input hash in the cache (or 
something similarly identifying) and store the mtime in a separate local server 
cache. This would keep the speed of checking a file to a single stat of that 
file and a local meta-data cache lookup, while also allowing us to share the 
results of a rewrite between servers and we wouldn't need to store a bunch of 
server specific stuff in memcached. Downside: We need to add a separate local 
server cache.

5) Disallow memcached and LoadFromFile together. Just don't let people do this. 
Downside: We are advocating the use of both memcached and LoadFromFile, 
annoying that you couldn't use them together.
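
To make the difference between the current behaviour and options 1 and 2 concrete, here is a rough sketch of the three validity checks plus a round-robin rewrite counter. The function names, the 60 second window, and the sample mtimes are all made up for illustration; none of this is taken from the mod_pagespeed source.

```python
def valid_exact(cached_mtime, local_mtime):
    # Current behaviour as described in the report: any difference invalidates.
    return local_mtime == cached_mtime


def valid_not_newer(cached_mtime, local_mtime):
    # Option 1: only a file newer than the cached mtime invalidates the entry.
    return local_mtime <= cached_mtime


def valid_within_window(cached_mtime, local_mtime, window_s=60):
    # Option 2: mtimes within +/- window_s seconds are treated as equal.
    return abs(local_mtime - cached_mtime) <= window_s


def count_rewrites(is_valid, server_mtimes, rounds=3):
    """Round-robin requests across servers; count how often a rewrite happens."""
    cached_mtime = None
    rewrites = 0
    for _ in range(rounds):
        for local_mtime in server_mtimes:
            if cached_mtime is None or not is_valid(cached_mtime, local_mtime):
                cached_mtime = local_mtime  # rewrite and store our own mtime
                rewrites += 1
    return rewrites


# Three servers holding identical contents with slightly different mtimes.
server_mtimes = [1000, 1007, 1003]
exact = count_rewrites(valid_exact, server_mtimes)
not_newer = count_rewrites(valid_not_newer, server_mtimes)
windowed = count_rewrites(valid_within_window, server_mtimes)
print(exact, not_newer, windowed)
```

In this toy run the exact check rewrites on every request, option 1 stops rewriting once the newest mtime has been stored, and option 2 rewrites once. If one server's mtime falls outside the window (for example a copy made an hour later), option 2 goes back to rewriting on every request, which is the thrash described in its downside.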

Original comment by sligocki@google.com on 29 Aug 2012 at 2:13

GoogleCodeExporter commented 9 years ago
As a point of user feedback, I personally would rather deal with the 
consequences of #1 or #2 than #5. At least those two are something I can 
control and mitigate on our end as long as I know what's going on. 
Additionally, as more servers are added to the pool, I think it becomes more 
reasonable to expect (and document) the need for files and server times to be 
in sync across the pool.

#3 and #4 sound great too of course, but it also sounds like significantly more 
work on your end.

Original comment by amat...@gmail.com on 29 Aug 2012 at 4:27

GoogleCodeExporter commented 9 years ago
A note on my particular use case

- servers are all on NTP UTC time

- standard deployment process distributes a zip file containing site files 
across current active servers, which each independently unpack the files. the 
contents are the same but mtime will have some spread +/- 1 minute

- using autoscaling on amazon which means that servers will come and go over 
the course of hours and days. when they initialize they retrieve the zip 
package with the files for the site. contents will be the same as on the other 
servers but will have a much newer mtime.

for my use case,
1) would solve the initial deployment and work for new servers, but we'd lose 
the big advantage of memcache because new servers need to rewrite everything 
when they come online
2) would solve the initial deployment, but would still have a thrash situation 
when new servers come on
3) 'weird race conditions' sounds unpleasant
4) would be ideal, but sounds like a lot of work
5) would be the gordian knot approach, but would be a drag

Original comment by jon.mars...@englishcentral.com on 29 Aug 2012 at 4:59

GoogleCodeExporter commented 9 years ago
What are you using to deploy the .zip file? zip/unzip should normally preserve 
timestamps (with one second accuracy, which isn't good enough for current code, 
but is workable). 

Original comment by morlov...@google.com on 29 Aug 2012 at 5:12

GoogleCodeExporter commented 9 years ago
We're using a fairly esoteric format called ZPK which is used by Zend Server, 
an enhanced and proprietary Linux-Apache-PHP-MySQL server.
http://www.zend.com/en/products/server/getting-started/application-deployment

It's essentially a zip file with some special manifest contents. The guts of 
the deployment process appear to unpack the zip to a /tmp folder, then copy the 
unpacked files to the deployment location. The mtimes I have appear to match 
when the actual copy happened on the web server.

Hash contents should be identical, but mtimes vary.

Original comment by jon.mars...@englishcentral.com on 29 Aug 2012 at 5:24

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 4 Sep 2012 at 6:33

GoogleCodeExporter commented 9 years ago
Another possible hybrid solution:

6) Store an mtime and a content hash in the cache.  If mtime is newer than 
ours, use optimized data.  If ours is newer, compute the content hash and 
either update the mtime, or re-optimize if the hash doesn't match.  Haven't 
thought about how this interacts with locking etc. though as we usually lock 
the resource for re-optimization before fetching it.  (Actually, I'm not sure 
if we do cross-machine locking for memcached deployments, or just have every 
cache miss potentially result in a local re-optimization; I think we may 
presently do the latter, which might increase startup load but save a vast 
amount of code complexity / fault tolerance.)
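
A minimal sketch of the decision in option 6, ignoring locking entirely. The entry layout (`mtime`, `hash`), the action names, and the use of SHA-256 are assumptions made for the example, not what mod_pagespeed stores.

```python
import hashlib


def content_hash(data):
    # SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).hexdigest()


def check_entry(entry, local_mtime, read_file):
    """Return (action, entry); action is 'use', 'refresh_mtime' or 'reoptimize'.

    entry is the shared cache record ({'mtime': ..., 'hash': ...}) or None.
    read_file is a callable returning the file's bytes; it is only invoked
    when the local file is newer than the cached mtime.
    """
    if entry is not None and entry["mtime"] >= local_mtime:
        return "use", entry  # cached mtime is at least as new as ours
    local_hash = content_hash(read_file())
    updated = {"mtime": local_mtime, "hash": local_hash}
    if entry is not None and entry["hash"] == local_hash:
        return "refresh_mtime", updated  # same contents, just bump the mtime
    return "reoptimize", updated  # contents really changed (or no entry yet)


action, entry = check_entry(None, 1000, lambda: b"body { color: red }")
print(action)
```

The file is hashed only when the local copy looks newer, so the common case stays at a single stat plus a cache lookup.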

Original comment by jmaes...@google.com on 4 Sep 2012 at 8:08

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 5 Sep 2012 at 1:31

GoogleCodeExporter commented 9 years ago

Original comment by jmara...@google.com on 6 Sep 2012 at 7:55

GoogleCodeExporter commented 9 years ago
The fix is in:
You can specify a server-private "filesystem metadata cache" that stores the 
server's timestamp rather than it being stored in the metadata cache.
Each time the server wants the file, it stat()s it to get its current mtime and 
checks that against the value stored in its filesystem metadata cache.
If they're the same, it gets its idea of the content hash from the filesystem 
metadata cache, then compares that to the content hash in the metadata cache.
If they're the same, then the metadata cache entry for the file is current and 
can be used, as can the rewritten contents in the HTTP cache.
If either of the values is different or missing, the file is re-read, its 
content hash is recomputed, the filesystem metadata cache is updated with the 
new values, and the checks are performed again.
If after all this the 2 content hashes are still different/missing, the 
contents of the metadata cache and HTTP cache are out-of-date and a rewrite is 
initiated (which ultimately results in the metadata and HTTP caches being 
updated with the latest contents).

If a different server with different contents for the file then goes through 
these steps, the "new" values will be overwritten again by the "old" values, 
but this is unavoidable and will stop once the file contents are the same.
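
As a rough illustration of the flow above, here is a sketch with plain dicts standing in for the per-server filesystem metadata cache (`fsmdc`) and the shared metadata cache (`mdc`). The function names, the cache layouts, and the SHA-256 hash are assumptions for the example and do not reflect the actual implementation.

```python
import hashlib


def content_hash(data):
    # SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).hexdigest()


def is_rewrite_current(path, mtime, read_file, fsmdc, mdc):
    """Return True if the shared metadata cache entry for path is still usable.

    fsmdc: server-private dict, path -> (mtime, content hash)
    mdc:   shared dict, path -> content hash of the input that was rewritten
    """
    local = fsmdc.get(path)
    shared_hash = mdc.get(path)
    # Fast path: one stat (the mtime argument) plus two cache lookups.
    if local is not None and local[0] == mtime and local[1] == shared_hash:
        return True
    # Something differs or is missing: re-read, re-hash, update the FSMDC,
    # then perform the hash comparison again.
    new_hash = content_hash(read_file(path))
    fsmdc[path] = (mtime, new_hash)
    return new_hash == shared_hash


def serve(path, mtime, read_file, fsmdc, mdc):
    if is_rewrite_current(path, mtime, read_file, fsmdc, mdc):
        return "cached"
    mdc[path] = fsmdc[path][1]  # stands in for the rewrite updating the caches
    return "rewrote"


files = {"/style.css": b"body { color: red }"}
mdc = {}                     # shared between servers
fsmdc_a, fsmdc_b = {}, {}    # one private cache per server
results = [
    serve("/style.css", 1000, files.get, fsmdc_a, mdc),  # server A
    serve("/style.css", 1007, files.get, fsmdc_b, mdc),  # server B
    serve("/style.css", 1000, files.get, fsmdc_a, mdc),  # server A again
]
print(results)
```

Because the mtime is only ever compared against the server's own record, the differing mtimes on the two servers no longer invalidate the shared entry; only a change in content hash does.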

Full documentation is coming but the new directive is:
  ModPagespeedFilesystemMetadataCache value
where value is either a memcached server on localhost, or the literal value 
'memcached' which will use the first memcached server in 
ModPagespeedMemcachedServers on localhost.

For example:
  ModPagespeedMemcachedServers memcachedserver1:6765,memcachedserver2:6765
  ModPagespeedFilesystemMetadataCache localhost:6765
or
  ModPagespeedMemcachedServers memcachedserver1:6765,localhost:6765
  ModPagespeedFilesystemMetadataCache memcached

This fix does NOT currently handle the case where your metadata cache is on a 
shared disk (NFS/NAS), but I believe there's no technical reason it won't work, 
it just hasn't been tested yet.

Original comment by matterb...@google.com on 25 Oct 2012 at 2:37

GoogleCodeExporter commented 9 years ago
Correction: the memcached value can only be used if ALL the servers in 
ModPagespeedMemcachedServers are on localhost.

HOWEVER, DO NOT USE THIS VALUE (memcached) as it will be removed shortly because 
it actually results in the original broken behavior (I think).

Original comment by matterb...@google.com on 25 Oct 2012 at 5:45

GoogleCodeExporter commented 9 years ago

Original comment by j...@google.com on 26 Oct 2012 at 3:02

GoogleCodeExporter commented 9 years ago
Yet another update, hopefully good news.

I recently submitted a mod to this change that removed the new directive and 
automatically configures the filesystem metadata cache if/as required.

This is possible because we now reuse the metadata cache (MDC) for the 
filesystem metadata cache (FSMDC) but we prefix entries in the FSMDC with the 
server's hostname, meaning that multiple servers can share the cache without 
stomping on each other. Since the FSMDC is only needed when using memcached as 
the MDC, we just reuse it for the FSMDC.
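
To illustrate the hostname prefix, here is a small sketch in which a single dict plays the role of the shared memcached instance. The `fsmdc/` key layout and the `fsmdc_key` helper are invented for the example; the real key format is not shown in this thread.

```python
import socket


def fsmdc_key(path, hostname=None):
    # Hypothetical key layout: the real format used by mod_pagespeed may differ.
    host = hostname if hostname is not None else socket.gethostname()
    return "fsmdc/" + host + "/" + path


shared = {}  # stands in for the one memcached instance used for MDC and FSMDC
shared[fsmdc_key("/var/www/style.css", "web1")] = (1000, "abc123")
shared[fsmdc_key("/var/www/style.css", "web2")] = (1007, "abc123")
print(sorted(shared))
```

Each server reads and writes only the keys carrying its own hostname, so the two mtimes coexist in the same cache without overwriting each other.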

One last note: the original bug report said this:
"(or NFS for cache directory, or anything else that will share the meta-data 
cache between servers)"
We do not support (and never have supported) a shared filesystem for the metadata cache 
and this change will NOT handle that use case. We -could- re-introduce the 
directive to enable a FSMDC even when not using memcached but currently we have 
no intention of doing so.

Original comment by matterb...@google.com on 30 Oct 2012 at 12:03