I discussed this offline with Josh and Maks, and we came up with several
possible solutions:
1) Stop checking for an exact mtime match; instead, check that the file is not
newer than the cached mtime. With N servers we would then rewrite the file at
most N times and on average N/2 times. Downside: if you have serious server
clock skew, or try to revert a file to an older timestamp, we may serve stale
content.
2) Configure a range around the stored mtime within which times are considered
equal, say +/- 1 minute, to account for slight differences from clock skew or
replication time (see the sketch after this list). Ideally the resource would
then be rewritten only once. Downside: you'd have to tune this to your
situation (how long your replication takes, how bad your clock skew is);
otherwise your system would thrash just like it does now.
3) Store a list of (machine_identifier, mtime) pairs in the cache so that
mtimes are only interpreted by the machine that wrote them. The resource would
be rewritten by each server (we could probably optimize this down to a single
rewrite). Downside: we'd need to store a lot more server-specific info in the
cache, and cache updates could have weird race conditions, especially under
high load, that would lead to the resource being rewritten any number of times.
4) Stop storing mtime in the cache. Instead store the input hash in the cache
(or something similarly identifying) and store the mtime in a separate local
per-server cache. Checking a file would still cost only a single stat() of that
file plus a local metadata cache lookup, we could still share the results of a
rewrite between servers, and we wouldn't need to store a bunch of
server-specific data in memcached. Downside: we need to add a separate local
server cache.
5) Disallow memcached and LoadFromFile together; just don't let people do this.
Downside: we advocate the use of both memcached and LoadFromFile, so it would
be annoying if you couldn't use them together.
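As a rough illustration of option 2, the tolerance check might look like the
following sketch (hypothetical helper names, not actual mod_pagespeed code):

import os

# Option 2 sketch: treat mtimes within a tolerance window as equal, so clock
# skew or replication delay between servers does not force a rewrite.
MTIME_TOLERANCE_SECONDS = 60  # e.g. +/- 1 minute; would need to be configurable

def needs_rewrite(path, cached_mtime):
    # Returns True if the file should be re-hashed and rewritten.
    current_mtime = os.stat(path).st_mtime
    return abs(current_mtime - cached_mtime) > MTIME_TOLERANCE_SECONDS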
Original comment by sligocki@google.com
on 29 Aug 2012 at 2:13
As a point of user feedback, I personally would rather deal with the
consequences of #1 or #2 than #5. At least those two are something I can
control and mitigate on our end as long as I know what's going on.
Additionally, as more servers are added to the pool, I think it becomes more
reasonable to expect (and document) the need for files and server times to be
in sync across the pool.
#3 and #4 sound great too, of course, but they also sound like significantly
more work on your end.
Original comment by amat...@gmail.com
on 29 Aug 2012 at 4:27
A note on my particular use case:
- servers are all on NTP UTC time
- our standard deployment process distributes a zip file containing the site
files across the currently active servers, which each independently unpack the
files. The contents are the same, but the mtimes will have some spread of +/- 1
minute
- we use autoscaling on Amazon, which means that servers come and go over the
course of hours and days. When they initialize they retrieve the zip package
with the files for the site. The contents will be the same as on the other
servers but will have a much newer mtime.
For my use case:
1) would solve the initial deployment and work for new servers, but we'd lose
the big advantage of memcache because new servers need to rewrite everything
when they come online
2) would solve the initial deployment, but would still have a thrash situation
when new servers come on
3) 'weird race conditions' sounds unpleasant
4) would be ideal, but sounds like a lot of work
5) would be the Gordian-knot approach, but would be a drag
Original comment by jon.mars...@englishcentral.com
on 29 Aug 2012 at 4:59
What are you using to deploy the .zip file? zip/unzip should normally preserve
timestamps (with one-second accuracy, which isn't good enough for the current
code, but is workable).
Original comment by morlov...@google.com
on 29 Aug 2012 at 5:12
We're using a fairly esoteric format called ZPK, which is used by Zend Server,
an enhanced, proprietary Linux-Apache-PHP-MySQL server.
http://www.zend.com/en/products/server/getting-started/application-deployment
It's essentially a zip file with some special manifest contents. The guts of
the deployment process appear to unpack the zip to a /tmp folder, then copy the
unpacked files to the deployment location. The mtimes I see appear to match the
time when the actual copy happened on the web server.
The file contents (and hence their hashes) should be identical, but the mtimes
vary.
Original comment by jon.mars...@englishcentral.com
on 29 Aug 2012 at 5:24
Original comment by jmara...@google.com
on 4 Sep 2012 at 6:33
Another possible hybrid solution:
6) Store an mtime and a content hash in the cache. If the cached mtime is newer
than ours, use the optimized data. If ours is newer, compute the content hash
and either update the cached mtime (if the hash matches) or re-optimize (if it
doesn't). I haven't thought about how this interacts with locking etc., though,
as we usually lock the resource for re-optimization before fetching it.
(Actually, I'm not sure if we do cross-machine locking for memcached
deployments, or just have every cache miss potentially result in a local
re-optimization; I think we presently do the latter, which might increase
startup load but save a vast amount of code complexity / fault-tolerance work.)
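A rough sketch of how option 6's check could work (the cache layout and helper
names here are hypothetical, not actual mod_pagespeed code):

import hashlib
import os

def check_resource(path, cache_entry):
    # cache_entry holds the shared cache's 'mtime' and 'content_hash' fields.
    our_mtime = os.stat(path).st_mtime
    if our_mtime <= cache_entry['mtime']:
        return 'use_optimized'            # cached rewrite is at least as new as our file
    with open(path, 'rb') as f:
        our_hash = hashlib.md5(f.read()).hexdigest()
    if our_hash == cache_entry['content_hash']:
        cache_entry['mtime'] = our_mtime  # same content, just a newer local timestamp
        return 'use_optimized'
    return 'reoptimize'                   # content actually changed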
Original comment by jmaes...@google.com
on 4 Sep 2012 at 8:08
Original comment by jmara...@google.com
on 5 Sep 2012 at 1:31
Original comment by jmara...@google.com
on 6 Sep 2012 at 7:55
The fix is in:
You can specify a server-private "filesystem metadata cache" that stores the
server's file timestamps rather than storing them in the shared metadata cache.
Each time the server wants the file, it stat()s it to get its current mtime and
checks that against the value stored in its filesystem metadata cache.
If they're the same, it gets its idea of the content hash from the filesystem
metadata cache and compares that to the content hash in the metadata cache.
If they're the same, then the metadata cache entry for the file is current and
can be used, as can the rewritten contents in the HTTP cache.
If either value is different or missing, the file is re-read, its content hash
is recomputed, the filesystem metadata cache is updated with the new values,
and the checks are performed again.
If, after all this, the two content hashes are still different or missing, the
contents of the metadata cache and HTTP cache are out of date and a rewrite is
initiated (which ultimately results in the metadata and HTTP caches being
updated with the latest contents).
If a different server with different contents for the file then goes through
these steps, the "new" values will be overwritten again by the "old" values,
but this is unavoidable and will stop once the file contents are the same.
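In pseudocode, the check described above looks roughly like this (the cache
APIs and entry layout are illustrative, not the actual implementation):

import hashlib
import os

def is_metadata_entry_current(path, fs_meta_cache, metadata_cache):
    mtime = os.stat(path).st_mtime
    entry = fs_meta_cache.get(path)  # {'mtime': ..., 'hash': ...} or None
    if entry is None or entry['mtime'] != mtime:
        # mtime changed or unknown: re-read the file and refresh the local entry.
        with open(path, 'rb') as f:
            entry = {'mtime': mtime, 'hash': hashlib.md5(f.read()).hexdigest()}
        fs_meta_cache.put(path, entry)
    # Compare our idea of the content hash with the one in the metadata cache.
    if entry['hash'] == metadata_cache.get_content_hash(path):
        return True   # metadata cache entry and rewritten HTTP cache contents can be used
    return False      # out of date: a rewrite is initiated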
Full documentation is coming, but the new directive is:
ModPagespeedFilesystemMetadataCache value
where value is either a memcached server on localhost, or the literal value
'memcached', which will use the first memcached server in
ModPagespeedMemcachedServers that is on localhost.
For example:
ModPagespeedMemcachedServers memcachedserver1:6765,memcachedserver2:6765
ModPagespeedFilesystemMetadataCache localhost:6765
or
ModPagespeedMemcachedServers memcachedserver1:6765,localhost:6765
ModPagespeedFilesystemMetadataCache memcached
This fix does NOT currently handle the case where your metadata cache is on a
shared disk (NFS/NAS), but I believe there's no technical reason it won't work;
it just hasn't been tested yet.
Original comment by matterb...@google.com
on 25 Oct 2012 at 2:37
Correction: the memcached value can only be used if ALL the servers in
ModPagespeedMemcachedServers are on localhost.
HOWEVER, DO NOT USE THIS VALUE (memcached), as it will be removed shortly
because it actually results in the original broken behavior (I think).
Original comment by matterb...@google.com
on 25 Oct 2012 at 5:45
Original comment by j...@google.com
on 26 Oct 2012 at 3:02
Yet another update, hopefully good news.
I recently submitted a follow-up to this change that removes the new directive
and automatically configures the filesystem metadata cache if/as required.
This is possible because we now reuse the metadata cache (MDC) for the
filesystem metadata cache (FSMDC), but we prefix entries in the FSMDC with the
server's hostname, meaning that multiple servers can share the cache without
stomping on each other. Since the FSMDC is only needed when memcached is being
used as the MDC, we simply reuse that memcached instance for the FSMDC.
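As a small illustration of the key scheme described above (the real key format
may differ), filesystem-metadata entries are namespaced by the server's
hostname so that servers sharing one memcached don't overwrite each other's
mtime records:

import socket

def fs_metadata_key(file_path):
    # Prefix the entry with this server's hostname.
    return "%s:%s" % (socket.gethostname(), file_path)

# e.g. on web1: "web1:/var/www/css/site.css"
#      on web2: "web2:/var/www/css/site.css"  (a distinct entry)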
One last note: the original bug report said this:
"(or NFS for cache directory, or anything else that will share the meda-data
cache between servers)"
We do not support (and never have supported) a shared filesystem for the
metadata cache, and this change will NOT handle that use case. We could
re-introduce the directive to enable an FSMDC even when not using memcached,
but currently we have no intention of doing so.
Original comment by matterb...@google.com
on 30 Oct 2012 at 12:03
Original issue reported on code.google.com by
sligocki@google.com
on 29 Aug 2012 at 1:25