Closed javisantana closed 9 years ago
I don't think it should be removed. Should not be relyed upon, maybe, but why remove it ? At least in emergency cases you have a way to control the renderer cache. There could be legit cases for that. Even changes in the renderer which are not under control of the tiler, like a Mapnik version upgrade, or a "manual" modification of the XML styles stored in redis (think batch-updates).
I'd tag as "invalid" and close. Of course we still want to fix the partial cache cleanup on style change...
ok, if you want to provide a way to invalidate caches, do it with an endpoint and document it but it should never be a user task to remove the render cache. The user should never know about the cache internals
True, the user should not need to know. Actually, I guess a cache flusher should also be protected by authentication...
So back to this issue. The problem with cache-buster just showed up recently when an high load application suffered by the misues of cache_buster, resulting in caches never being in effect.
Dropping the cache_buster paremeter would surely help this case, BUT...
Do we still have a need for invalidating a cache OR requesting a different cache ? The difference between "invalidating" and "requesting new" is that the former also affects other users while the latter only affects YOU (and the server that needs to hold all those caches in memory)
Remember: the former case for use of cache_buster was that the renderer kept returning tiles from before the data changed. We do have a database trigger that currently invalidates Varnish so we could as well have one that invalidates the Windshaft cache... Still, varnish is smarter with the concept of "cache channel" in that (I think) it lists the set of tables from which data arrives. This is not encoded in the tiler-internal caches instead.
Since Wdinsfhat-cartodb ist he one that determines the cache channel (the set of tables) I guess we could as well encode it into the tiler-internal cache key and expose a function to clear caches by channel (ie: these tables changed...)
Still, let's remember that it's not always possible to determine which tables are affected by a query until the query is run (and the same query may or may not access data from a table based on some condition within some function being invoked). This is to say that sometimes you want to let users provide their own set of tables which should invalidate a cache when changed (as we specified for the new mutilayer API).
dont mix things here, cache_buster is being used to invalidate the CDN and it is used to invalidate the internal cache for the tiler too: I think we should change the cache_buster param name to invalidate the CDN. This is a different issue and it is a cartodb.js problem (i will report later). Luckily this is pretty easy to fix.
so lets talk about the cache_buster related to windshaft itself, i have a question: Is the rendering cache affected by the data itself ? I though it was only affected by cartocss changes
On Wed, Jan 23, 2013 at 02:59:40AM -0800, javi santana wrote:
dont mix things here, cache_buster is being used to invalidate the CDN and it is used to invalidate the internal cache for the tiler too: I think we should change the cache_buster param name to invalidate the CDN. This is a different issue and it is a cartodb.js problem (i will report later). Luckily this is pretty easy to fix.
so lets talk about the cache_buster related to windshaft itself, i have a question: Is the rendering cache affected by the data itself ? I though it was only affected by cartocss changes
The rendering cache key contains SQL and CartoCSS and is additionally associated with a single table.
If CartoCSS or SQL changes, you'd be accessing a different cache item.
Each cache item has a time to live of 1 minute since last access. The tiler checks every minute for expired items.
The event that DROP cache items, other than expiration, are:
Note that the cache_buster value is not associated with a table but with the whole cache manager.
There is no event triggered by data change that affects tiler cache.
my concrete question is: if the table data changes (sql and cartocss are the same), that cache should be cleared?
On Wed, Jan 23, 2013 at 03:14:58AM -0800, javi santana wrote:
my concrete question is: if the table data changes (sql and cartocss are the same), that cache should be cleared?
No, it will NOT be cleared.
but should be?
well, given the current effort to clean the varnish one I think it could be, as it has the same scope. Unless we also drop the attempt to clean the varnish cache.
forget varnish or other system parts.
that cache needs to be cleared when the data is changed? (yes/no)
I don't think Windshaft should drive cache invalidation autonomously. It's still up to some external part of the architecture to decide when to clean what.
Delegating cache invalidation to an external agent allows to centralize "data change" event detection
An extreme example: data doesn't change, cartocss doesn't change, but CartoCSS references an external resource that did change. The client knows about that and wants to clear the relevant cache.
"that cache needs to be cleared when the data is changed?" YES
"that cache needs to be cleared when the data is changed?" YES
ok, so we have a bug here, right?
- Changing the style of a table drop all caches related to that table (POST STYLE / DELETE STYLE)
this has been added recently right? this is why cache_buster exists to remove these caches when style changes
For me cache_buster should be removed and the cache clear should be managed internally using postgres triggers or other technique. We could additionally add an endpoint to clear cache to invalidate external resources as you say
On Wed, Jan 23, 2013 at 03:50:26AM -0800, javi santana wrote:
"that cache needs to be cleared when the data is changed?" YES
ok, so we have a bug here, right?
We do have a bug somewhere. Not necessarely "here" because it is not clear by whom should the cache be managed.
As you don't have a bug in varnish if varnish doesn't clear it caches (or cloudfront..)
- Changing the style of a table drop all caches related to that table (POST STYLE / DELETE STYLE)
this has been added recently right? this is why cache_buster exists to remove these caches when style changes
I think that's been there since I started looking at the code. Actually before I changed it to store the last cache_buster value, the sole presence of that parameter was enough to trigger clearing all table caches.
The clearance isn't performed through cache_buster but internally (whenever the style of a table is successfully changed/dropped).
For me cache_buster should be removed and the cache clear should be managed internally using postgres triggers or other technique. We could additionally add an endpoint to clear cache to invalidate external resources as you say
For sure a good first step could be to restrict the effects of a cache_buster because the current model also shares the value across multiple databases on the same machine. The "cache_buster" value could be stored within the key for the cache item, which could be made invalid (dropped) by requesting a different value.
Handling data invalidation "internally" (within Windshaft) would introduce a change in architecture as currently theres NO direct interaction between the database and Windshaft (only intermediated by Mapnik), and could be a significant performance hit (the tiler should ensure existance of triggers placed on all tables affected by a query).
Also, if we agree on reducing cache lifetime to few seconds, is big change for handling invalidation worth the trouble ? We could have a few seconds cache which is never invalidated...
Ref: #35
--strk;
We need to implement cache clearing on /flush_cache before we can drop cache_buster here. See https://github.com/Vizzuality/Windshaft-cartodb/issues/73
The idea would then be that caches are cleared by the server on "data change" event, rather than by the client.
How to associate a cache to a list of tables is yet to be defined. We know how to do that from Windshaft-cartodb but not from Windshaft itself. Accepting an explicit list of "affected tables" could be an idea. Then Windshat-CartoDB would provide that list by querying the database, overriding any user-provided value.
As per version 1.0.0 cache buster management will be a MapConfigProvider responsibility, so this is no longer valid.
this is a hack and it should be removed