vivekn / redis-simple-cache

Simple redis based cache for storing results of python function calls, json encoded strings or html.
Other
121 stars 69 forks source link

Ranges of keys #9

Open szaydel opened 11 years ago

szaydel commented 11 years ago

I think it would be nice to have ranges of keys, which are stored as members of a set. One should be able to define a range of any given name, and optionally add keys for any wrapped function to this set. Any number of sets should be possible. This would make it easier to selectively expire groups of keys prematurely, if necessary. I can see a use case for this being where a change made to some data elements will result in no-longer valid cached entries, but their expiration originally set on the wrapped function or functions has not yet been reached.

fjsj commented 11 years ago

Seems like a nice functionality for selective cache invalidation. But instead of the name "range of keys", what do you think about "key namespace"?

Also, how would the developer call the selective expiration? Perhaps a method attached to the decorator? Would require to convert the decorator to a class, I think.

szaydel commented 11 years ago

Key namespace sounds better than range, for sure. I honestly did not think through this, but perhaps adding a class method to the SimpleCache class is a good enough answer.

fjsj commented 11 years ago

The class method would be like this SimpleCache.flush_namespace(namespace)?

szaydel commented 11 years ago

I'd be lying if I said that I am very familiar with decorator methods, but your idea does sound good. And, as for class method I think a descriptive and simple method is great. It should not require any arguments other than the namespace name.

fjsj commented 11 years ago

Sorry, I've edited my last comment because I misread your suggestion. Perhaps a class method is really the best approach. The advantage of a decorator method is avoiding a new import for those using only the decorators. But semantically the class method makes more sense.

Now, how to implement it? Maybe adding a namespace prefix and using KEYS command to get the keys with that prefix for deletion?

szaydel commented 11 years ago

Namespace prefix is one way. What I was thinking is having keys grouped by set, with each set effectively being the namespace name.

Example:

set_super_slow_funcs [keyblah, keyblah, etc.] set_not_super_slow_funcs [keyblah, keyblah, etc.]

Am I making sense?

fjsj commented 11 years ago

Makes sense. But what are the advantages of using a set of keys in this case? I know the SimpleCache class uses a set, but it is for enforcing the key limit (the SCARD operation is O(1), much faster than the return of a KEYS operation).

szaydel commented 11 years ago

I think, to me a set is a more natural way of doing this, after all that is the point of a set, grouping of unique entries, no? In reality I am sure that it is not necessarily the most efficient way of doing things. Though, I think I would choose elegance over efficiency here.

fjsj commented 11 years ago

While I agree on theory that sets are a more elegant approach than a prefix, they would require a SADD and a SET operation at cache insert. On cache invalidation, a SMEMBERS, many DEL (for each member) and another DEL (for the set) would be required. A prefix-only implementation would require only a SET operation on insert. On invalidation, it would require a KEYS and many DEL (for each member). So in practice (implementation) a prefix-based approach requires less code and is also more efficient (in running time and memory). My preference, for now, is for the prefix way.

I would like to hear @vivekn thoughts on this.

szaydel commented 11 years ago

I think your approach is more efficient, and mine is more elegant. Efficiency tends to win out. Your approach is I think elegant enough to be the best answer here.

fjsj commented 11 years ago

I was trying to implement the flush_namespace(namespace) classmethod, but found a problem. The connection info (host, port and db attributes) are specific of instances, and we need to create a connection to execute deletion commands. Any suggestions?

szaydel commented 11 years ago

The least elegant answer is to pass all of this information to the classmethod. The better answer maybe to track all connections in some index and pass a connection to the classmethod?

vivekn commented 11 years ago

While this is a feature that would be great to have, the implementation might be tricky and I'm not sure if I follow the discussion completely. I have a few questions regarding this:

  1. How would the namespace be attached? Would it be a parameter to the decorator? How can we attach the namespace selectively?
  2. Does the user manually flush the namespace? Or is a dirty flag set on the keys that need to be updated?

These are some issues that need to be worked out.

As for the connection problem, implementing it as a normal method instead of classmethod might solve the problem for time being. The way things are now, the user would have to create a SimpleCache instance explicitly anyway, if he/she wants to change the connection settings.

fjsj commented 11 years ago

It is confusing, because we need a connection to flush, but it makes more sense to put the flush_namespace method at class level, since it isn't specific of a single instance. It will flush data from many instances.

If we store the created connections in a "static" (class level) variable, we need to guarantee that the flush_namespace method is called after at least one SimpleCache instantiation. And we would need to map namespaces to connections (with a dict, probably). So we would need to raise an exception or log and error if flush_namespace is called before a SimpleCache instantiation.

For the sake of simplicity and clean code, I think it is better to discard the idea of a classmethod. I can't figure out how to implement it easily.

Also, we need to think about possible race conditions. Perhaps the use of pipes already addresses that, I'm not sure.

Answering your questions: 1 - As a parameter to the decorator or SimpleCache constructor. I prefer as a constructor param, and since we can pass a SimpleCache instance to the decorators, it will work for both the OO and functional approach. 2 - Manual flush by calling a method flush_namespace. The method should delete all keys of the namespace.

2013/2/22 Vivek Narayanan notifications@github.com

While this is a feature that would be great to have, the implementation might be tricky and I'm not sure if I follow the discussion completely. I have a few questions regarding this:

  1. How would the namespace be attached? Would it be a parameter to the decorator? How can we attach the namespace selectively?
  2. Does the user manually flush the namespace? Or is a dirty flag set on the keys that need to be updated?

These are some issues that need to be worked out.

As for the connection problem, implementing it as a normal method instead of classmethod might solve the problem for time being. The way things are now, the user would have to create a SimpleCache instance explicitly anyway, if he/she wants to change the connection settings.

— Reply to this email directly or view it on GitHubhttps://github.com/vivekn/redis-simple-cache/issues/9#issuecomment-13938609.

djgilcrease commented 10 years ago

Added a possible solution that worked for my needs in pull request #23

I had several groups of related cache items cached via the cache_it decorator, that would all need to be expired on certain conditions and I did not want to expire the whole set so I added a namespace argument to cache_it, and added a flush_namespace & expire_namespace