jmdobry / angular-cache

angular-cache is a very useful replacement for the Angular 1 $cacheFactory.
http://jmdobry.github.io/angular-cache
MIT License
1.39k stars 156 forks source link

Add option specifying how frequently to check if items have expired (performance) #28

Closed jmdobry closed 11 years ago

jmdobry commented 11 years ago

In aggressive delete mode, this method eliminates a $timeout for every item in the cache and instead uses setInterval to sweep the cache for expired items. Setting this to 60000 for example, has the cache only checking for expired items once a minute, instead of the browser watching a timeout for every item in the cache.

Is it worth it to look into this?

gosusnp commented 11 years ago

In that case, what do you think of keeping track of the expiration dates in a heap?

jmdobry commented 11 years ago

Can you elaborate?

gosusnp commented 11 years ago

Using a heap, you can keep track of the next expiring item. It is then possible to avoid setting a timeout for every item and use a shorter interval to only check the next item that would expire.

jmdobry commented 11 years ago

My initial thought was to simply iterate over the items in the cache and remove the expired items, but this could potentially result in a lot of wasted computing time. Using a heap would solve the wasted computing time problem, but introduces the overhead of maintaining the heap in sorted order based on expiration date. Like you said, each interval would do something like this:

var i = 0;
while (expirationHeap[0].isExpired) {
  this.remove(expirationHeap[0].key);
  i++;
}

if (i) {
  expirationHeap.splice(0, i);
}

Option 1 is the least completing. It does not require any extra data structures, but may waste CPU time because it's a dumb solution.

Option 2 introduces complexity, but delivers better performance.

Thoughts?

gosusnp commented 11 years ago

I find option 2 to be more elegant.

If the number of the items stored grows, then with Option 1 each check would be more complex. I would say O(n) where n is the number of items. Using Option 2, each check should more efficient since we check at most 1 item that would not be removed. The cost of maintaining the heap would be O(log (n)) on insertion and O(log (n)) on suppression. However, if we want to use aggressive delete mode, we would run O(n) every seconds in Option 1 whereas with Option 2, we would have some O(log (n)) operation when and only when it is needed. Unless the expiration time is very short, Option 2 looks better.

Another option that might be worth considering is what is done in Redis. In general, it is a passive deletion and from time to time (10 times per seconds for Redis), a few keys are randomly checked for deletion. It is easier to implement and it avoids leaking expired cached items for too long.

jmdobry commented 11 years ago

Slating this for the 2.0.0 release. #51