bnomei / kirby3-lapse

Cache any data until set expiration time
https://forum.getkirby.com/t/kirby3-lapse-cache-any-data-until-set-expiration-time-with-automatic-keys/23586
MIT License

Flood Protection #15

Closed · marcus-at-localhost closed 2 years ago

marcus-at-localhost commented 2 years ago

There is no way of preventing flooding the cache if it is set up wrong, right? This is a dumb example, but it should get the point across:

// uniqid() changes on every call, so this key is different on every request
$key = crc32($feedurl . uniqid());

This would fill the cache on each request. And there is not really a way to flush the cache for that specific "domain" once the $key is no longer known.
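For contrast, dropping the uniqid() part yields a stable key, so every request reuses the same entry instead of writing a new one:

// Stable key: depends only on the URL, so repeated requests hit a single cache entry.
$key = crc32($feedurl);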

Here is another example that illustrates it better:

// There are better ways to check whether remote content has changed, and it is
// questionable whether HEAD is really faster than a single full GET.
$request = Remote::request($feedurl, ['method' => 'HEAD']);
// Use the content length to determine whether the content has changed.
$content_length = A::get($request->headers(), 'content-length');

$pages = lapse(crc32($feedurl . $content_length), function () use ($feedurl) {
    $pages = [];
    $request = Remote::get($feedurl);
    $results = Xml::parse($request->content());
    if (is_array($results) && count($results) > 0) {
        foreach ($results as $item) {
            // map each feed item to a page props array here
            $pages[] = [];
        }
    }
    return $pages;
});
Pages::factory($pages, $this);

Now if I want to clear the cache, I have to delete and rebuild the complete cache.

I've seen caching mechanisms that implement cache groups in order to have granular control over what to flush.

In my example, I could have a cache group keyed by $feedurl with single entries per content-length (or modified date, whatever), and then be able to say: flush all cache entries of group $feedurl.

Here is some pseudocode:

lapse([crc32($feedurl), $content_length], $data);

// flush all entries with the group key crc32($feedurl)
$wasRemoved = \Bnomei\Lapse::rm([crc32($feedurl)]);

Would it make sense to implement it that way?
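For illustration, here is a rough userland sketch of that idea. The group index bookkeeping and the use of Kirby's 'pages' cache are just assumptions to sketch it out; \Bnomei\Lapse::rm() is used as in the pseudocode above, so whether it accepts a raw crc32 key exactly like lapse() does is assumed, not confirmed.

// Track every sub-key that belongs to a group in a separate index entry,
// then flush the whole group by removing each tracked lapse key.
function lapseInGroup(string $group, string $subKey, Closure $value)
{
    $indexKey = 'lapse-group-' . $group;
    $index = kirby()->cache('pages')->get($indexKey) ?? [];
    $index[$subKey] = true;
    kirby()->cache('pages')->set($indexKey, $index);

    return lapse(crc32($group . $subKey), $value);
}

function lapseFlushGroup(string $group): void
{
    $indexKey = 'lapse-group-' . $group;
    foreach (array_keys(kirby()->cache('pages')->get($indexKey) ?? []) as $subKey) {
        \Bnomei\Lapse::rm(crc32($group . $subKey));
    }
    kirby()->cache('pages')->remove($indexKey);
}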

S1SYPHOS commented 2 years ago

Would it be sufficient to list / filter cached entries, select the ones matching some URL / string and flush those?

bnomei commented 2 years ago

There is a (yet) undocumented bnomei.lapse.indexLimit option. It defaults to null, but you can set an integer; the cache will then not store more entries than that number (FIFO). Also, using Lapse with caches like APCu enforces a total entry count anyway, by imposing a total memory limit.
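For example (a sketch; the value is arbitrary):

// site/config/config.php
return [
    'bnomei.lapse.indexLimit' => 1000, // store at most 1000 entries (FIFO eviction)
];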

Deleting via groups would mean storing that information somewhere, and I would like to avoid that overhead and keep it as simple and fast as possible. I do like the idea though. Thanks for your feedback.