keycdn / cache-enabler

A lightweight caching plugin for WordPress that makes your website faster by generating static HTML files.
https://wordpress.org/plugins/cache-enabler/

Cache Clearing Hooks #290

Open ouun opened 3 years ago

ouun commented 3 years ago

@coreykn I took a deep dive into the hooks Cache Enabler provides and I am missing an important part: the "system cache cleared hooks" are executed after the respective action has been done. While it makes total sense that this only happens when cached files exist, there is no action executed if there is no cache available.

I have a use case where we use edge caching, which I purge via API. In some cases the edge cache is present while the Cache Enabler cache is not. Currently there is no way to send a remote purge request via Cache Enabler as long as no cache file exists, and the cache file won't get generated because the edge cache is already present.

To solve that, we need actions that always run whenever cache clearing is requested. So I would like to ask for your opinion on adding, e.g., the following actions, which are executed before Cache Enabler checks for existing files:

That would be fantastic. I am happy to open a PR if this is something you agree with.

ouun commented 3 years ago

Another approach is to always execute the already available "system cache cleared hooks" but to add another parameter that passes whether cache files were purged or skipped as none were found.

coreykn commented 3 years ago

I agree. When creating those hooks I had thought of the same thing but I found it to be more complicated than I wanted for pages due to how the cache clearing system works. By that I mean there is not always an initial list of pages to clear. For example, it can be a single URL that may also be directed to include the pages beneath it (e.g. pagination). That means without using too many resources I'm unsure how the before page cleared hook would be fired accurately. I'd love to see or hear more of what you have in mind though.

Just so you know, I do intend on adding more hooks to some of the methods (primarily cache clearing) to allow them to be extended. This is to make it easier to build extensions/add-ons for Cache Enabler in the future. I need to improve the return handling across many methods before this would be reliable, which is now possible with the new cache iterator. That was on the list for one of the next two branches and would allow certain functions to be hooked into the cache clearing methods. It could also be used to do something with what Cache Enabler is going to try and clear. If that doesn't cover what you're thinking and all you want is the URL that is maybe going to be cleared, then maybe firing something in the cache iterator itself would be best.

As for your other page cache layer above Cache Enabler, maybe it's possible to configure it to not cache a response if a certain response header does not exist, such as the X-Cache-Handler.

ouun commented 3 years ago

@coreykn thanks again for your comprehensive answer.

> I'd love to see or hear more of what you have in mind though.

The minimum requirement is simple: an action hook that is executed each time cache clearing runs. It must not depend on the local cache, so it needs to be executed each time cache_iterator() runs even though no local cache file exists.

> [...] it can be a single URL that may also be directed to include the pages beneath it (e.g. pagination)

That makes total sense and yes, running "after" makes much more sense. I just need to be sure that if cache clearing is requested for a specific URL, the action is executed. Running after also makes it possible to collect all URLs that should get deleted in an array passed to the action. An example:

add_action( 'cache_enabler_after_page_cache_cleared', function( $cleared_urls ) {

    // $cleared_urls contains all URLs that should be deleted, each mapped to a bool indicating whether a local cache file was deleted
    $cleared_urls = [
        'example.com/page/'       => true,
        'example.com/page/2'      => true,
        'example.com/page/3'      => false,
        'example.com/blog/a-post' => true,
        'example.com/blog'        => false,
    ];

    // Do whatever with the array example above

});

In my use case I would use that hook to collect all URLs over a specific period (e.g. 5 minutes) and then send an API request to the edge cache provider to purge the cache for all these URLs.
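The batching idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name EdgePurgeQueue, its methods, and the sender callback are hypothetical and not part of Cache Enabler or of any edge provider's API. Timestamps are passed in explicitly so the logic can be tried without WordPress.

```php
<?php
// Hypothetical helper, NOT part of Cache Enabler: collects URLs that were
// requested to be cleared and hands them to a sender callback in one batch
// once the collection window (default 300 seconds) has elapsed.
final class EdgePurgeQueue
{
    /** @var array<string, bool> pending URLs, stored as keys to de-duplicate */
    private array $pending = [];

    /** Timestamp of the first URL in the current window, or null if empty. */
    private ?int $windowStart = null;

    /** @var callable receives a list of URLs to purge (assumed API wrapper) */
    private $sender;

    private int $windowSeconds;

    public function __construct(callable $sender, int $windowSeconds = 300)
    {
        $this->sender        = $sender;
        $this->windowSeconds = $windowSeconds;
    }

    /** Remember a URL (or wildcard) that was requested to be cleared. */
    public function add(string $url, int $now): void
    {
        if ($this->windowStart === null) {
            $this->windowStart = $now;
        }
        $this->pending[$url] = true;
    }

    /** Send the batch if the window has elapsed; returns true if it was sent. */
    public function flushIfDue(int $now): bool
    {
        if ($this->windowStart === null
            || $now - $this->windowStart < $this->windowSeconds) {
            return false;
        }
        ($this->sender)(array_keys($this->pending));
        $this->pending     = [];
        $this->windowStart = null;
        return true;
    }
}
```

A callback registered on a cache-cleared hook would call add() with the current time, and a scheduled task would call flushIfDue(). In a real plugin the pending list would also need to be stored persistently, since each PHP request starts with fresh memory.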

The other two hooks, cache_enabler_after_site_cache_cleared and cache_enabler_after_complete_cache_cleared, also need to run regardless of whether a local cache existed, but could pass that as a parameter for flexibility.

> As for your other page cache layer above Cache Enabler, maybe it's possible to configure it to not cache a response if a certain response header does not exist, such as the X-Cache-Handler.

Thank you for the hint. Yes, that is a great idea, but it does not help in my case. I want the edge cache to hold a page cache as long as it is not purged via one of the hooks above. I hope that all makes sense to you.

coreykn commented 3 years ago

You're most welcome. The cache_enabler_after_page_cache_cleared hook shown in your example won't work because Cache Enabler only knows it is trying to clear example.com/page/*. It doesn't know if example.com/page/3 exists if it is not cached. That means the $cleared_urls variable can't be created as demonstrated. Unless new behavior is added to collect that information about what pages exist beneath another page, but I do not think it is worth adding that to the core plugin as it'd fit better as an extension.

What I'll be adding here shortly will at least allow all the URLs to be collected that are sent to be cleared, regardless of whether or not they actually get cleared. (That could include wildcard URLs or an argument to clear the subpages.) The new hooks could also allow you to create that custom behavior mentioned above, which could be set up to gather all URLs that may be cleared beneath a subpage (like by polling the sitemap or a custom index file). When combined with what has actually been cleared you will have that $cleared_urls list referenced above.
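As a rough sketch of that combination step: given the URL or wildcard that was sent to be cleared, a list of known URLs gathered elsewhere (for instance from a sitemap), and the URLs that were actually cleared, the $cleared_urls map can be assembled as below. The function name build_cleared_urls and its parameters are hypothetical, and it assumes a wildcard is a plain prefix ending in `*`; none of this is existing Cache Enabler behavior.

```php
<?php
/**
 * Hypothetical helper, NOT part of Cache Enabler.
 *
 * @param string   $requested   URL sent to be cleared; a trailing '*' is
 *                              treated as "this prefix and everything below".
 * @param string[] $knownUrls   URLs known to exist (e.g. read from a sitemap).
 * @param string[] $clearedUrls URLs whose local cache files were removed.
 * @return array<string, bool>  Matching URL => whether it was actually cleared.
 */
function build_cleared_urls(string $requested, array $knownUrls, array $clearedUrls): array
{
    $isWildcard = substr($requested, -1) === '*';
    $prefix     = $isWildcard ? substr($requested, 0, -1) : $requested;
    $cleared    = array_flip($clearedUrls); // URL => index, for fast lookups

    $result = [];
    foreach ($knownUrls as $url) {
        $matches = $isWildcard
            ? strncmp($url, $prefix, strlen($prefix)) === 0 // prefix match
            : $url === $requested;                          // exact match
        if ($matches) {
            $result[$url] = isset($cleared[$url]);
        }
    }
    return $result;
}

// Example with made-up data mirroring the URLs used earlier in this thread.
$known   = ['example.com/page/', 'example.com/page/2', 'example.com/page/3', 'example.com/blog'];
$cleared = ['example.com/page/', 'example.com/page/2'];
$map     = build_cleared_urls('example.com/page/*', $known, $cleared);
```

Here $map holds the three URLs under example.com/page/, with example.com/page/3 marked false because no local cache file was removed for it.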

ouun commented 3 years ago

> The cache_enabler_after_page_cache_cleared hook shown in your example won't work because Cache Enabler only knows it is trying to clear example.com/page/*. It doesn't know if example.com/page/3 exists if it is not cached. That means the $cleared_urls variable can't be created as demonstrated. Unless new behavior is added to collect that information about what pages exist beneath another page, but I do not think it is worth adding that to the core plugin as it'd fit better as an extension.

I see, thanks for the clarification. I agree that this is better as an extension, and even the wildcard example.com/page/* would be enough to work with. So simply:

add_action( 'cache_enabler_after_page_cache_cleared', function( $cleared_url, $cleared ) {

    // $cleared_url contains the (parent) URL that should be deleted
    $cleared_url = 'example.com/page/*';

    // $cleared as second parameter indicates if a cached file got deleted
    if ( $cleared ) {
        // Do whatever
    }
}, 10, 2 ); // priority 10; accept 2 arguments so that $cleared is passed

> What I'll be adding here shortly will at least allow all the URLs to be collected that are sent to be cleared, regardless of whether or not they actually get cleared. (That could include wildcard URLs or an argument to clear the subpages.) The new hooks could also allow you to create that custom behavior mentioned above, which could be set up to gather all URLs that may be cleared beneath a subpage (like by polling the sitemap or a custom index file). When combined with what has actually been cleared you will have that $cleared_urls list referenced above.

Perfect! That is even more powerful than the solution above. Looking forward to testing that soon. :)

ouun commented 2 years ago

@coreykn I hope you are doing well. Just wondering if you could provide an approximate ETA for the new hook? I know this is a specific use case, but currently I am stuck on a project as I need that kind of hook for my Edge-Cache-Api integration.

If you give me a hint where you planned to add this, I am also happy to provide a first draft via a PR.

Thanks a lot and kind regards,

Philipp

ouun commented 1 year ago

Hi @svenba, as a year has passed now, I would like to ask if you are open to a PR. I would follow the suggestion of @coreykn:

> What I'll be adding here shortly will at least allow all the URLs to be collected that are sent to be cleared, regardless of whether or not they actually get cleared. (That could include wildcard URLs or an argument to clear the subpages.) The new hooks could also allow you to create that custom behavior mentioned above, which could be set up to gather all URLs that may be cleared beneath a subpage (like by polling the sitemap or a custom index file). When combined with what has actually been cleared you will have that $cleared_urls list referenced above.

I will be happy to have another look at this as long as you are open to that improvement.

Thanks and kind regards,

Philipp

svenba commented 1 year ago

Hi @ouun, a PR sounds good. Please make sure it has no potential negative impact in any way.