Open blogtutor opened 1 year ago
I should also add a little background:
Any time content is published or updated, we trigger a full CF cache clear. We could be more granular, clearing only specific pages where that content might appear (homepage, related category pages, the next/prev page, author archives) -- but that gets a little complicated and may not be worthwhile. (We also use WP Rocket on most sites, which clears in this manner... so other content should still be cached with Rocket, mitigating the impact of the full CF flush.)
Any time a comment is published/unpublished, we clear the CF cache for just that related post.
I should also add a little background:
Any time content is published or updated, we trigger a full CF cache clear. We could be more granular, clearing only specific pages where that content might appear (homepage, related category pages, the next/prev page, author archives) -- but that gets a little complicated and may not be worthwhile. (We also use WP Rocket on most sites, which clears in this manner... so other content should still be cached with Rocket, mitigating the impact of the full CF flush.)
Any time a comment is published/unpublished, we clear the CF cache for just that related post.
Yes this definitely gets tricky the more we think about it e.g. if there is a widget of sorts on N
pages that includes links to other pages (i.e. related posts) we wouldn't be able to know where all of that was happening and would likely end up with a (near?) full cache purge anyway. Full cache purges kinda feels like a sledgehammer solution, but I don't see one any better at the moment.
That said, a debounce as described above by @blogtutor sounds like the best next step:
- Content is updated or published.
- If a full cache clear hasn't been triggered in the last 60 seconds, clear the cache immediately.
- If the full cache has been cleared in the last 60 seconds, schedule a cron to clear the cache 60 seconds from the last clear (or 60 seconds from the update).
- If we do this, we basically guarantee that the cache gets cleared right away if possible, and if not possible, it still gets cleared. And it won't ever clear more frequently than every 60 seconds.
I can work on a branch to explore this if you'd like?
Yes, please. 🙏
Just FYI, right now we're purging the cache by "Prefix"... so it's basically https://domain.com/(.*)
for a full clear, or https://domain.com/post-name/(.*)
for a single post clear (for addressing comments). For the latter, the ensures that any paginated comment pages are also flushed.
https://developers.cloudflare.com/api/operations/zone-purge
As you dig in, please take a look at the logic for comments. Right now it will clear the cache for that individual URL, but there's no rate limiting. We've seen some issues when people do lots of bulk comment changes (like marking 10+ comments as spam at once), so maybe we should address that at the same time?
Also please think about what "overrides" we might want to be able enable quickly on sites with extremely challenging circumstances -- in cases where it's more important to cache aggressively than to refresh content quite as quickly.
This is going to be useful on sites where there's insane traffic and they're rapidly publishing or editing content (and/or lots of comments are coming in).
Idea: Have a default of 60
seconds (or whatever) for the full cache clear rate limit, but then have an option in our plugin's settings to override that.
Idea: Have a toggle so that post/page updates don't triggering a full purge but instead purge just that URL (and maybe the homepage)?
(Sidenote, if we want to clear only the homepage we can't use Prefix
clearing.)
For your review: https://github.com/blogtutor/blog-tutor-support/tree/268-cloudflare-cache-purge-debounce
In an effort to begin with baby steps, this branch implements a simple debouncing pattern to ensure that individual cache clearing calls happen at a minimum of 60s
apart. Here's the heavy lifting for that:
Note that calls to purge_cloudflare_cache
have been updated to call this method instead, routing those calls through this debouncing method.
Also note that the $prefixes
act as identifiers for each call purge_cloudflare_cache
i.e. we are treating calls to purge_cloudflare_cache
that have different $prefixes
as separate calls entirely meaning that we will be implementing n
debouncing routines, each identified by the $prefixes
.
Let's step through how that works:
First we receive the call to clear the cache, accepting $prefixes
just as we do in purge_cloudflare_cache
We then set up our variables:
By serializing and then hashing the $prefixes
we can generate a unique ID for this call, allowing us to have separate debounce handling for each call to clear the cache. This feels like a healthy balance between achieving the goal of throttling how many calls there are to purge the cache, but also allowing a degree of aggressiveness to it.
We check for a transient that has been stored (and only lives for $debounce_threshold
seconds) and if there is no transient it means that we do not need to debounce and we proceed to pass along the $prefixes
to purge_cloudflare_cache
directly.
We then store a transient which contains a timestamp for the next time we want to allow purge_cloudflare_cache
to run with these $prefixes
, along with the $prefixes
themselves:
If, on the other hand, the transient does exist, we need to retrieve our execution timestamp and the $prefixes
needed for the call. We then schedule a WP_Cron
event to execute at our desired timestamp:
Lastly, we implement the hook for that WP_Cron
event when it fires:
Which in turn has its own callback that relays $prefixes
to purge_cloudflare_cache
:
Note: the argument retrieval/passing is built into WP_Cron
and we're taking advantage of it here which is why $prefixes
exists, we passed it in here:
and WP_Cron
considers passed arguments when handling its events. A bit of magic that we're taking full advantage of here. 👍
This is the most simple approach I could think of to begin exploring this idea of debouncing the cache clears. What do you guys think @ecotechie and @blogtutor? I haven't actually tested this code because I don't have the infrastructure set up to do so, but as we wade into these waters I figured that the smallest steps to take at first would be best!
All this logic sounds great. I just tried installing it on dev.blogtutor.com and hit a snag, though. I tried doing a full cache clear and that worked fine. The second one, a few seconds later, got a 500
response from admin-ajax.php
.
Found this in debug.log:
[15-May-2023 22:42:50 UTC] PHP Fatal error: Uncaught Error: Using $this when not in object context in /redacted/public_html/wp-content/plugins/blog-tutor-support-268-cloudflare-cache-purge-debounce/includes/class-support-cloudflare.php:373
Stack trace:
#0 /redacted/public_html/wp-content/plugins/blog-tutor-support-268-cloudflare-cache-purge-debounce/includes/class-support-cloudflare.php(497): NerdPress_Cloudflare_Client::debounce_purge_cloudflare_cache()
#1 /redacted/public_html/wp-includes/class-wp-hook.php(308): NerdPress_Cloudflare_Client::purge_cloudflare_full()
#2 /redacted/public_html/wp-includes/class-wp-hook.php(332): WP_Hook->apply_filters()
#3 /redacted/public_html/wp-includes/plugin.php(517): WP_Hook->do_action()
#4 /redacted/public_html/wp-admin/admin-ajax.php(188): do_action()
#5 {main}
thrown in /redacted/public_html/wp-content/plugins/blog-tutor-support-268-cloudflare-cache-purge-debounce/includes/class-support-cloudflare.php on line 373
I also activated the "Debug Bar" plugin and it showed me this notice:
NOTICE: wp-content/plugins/blog-tutor-support-268-cloudflare-cache-purge-debounce/includes/admin-menu.php:204 - Undefined index: page
require_once('wp-admin/admin-header.php'), do_action('in_admin_header'), WP_Hook->do_action, WP_Hook->apply_filters, wp_admin_bar_render, do_action_ref_array('admin_bar_menu'), WP_Hook->do_action, WP_Hook->apply_filters, bt_custom_toolbar_links
One other thing to consider: We've had instances where there are two cache clear calls back-to-back (effectively at the same time). Most notably it was happening in two situations:
1) When hitting "Update" in the block editor; sometimes it would just trigger multiple cache clears, still don't know why (never happened in Classic)
2) When someone is approving (or unapproving) multiple comments on the same post using the bulk edit tool.
In those cases, it actually has been better to just hit the API with the request several times in rapid succession; it would be less performant (and unnecessary) to hit it once, and then hit it again 60 seconds later... So we should see if we can account for that somehow?
Whoops, sorry about that, fixed the context usage in https://github.com/blogtutor/blog-tutor-support/commit/79acd346ec08e3d0e11833c2467b7a593d08596a
I'll noodle on https://github.com/blogtutor/blog-tutor-support/issues/268#issuecomment-1548716968 a bit to see how we can elaborate on the conditions for the debouncing.
On extremely high traffic sites, particularly ones with multiple authors, rapid Cloudflare cache clears can cause issues. I'd like to find a graceful way to limit cache clearing so that servers don't get overwhelmed, but content still gets refreshed on the front end in a timely manner.
Right now we have a rudimentary procedure that will prevent a "Full" cache clear from happening more often than every 10 seconds.
https://github.com/blogtutor/blog-tutor-support/blob/755a3067f2b16f2d3dd94bde128da5905a5341ff/includes/class-support-cloudflare.php#L356-L365
Over the weekend, I had to increase that to 60 seconds on one site (during the NFL draft). I did this by editing the file directly on the server (slack thread here).
Here's what I'm thinking we could do instead, though I'm open to any smarter ideas!:
We may also want to add a setting in the plugin to be able to override the default. (Also open to a different default than 60 seconds!)
Thoughts?