putyourlightson / craft-blitz

Intelligent static page caching for creating lightning-fast sites with Craft CMS.
https://putyourlightson.com/plugins/blitz

Refreshing cache on large and complex site #689

Closed deeekay closed 1 month ago

deeekay commented 1 month ago

Support Request

Dear Ben

We have a very extensive Craft project that includes 50+ sites with many pages and entries. Many entries are linked to each other, and some entries are read on multiple pages (and for each site). There are many authors in the project and new changes come in on a daily basis.

To increase the overall responsiveness of the website, we decided to introduce Blitz caching (as we did on a number of smaller and less complex projects with great success).

For this, we have created a rough caching concept that provides a TTL of 24 hours for most pages. A few entries have a TTL of 60 minutes. Our idea is to initially generate the cache and then update the caches whose TTL has expired every 1.5 hours using the command "blitz/cache/refresh-expired" via cronjob on the server. We have implemented this concept and tested it in the production environment.
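As a rough illustration of the timing in this concept, the sketch below adds the TTL to the cron interval. It assumes the scheduled command is the only thing that refreshes a page (no organic regeneration) and ignores how long the refresh itself takes. The function name is hypothetical.

```shell
# Worst-case age (in minutes) of a cached page at the moment a scheduled
# refresh picks it up: the page expires just after one cron run and has to
# wait a full interval for the next one.
# Assumption: the cron run is the only refresh trigger.
worst_case_age_minutes() {
    ttl_minutes=$1
    cron_interval_minutes=$2
    echo $(( ttl_minutes + cron_interval_minutes ))
}

worst_case_age_minutes 60 90     # 60-minute TTL, 90-minute cron interval
worst_case_age_minutes 1440 90   # 24-hour TTL, 90-minute cron interval
```

Under these assumptions, a page with a 60-minute TTL can be up to 150 minutes old before it is refreshed, so a shorter cron interval mainly benefits the short-TTL entries.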

We first stuck with "Clear the cache and regenerate in a queue job" to regenerate the cache immediately, which quickly resulted in 100'000+ jobs in the queue manager. We then switched to "Expire the cache, regenerate manually or organically" (https://putyourlightson.com/plugins/blitz#expire-the-cache-regenerate-manually-or-organically). That alleviated some of the issues. However, we quickly realized that the whole site (all pages of all websites) was subject to being refreshed, which seemed too eager. During development, we experimented with "trackElements" and "trackElementQueries" and deactivated them to try to alleviate some of the strain.

We have considered the Common Issues page (https://putyourlightson.com/plugins/blitz#common-issues) and also read about queue runners in another article of yours: https://putyourlightson.com/articles/queue-runners-and-custom-queues-in-craft-cms

However, we still don't seem to find a satisfactory solution to our challenge. Given these issues, we are seeking your advice on how to optimize our caching strategy. Specifically, we are looking for ways to reduce the number of jobs in the queue manager and improve overall performance without compromising (too much) on the freshness of our content.

Do you have any suggestions or ideas on how we could implement a suitable caching concept for such a large and dynamic site? Any insights or concrete ideas on our setup would be greatly appreciated.

Best regards, Daniel

Plugin Version

4.21.0

bencroker commented 1 month ago

If your site is very large then you may want to avoid regenerating the cache automatically, as that can result in long-running queue jobs. The simpler approach will be to use the “Clear the cache and regenerate manually or organically” refresh mode, but it depends on various things, many of which I discussed in this livestream: https://craftquest.io/livestreams/caching-strategies-with-blitz

Hopefully that helps, and feel free to reach out via email if you’d like a consultation to come up with the best strategy for your specific site together.

deeekay commented 1 month ago

Hi Ben and thanks for the prompt reply.

We are currently thinking about a "stale-while-revalidate" approach where we don't use server rewrites and settle for delivering stale pages for a short amount of time. In addition, we plan to run a regular job to refresh expired content in the background, to minimize the number of times stale content is actually delivered. Does that approach sound familiar and reasonable to you?

Best regards, Daniel

bencroker commented 1 month ago

Yes that sounds like a good approach and is something I discuss in the livestream. Setting the refresh mode to “Expire the cache and regenerate manually” and running the blitz/cache/refresh-expired command hourly via a cron job should work well. The other important piece of making this work smoothly is optimising the templates using eager-loading in element queries, avoiding the use of globals and, if possible, using server-side includes. Let me know how it goes.
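A minimal sketch of how the hourly cron job could be wrapped so that two refresh runs never overlap. Only the `blitz/cache/refresh-expired` command comes from the thread. The `php craft` invocation, the lock path, the variable names and the function name are assumptions to adapt to your environment.

```shell
#!/bin/sh
# Hypothetical wrapper around the Blitz console command named in the thread.

# Command used to invoke the Craft console (assumption; adjust the path).
CRAFT_CMD="${CRAFT_CMD:-php craft}"
# Lock directory that prevents two refresh runs from overlapping (assumption).
LOCK_DIR="${LOCK_DIR:-/tmp/blitz-refresh-expired.lock}"

refresh_expired() {
    # mkdir is atomic, so only one process can create the lock directory.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "refresh already running, skipping"
        return 0
    fi
    # Refresh only the cached pages that have expired.
    $CRAFT_CMD blitz/cache/refresh-expired
    status=$?
    # Release the lock so the next scheduled run can proceed.
    rmdir "$LOCK_DIR"
    return $status
}
```

Saved as a script that ends with a call to `refresh_expired`, this could be scheduled hourly with a crontab entry such as `0 * * * * /path/to/refresh-expired.sh` (hypothetical path). The lock is a simple guard for long refresh runs on a large site. A run that is killed midway would leave the lock directory behind, so it may need manual cleanup.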

deeekay commented 1 month ago

It works in general (eager loading also resulted in some minor performance improvements when generating/refreshing the cache). There is, however, still a lot of refreshing going on. I have the feeling that when a complex page type is reused across many sites and updates are made to a page on one site, the page gets refreshed on all sites even if the actual change only has an impact on some of the sites. Is this assumption correct?

bencroker commented 1 month ago

Yes, if an entry is used by many sites then Blitz will need to refresh each of the pages on each of the sites that output that entry. Blitz does its best to be smart about this by tracking which custom fields are output on each page, but depending on how your templates are set up, it may be less than optimal. The Blitz Diagnostics utility can help you better understand how your site’s cached content is structured, allowing you to optimise the caching strategy and overall performance.

deeekay commented 1 month ago

Thanks for your help. In this context, something I never completely understood: could we have a positive impact here by tracking only elements but not element queries, and therefore refreshing fewer pages? Or is that assumption false, and tracking elements without tracking queries (or vice versa) does not really make sense at all?

bencroker commented 1 month ago

Tracking element queries is necessary if you want Blitz to refresh pages that list elements fetched via an element query when, for example, a new element is added (and the cached page should therefore be refreshed). My general advice on tracked element queries is to reduce them as much as possible. It depends on the site, of course, but to give you a sense of what’s possible, the entire putyourlightson.com site tracks only 9 element queries.