putyourlightson / craft-blitz

Intelligent static page caching for creating lightning-fast sites with Craft CMS.
https://putyourlightson.com/plugins/blitz
Other
149 stars 36 forks source link

Missing ID triggers 2K+ in 24h #674

Closed jan-dh closed 4 months ago

jan-dh commented 5 months ago

Bug Report

We have a setup with Blitz + Craft 4 + Clouddfront which means we pay for invalidation requests. The last 24h we've had 2,534 invalidation requests for an ID that doesn't seem to exist in the DB and has no elements attached to it: Blitz cache refresh job added : {"cacheIds":{"5585":true},"elements":[]}

We have a DB entry for 5584and 5586 but not for 5585.

Diagnostics Report

Application Info

Installed Plugins

Loaded Modules

Blitz Plugin Settings

{
    "debug": true,
    "hintsEnabled": true,
    "cachingEnabled": true,
    "refreshCacheEnabled": true,
    "refreshMode": 2,
    "includedUriPatterns": [
        {
            "siteId": "",
            "uriPattern": ".*"
        }
    ],
    "cacheStorageType": "putyourlightson\\blitz\\drivers\\storage\\YiiCacheStorage",
    "cacheStorageSettings": [],
    "cacheStorageTypes": [],
    "cacheGeneratorType": "putyourlightson\\blitz\\drivers\\generators\\HttpGenerator",
    "cacheGeneratorSettings": {
        "concurrency": "3"
    },
    "cacheGeneratorTypes": [],
    "customSiteUris": [],
    "cachePurgerType": "putyourlightson\\blitzcloudfront\\CloudFrontPurger",
    "cachePurgerSettings": {
        "apiKey": "********************",
        "apiSecret": "****************************************",
        "distributionId": "E35TL05JGJ1ANJ",
        "condenseUrls": false
    },
    "cachePurgerTypes": [],
    "deployerType": "putyourlightson\\blitz\\drivers\\deployers\\DummyDeployer",
    "deployerSettings": [],
    "deployerTypes": [],
    "ssiEnabled": true,
    "ssiTagFormat": "[an error occurred while processing the directive]",
    "detectSsiEnabled": true,
    "esiEnabled": false,
    "queryStringCaching": 2,
    "includedQueryStringParams": [
        {
            "siteId": "",
            "queryStringParam": ".*"
        }
    ],
    "excludedQueryStringParams": [
        {
            "siteId": "",
            "queryStringParam": "gclid"
        },
        {
            "siteId": "",
            "queryStringParam": "fbclid"
        }
    ],
    "apiKey": "",
    "generatePagesWithQueryStringParams": true,
    "purgeAssetImagesWhenChanged": true,
    "refreshCacheAutomaticallyForGlobals": false,
    "refreshCacheWhenElementMovedInStructure": true,
    "refreshCacheWhenElementSavedUnchanged": false,
    "refreshCacheWhenElementSavedNotLive": false,
    "cacheNonHtmlResponses": false,
    "trackElements": true,
    "trackElementQueries": true,
    "excludedTrackedElementQueryParams": [],
    "cacheElements": true,
    "cacheElementQueries": true,
    "cacheDuration": null,
    "nonCacheableElementTypes": [],
    "sourceIdAttributes": [],
    "liveStatuses": [],
    "integrations": [
        "putyourlightson\\blitz\\drivers\\integrations\\FeedMeIntegration",
        "putyourlightson\\blitz\\drivers\\integrations\\SeomaticIntegration"
    ],
    "defaultCacheControlHeader": "no-cache, no-store, must-revalidate",
    "cacheControlHeader": "public, s-maxage=2592000, max-age=2592000",
    "cacheControlHeaderExpired": "public, s-maxage=5, max-age=0",
    "sendPoweredByHeader": true,
    "outputComments": true,
    "refreshCacheJobPriority": 10,
    "driverJobBatchSize": 100,
    "driverJobPriority": 100,
    "queueJobTtr": 3600,
    "maxRetryAttempts": 2,
    "maxUriLength": 255,
    "mutexTimeout": 30,
    "commands": [],
    "injectScriptEvent": "DOMContentLoaded",
    "injectScriptPosition": 3
}

Recommendations

Site Tracking [#1]

Site Tracking [#2]

Site Tracking [#3]

Site Tracking [#4]

Site Tracking [#5]

Site Tracking [#6]

Site Tracking [#7]

Site Tracking [#8]

bencroker commented 5 months ago

If that cache ID does not exist in the blitz_caches table, then no purge requests should be sent. How are you determining that the refresh is connected to the purges? And are you able to see what URL is being purged via CloudFront?

jan-dh commented 5 months ago

We are seeing this in the queue log and the timestamps match the invalidation requests.


 "[2024-06-18 03:18:48] Blitz cache refresh job added : {"cacheIds":{"5585":true},"elements":[]} : [/app/modules/BlitzRefreshCacheLogger.php:25,/app/vendor/yiisoft/yii2/base/Component.php:633,/app/vendor/yiisoft/yii2-queue/src/Queue.php:194,/app/vendor/craftcms/cms/src/queue/Queue.php:260,/app/vendor/craftcms/cms/src/helpers/Queue.php:56,/app/vendor/putyourlightson/craft-blitz/src/services/RefreshCacheService.php:339,/app/vendor/putyourlightson/craft-blitz/src/services/RefreshCacheService.php:582,/app/vendor/putyourlightson/craft-blitz/src/services/RefreshCacheService.php:472,/app/vendor/putyourlightson/craft-blitz/src/services/CacheRequestService.php:687,/app/vendor/putyourlightson/craft-blitz/src/services/CacheRequestService.php:479,/app/vendor/putyourlightson/craft-blitz/src/Blitz.php:329,/app/vendor/yiisoft/yii2/base/Event.php:312,/app/vendor/yiisoft/yii2/base/Component.php:642,/app/vendor/craftcms/cms/src/base/ApplicationTrait.php:531,/app/vendor/craftcms/cms/src/base/ApplicationTrait.php:1661,/app/vendor/craftcms/cms/src/web/Application.php:110,/app/vendor/yiisoft/yii2/base/BaseObject.php:109,/app/vendor/yiisoft/yii2/base/Application.php:204,/app/vendor/yiisoft/yii2/di/Container.php:419,/app/vendor/yiisoft/yii2/di/Container.php:170,/app/vendor/yiisoft/yii2/BaseYii.php:365,/app/vendor/craftcms/cms/src/Craft.php:59,/app/vendor/craftcms/cms/bootstrap/bootstrap.php:264,/app/vendor/craftcms/cms/bootstrap/web.php:40,/app/web/index.php:16]"```
bencroker commented 5 months ago

Are you able to see what URL is being purged via CloudFront?

jan-dh commented 5 months ago

Going after the logs, don't have direct access myself

jan-dh commented 5 months ago

Send through mail

bencroker commented 5 months ago

Thanks. From what I can tell, the logs show lots of valid URLs being purged, so no particular clues there.

What I think is happening is that the cache ID 5585 exists when the refresh cache job is executed, but no longer exists when you visit the CP, presumably because the cached page has been regenerated and therefore assigned a new ID.

jan-dh commented 5 months ago

The amount of invalidations with that missing cacheId in the DB (5585) from the server logs: 2,534 roughly matches the amount of validations of the homepage I find in the cloudfront logs: 2,537 (which is 3 more, which I manually triggered).

jan-dh commented 5 months ago

I also see some invalidation requests very very close after eachother:

  {
    "eventTime": "2024-06-18T08:58:22Z",
    "items": [
      "/nl-be/"
    ]
  },
  {
    "eventTime": "2024-06-18T08:56:19Z",
    "items": [
      "/nl-be/"
    ]
  },
  {
    "eventTime": "2024-06-18T08:55:21Z",
    "items": [
      "/nl-be/"
    ]
  },
  {
    "eventTime": "2024-06-18T08:55:20Z",
    "items": [
      "/nl-be/"
    ]
  },
  {
    "eventTime": "2024-06-18T08:54:39Z",
    "items": [
      "/nl-be/"
    ]
  },

So 5 invalidation requests for the same path within less than 4 seconds

bencroker commented 5 months ago

The amount of invalidations with that missing cacheId in the DB (5585) from the server logs: 2,534 roughly matches the amount of validations of the homepage I find in the cloudfront logs: 2,537 (which is 3 more, which I manually triggered).

So that looks like expected behaviour, nothing unusual there.

So 5 invalidation requests for the same path within less than 4 seconds

Within 4 minutes, if I’m reading that correctly, which seems reasonable.

I can understand that you’re wondering why you’re getting so many invalidation requests, but there doesn’t appear to be any bug here, as far as I can tell.

jan-dh commented 5 months ago

I'm struggling to see how 2.537k invalidation requests for an entry are "expected behaviour" if the entry itself is saved 3 times. Is there any way to get more insights as to what element could cause all these invalidations?

Correct about those 4 minutes, however again, the entry or linked elements were not resaved during that period of time.

bencroker commented 5 months ago

I'm struggling to see how 2.537k invalidation requests for an entry are "expected behaviour" if the entry itself is saved 3 times.

What I’m saying is that the number of invalidation requests appears to match the number of cache refreshes. As for why the homepage is refreshed so many times requires more digging. Would you able to log each time an element is saved, to help audit all content changes? I feel like that is the first step in understanding why the caches are being invalidated more often than you are expecting. You can adapt this module that I previously created for you, or let me know if you need help with it.

jan-dh commented 5 months ago

If you would have some time to spare to help with it that would be greatly appreciated. Not sure what it is exactly we want to log here.

bencroker commented 5 months ago

Here’s some code that will write to the Blitz log each time an element is added to the refresh data array. It can act as an audit for any time updated content is picked up by Blitz.

use putyourlightson\blitz\Blitz;
use putyourlightson\blitz\events\RefreshElementEvent;
use putyourlightson\blitz\services\RefreshCacheService;
use yii\base\Event;

Event::on(
    RefreshCacheService::class,
    RefreshCacheService::EVENT_AFTER_ADD_ELEMENT,
    function(RefreshElementEvent $event) {
        Blitz::$plugin->log('Element “' . $event->element->title . '” [' . $event->element->id . '] added to refresh data.');
    }
);
jan-dh commented 5 months ago

Okido, added this to the code base & logs are coming in good. Only thing I'm wondering is if this would've been present before, you would just see 2.5K times "Home added to the refresh data"? I'm guessing we're looking for the actual element reason why it was added?

bencroker commented 5 months ago

You can add whatever logging you want to help you get to the bottom of it. Look out for plugins or modules that can change content without human intervention, too (like Feed Me, for example, which appears to be installed).

jan-dh commented 4 months ago

I tried adding detailed loggin to Matrix functionality but that gives performances issues.

bencroker commented 4 months ago

It should only affect saves, not reads, but since matrix blocks cannot be saved independently of entries, I see no benefit to logging them.

bencroker commented 4 months ago

Marking as closed, since I believe you have since found a workaround/solution.