naboo closed this issue 1 week ago
It looks like multiple jobs are being executed concurrently, which is not ideal. Is the daemon configured to run at most one process at a time?
Read more about my recommendations for queue runners at https://putyourlightson.com/articles/queue-runners-and-custom-queues-in-craft-cms#queue-runners
The runner is set to only run one (1) process at a time.
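For reference, a single-listener setup along the lines of the linked article might look like this under Supervisor (a sketch only; the program name, paths, and user are placeholders, not taken from this thread):

```ini
[program:craft-queue]
; Run exactly one queue listener process at a time
command=/usr/bin/nice -n 10 /usr/bin/php /path/to/craft queue/listen --verbose
numprocs=1
autostart=true
autorestart=true
user=craft
```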
So the task executes one job at a time (starting with job 1 of 18), then job 1 gets stuck at 100%. After a couple of seconds job 2 starts. A new job only starts once the previous one gets stuck at 100%, and never more than one job runs at the same time.
Some more context: the jobs that get stuck restart after a while (I don't know how long...) and ultimately finish. But then they are run all over again, so the entire batch is processed again: 1 of 18, 2 of 18, 3 of 18...
Super weird :)
Are you trying to cancel jobs via the UI, by any chance? Currently, cancelling a batched job has little or no effect in Craft, see https://github.com/craftcms/cms/discussions/14986 for more details.
I'm not touching the queue at all, I just let the jobs be. Basically, when a job gets stuck at 100% it spawns the next "X of Y" job, and the first, stuck job just sits there. When all jobs have been processed, something happens (whether the job simply restarts, or expires and restarts as "attempt = 2", I don't know) and the job starts processing again from the beginning.
Spawned batch jobs get a higher priority than the initial job, to help ensure batches are completed one at a time. That helps explain why batch 1 is only restarted once all the other batches are complete. However, why batch 1 is not removed from the queue when it reaches 100% makes no sense to me, and neither does the fact that you have multiple reserved queue jobs at the same time.
Can you try increasing the TTR (time to reserve) of generate cache jobs using the queueJobTtr setting, to reduce the probability of jobs timing out?
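A minimal sketch of such an override in config/blitz.php might look like this (the value shown is illustrative, not a recommendation from this thread):

```php
<?php
// config/blitz.php

return [
    // Time to reserve (in seconds) that queue jobs may run
    // before being considered timed out and released back
    // to the queue for another attempt.
    'queueJobTtr' => 3600,
];
```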
I'll try that @bencroker and report back. I'll set it to something super high. I have it at 2400 today, so it's already pretty high, but let me try even higher.
Here's an update, Ben. I set 'queueJobTtr' => 3600 and 'driverJobBatchSize' => 50. The first job still gets stuck at 100% and doesn't clear. Could this be a Craft CMS queue issue rather than a Blitz issue? Nothing else gets stuck, though.
Thanks
I’m not sure, it could just as easily be a Craft issue. Can you update to the latest versions of Craft and Blitz to help rule that out?
I'm on the latest 4.x branch of Craft right now. This is a pretty big site, so the upgrade to 5.x will require some testing to roll out, but it's on the roadmap for sure.
It's also very strange that it goes through "the second time" when it restarts, not just the first.
Everything also works as it should when Blitz clears entries from inside the CP on element re-save, etc.
Do we see the upgrade as the next logical step, or is there anything else we might check first? If so, we'll have to put this on pause for a little bit until we upgrade.
If you’re on the latest v4 (4.18.1), which is still being maintained, then it should be all good. At this stage, I’m tempted to say that the next logical step might be to contact Craft’s official support team, as I can’t see why this is happening and nobody else using Blitz has encountered this issue, as far as I’m aware.
Just adding that I'm seeing this on Craft 5.2.3 and Blitz 5.3.2 - jobs get stuck at 100% on the first page of the batch.
This is when generating cache, and there is only one Blitz job in the queue.
Is there anything else I can provide here?
{
"debug": false,
"hintsEnabled": false,
"cachingEnabled": true,
"refreshCacheEnabled": true,
"refreshMode": 1,
"includedUriPatterns": [
{
"siteId": "",
"uriPattern": ".*"
}
],
"excludedUriPatterns": [],
"cacheStorageType": "putyourlightson\\blitz\\drivers\\storage\\FileStorage",
"cacheStorageSettings": {
"folderPath": "@webroot\/cache\/blitz",
"compressCachedValues": true,
"countCachedFiles": true
},
"cacheStorageTypes": [],
"cacheGeneratorType": "putyourlightson\\blitz\\drivers\\generators\\HttpGenerator",
"cacheGeneratorSettings": {
"concurrency": 3
},
"cacheGeneratorTypes": [],
"customSiteUris": [],
"cachePurgerType": "putyourlightson\\blitz\\drivers\\purgers\\DummyPurger",
"cachePurgerSettings": [],
"cachePurgerTypes": [],
"deployerType": "putyourlightson\\blitz\\drivers\\deployers\\DummyDeployer",
"deployerSettings": [],
"deployerTypes": [],
"ssiEnabled": false,
"ssiTagFormat": "<!--#include virtual=\"{uri}\" -->",
"detectSsiEnabled": true,
"esiEnabled": false,
"queryStringCaching": 1,
"includedQueryStringParams": [],
"excludedQueryStringParams": [],
"apiKey": "",
"generatePagesWithQueryStringParams": true,
"purgeAssetImagesWhenChanged": true,
"refreshCacheAutomaticallyForGlobals": true,
"refreshCacheWhenElementMovedInStructure": true,
"refreshCacheWhenElementSavedUnchanged": false,
"refreshCacheWhenElementSavedNotLive": false,
"cacheNonHtmlResponses": false,
"trackElements": true,
"trackElementQueries": true,
"excludedTrackedElementQueryParams": [],
"cacheDuration": null,
"nonCacheableElementTypes": [],
"sourceIdAttributes": [],
"liveStatuses": [],
"integrations": [
"putyourlightson\\blitz\\drivers\\integrations\\CommerceIntegration",
"putyourlightson\\blitz\\drivers\\integrations\\FeedMeIntegration",
"putyourlightson\\blitz\\drivers\\integrations\\SeomaticIntegration"
],
"defaultCacheControlHeader": "no-store",
"cacheControlHeader": "public, max-age=31536000",
"cacheControlHeaderExpired": "public, max-age=5",
"sendPoweredByHeader": true,
"outputComments": true,
"refreshCacheJobPriority": 1,
"driverJobBatchSize": 100,
"driverJobPriority": 0,
"queueJobTtr": 900,
"maxRetryAttempts": 10,
"maxUriLength": 255,
"mutexTimeout": 1,
"commands": [],
"injectScriptEvent": "DOMContentLoaded",
"injectScriptPosition": 3
}
Hi all,
I had a conversation with Ben about this on the Craft Discord, and it turned out that the Blitz config file contains the following description for the driverJobPriority setting:
"The priority to give driver jobs (the lower the number, the higher the priority). Set to null to inherit the default priority."
However, this is wrong: it needs to be an integer. Ben is updating this in the next release.
Stepping through the code, I saw that with a priority level of 0 there is some logic which then takes this down to -1, and on trying to write that into the database, it throws an error.
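Given that, a safe workaround is to set the priority to a positive integer explicitly in config/blitz.php (a sketch; the value shown is only an example, not a value discussed in this thread):

```php
<?php
// config/blitz.php

return [
    // Must be an integer, not null: the lower the number,
    // the higher the priority of spawned driver jobs.
    'driverJobPriority' => 100,
];
```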
Bug Report
My "Generating Blitz cache" jobs get stuck at 100%. At first it was always job #1 in the queue, but now even random jobs in the middle of the generation get stuck. Here's a screenshot with two stuck jobs (concurrency is set to "1").
It usually happens during deployment. I've tried changing the deployment steps from:
.. to:
.. but it doesn't make any difference. It usually works when I manually click through the steps in the Craft CP. I've experimented with both restarting and not restarting the daemon which runs the jobs on the server, with the same result both ways. The daemon runs with:
/usr/bin/nice -n 10 /usr/bin/php /home/craft/www.domain.com/craft queue/listen --verbose
I don't really see anything in the logs, neither Blitz nor Craft, and no server errors. This is from the database:
Diagnostics Report
Host: www.ploi.io
Craft CMS: "craftcms/cms": "4.8.9", "putyourlightson/craft-blitz": "4.16.2"