putyourlightson / craft-blitz

Intelligent static page caching for creating lightning-fast sites with Craft CMS.
https://putyourlightson.com/plugins/blitz
Other
147 stars 35 forks source link

"Refreshing Blitz Cache" jobs hang a lot #635

Closed boboldehampsink closed 3 months ago

boboldehampsink commented 3 months ago

Bug Report

There's a lot of "Refreshing Blitz Cache" jobs either not starting at all, and saying "Reserved" @ 0% - or never finishing after reaching 100%.

I'm using a queue worker to process them async via the CLI.

Plugin Version

4.13

Craft CMS Version

4.8

PHP Version

8.3

bencroker commented 3 months ago

Is this something that started happening recently? Can you provide any more details to help me start troubleshooting?

boboldehampsink commented 3 months ago

I feel like it started happening after the last updates. I was on 4.12 before and didn't notice it. Here is a screenshot of my current queue:

Screenshot 2024-03-15 at 19 58 35
bencroker commented 3 months ago

Why are there so many processes running at once? One refresh cache job should be handled by a single PHP process. I suggest you read the article on queue runners and compare this to your current queue runner setup. One possibility is that these queue jobs are all waiting on a mutex to free up, causing a race condition. https://putyourlightson.com/articles/queue-runners-and-custom-queues-in-craft-cms

boboldehampsink commented 3 months ago

I'm not sure. I do use multiple workers (I'm on Heroku) temporarily, so that if one stalls, others will continue. But I only use 3 workers - and there's more open processes. The workers only run craft queue/listen. They move on to the next job tho, at 100%, which is also weird? I think its because they crash, and the worker gets restarted and that moves it to the next job. I do have error reporting via Sentry also, which notifies me in case of a Mutex exception. But I didn't get any.

boboldehampsink commented 3 months ago

FWIW, just got this error from Sentry:

yii\base\ErrorException: Could not send result to parent; be sure to shutdown the child before ending the parent
  File "/vendor/amphp/parallel/lib/Context/Internal/process-runner.php", line 120, in yii\base\ErrorHandler::handleError
    \trigger_error("Could not send result to parent; be sure to shutdown the child before ending the parent", E_USER_ERROR);
  File "/vendor/craftcms/cms/src/web/ErrorHandler.php", line 79, in craft\web\ErrorHandler::handleError
    return parent::handleError($code, $message, $file, $line);
  File "[internal]", line 0, in trigger_error
  File "/vendor/amphp/parallel/lib/Context/Internal/process-runner.php", line 120, in Amp\Parallel\Context\Internal\{closure}
    \trigger_error("Could not send result to parent; be sure to shutdown the child before ending the parent", E_USER_ERROR);
  File "/vendor/amphp/parallel/lib/Context/Internal/process-runner.php", line 123
    })();

amphp/parallel is required and used by Blitz, right?

The latest 2.x releases of that package have descriptions like "Fixed hang during shutdown"(https://github.com/amphp/parallel/releases). Maybe its in this package?

Is there a reason not to update to 2.x btw? You're on 1.4. I'm assuming it is because of the PHP 8.1 requirement. But since last november, 8.0 is fully end-of-life, so that could be dropped.

bencroker commented 3 months ago

Your screenshot shows 8 reserved queue jobs, which would concern me. Keep in mind, too, that both the HTTP and local generators create their own child processes with a default concurrency of 3.

Yes, amphp/parallel 1.x is required. Blitz 4 works with Craft 4.0 and above, which requires PHP 8.0.2 and above, meaning that it’s not trivial to update AMPHP due to breaking changes. Blitz 5.0 requires PHP 8.2 and amphp/parallel 2.x.

With the debug setting enabled, these errors should be logged to blitz-****.log.

While AMPHP is a great library, I will concede that it has been challenging to work with due to the foreign concepts and light documentation.

boboldehampsink commented 3 months ago

I would like to try Blitz 5, is it compatible with Craft 4?

bencroker commented 3 months ago

No, Blitz 5 requires Craft 5. Also, the AMPHP error you‘re seeing is a cache generation error, nothing to do with refresh cache jobs. I’m working on batched driver jobs, which will include generate cache jobs, in https://github.com/putyourlightson/craft-blitz/tree/feature/batch-driver-jobs, which you‘re welcome to test by running ddev composer require "putyourlightson/craft-blitz:dev-feature/batch-driver-jobs as 4.14.0".

boboldehampsink commented 3 months ago

Will try that, thanks. Was wondering about Blitz 5 being compatible for Craft 4, as P&T have lots of their plugins updated to support both versions simultaneously - but hey, if there's anything Craft 5 specific I get it ;-)

bencroker commented 3 months ago

It’s mainly got to do with the PHP requirement version bump from 8.0.2 to 8.2, and the dependencies that come along with that. Craft 5 also has some new concepts such as matrix-in-matrix, that are not backwards compatible with Craft 4.

Like I said above though, this issue has nothing to do with AMPHP. It would be worth looking into why Blitz is tracking so many unique product variant queries (you can find them in the Blitz Diagnostics utility). If you could bring that number from over two thousand down to double digits, than that would likely make a big difference.

bencroker commented 3 months ago

FYI I’ve pulled all changes in to the develop branch, so you can test by running ddev composer require "putyourlightson/craft-blitz:dev-develop as 4.14.0".

boboldehampsink commented 3 months ago

@bencroker when running your latest develop, weird thing are happening to my site. After generating cache, my assets get served with a port number on the host (which is wrong as well, as Heroku uses SSL):

Screenshot 2024-03-17 at 19 23 09 Screenshot 2024-03-17 at 19 23 14

Is there anything that can cause this?

boboldehampsink commented 3 months ago

fwiw, im using the craft vite plugin, but that hasn't been updated this weekend

boboldehampsink commented 3 months ago

I'm also using the local generator - guess it has something to do with that? Only happens after generating cache so

boboldehampsink commented 3 months ago

640 might fix that, didn't test it yet tho - but found that making sense

boboldehampsink commented 3 months ago

Tested it and that fixed it. It's a different issue tho. Regarding the original issue - I flushed the blitz database, maybe it got clogged by testing. So far so good

bencroker commented 3 months ago

Thanks, I'll take a look and merge later today.

bencroker commented 3 months ago

Merged, thank you!

Regarding the original issue, try to ensure at most one queue job is executed at a time, and hopefully the next release will address any remaining issues for you. Thanks again for the contributions!