brefphp / bref

Serverless PHP on AWS Lambda
https://bref.sh
MIT License

PHP-FPM children die with SIGPIPE since 8.3.10 #1854

Open driskell opened 2 months ago

driskell commented 2 months ago

Description:

See: https://github.com/brefphp/aws-lambda-layers/pull/192

We're seeing lots of random failures in Bref 2.3.4 and 2.3.3 where the PHP-FPM child dies of SIGPIPE at the start of a request, which suggests something triggered a SIGPIPE before the request began.

Error communicating with PHP-FPM to read the HTTP response. Bref will restart PHP-FPM now. Original exception message: hollodotme\FastCGI\Exceptions\ReadFailedException Stream got blocked, or terminated.

Followed immediately by (two examples, showing that it happens at different timings):

[28-Aug-2024 11:03:08] WARNING: [pool default] child 11 exited on signal 13 (SIGPIPE) after 49.066900 seconds from start
[28-Aug-2024 11:03:16] WARNING: [pool default] child 71 exited on signal 13 (SIGPIPE) after 7.462262 seconds from start

Then finally:

{
    "errorType": "Bref\\FpmRuntime\\FastCgi\\FastCgiCommunicationFailed",
    "errorMessage": "",
    "stack": [
        "#0 /var/task/vendor/bref/bref/src/Event/Http/HttpHandler.php(25): Bref\\FpmRuntime\\FpmHandler->handleRequest(Object(Bref\\Event\\Http\\HttpRequestEvent), Object(Bref\\Context\\Context))",
        "#1 /var/task/vendor/bref/bref/src/Runtime/Invoker.php(24): Bref\\Event\\Http\\HttpHandler->handle(Array, Object(Bref\\Context\\Context))",
        "#2 /var/task/vendor/bref/bref/src/Runtime/LambdaRuntime.php(94): Bref\\Runtime\\Invoker->invoke(Object(Bref\\FpmRuntime\\FpmHandler), Array, Object(Bref\\Context\\Context))",
        "#3 /var/task/vendor/bref/bref/src/FpmRuntime/Main.php(46): Bref\\Runtime\\LambdaRuntime->processNextEvent(Object(Bref\\FpmRuntime\\FpmHandler))",
        "#4 /opt/bref/bootstrap.php(17): Bref\\FpmRuntime\\Main::run()",
        "#5 {main}"
    ]
}

Similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280775 - which seems to refer to libxml

How to reproduce:

Install Bref 2.3.4 or 2.3.3, use php-fpm as the runtime, and send lots of requests.

I'm not entirely sure this will reproduce it, as I haven't tried, but the problem definitely goes away when downgrading back to 2.3.1.

mnapoli commented 2 months ago

Thanks for the report! Any idea what makes you think this is related to libxml specifically? I don't see how xml could play a role in the FPM connection.

Could it be a regression in the new PHP version?

driskell commented 2 months ago

@mnapoli It could be. I only mention libxml because of https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280775.

Usually php-fpm would disable SIGPIPE, but I've found sources showing that libraries (such as curl or OpenSSL) have re-enabled it in the past, after which the php-fpm signal handler gets confused and kills things. So something could be re-enabling the signal, and based on the above Bugzilla report I think libxml has been involved before.
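One way to check whether SIGPIPE is actually ignored inside a running process is to read the kernel's view of the signal dispositions. The following is a minimal sketch, assuming Linux and 64-bit PHP: it parses the `SigIgn` bitmask from `/proc/self/status`, where bit 12 corresponds to signal 13 (SIGPIPE). `isSignalIgnored` is a hypothetical helper written for illustration, not part of Bref or PHP-FPM.

```php
<?php
// Hypothetical diagnostic helper (an assumption for illustration, not Bref code).
// Returns true/false if the signal is ignored according to the "SigIgn" line of
// a Linux /proc/<pid>/status dump, or null if that cannot be determined.
function isSignalIgnored(string $procStatus, int $signo): ?bool
{
    if ($signo < 1 || $signo > 32) {
        return null; // only the standard signals are handled in this sketch
    }
    if (!preg_match('/^SigIgn:\s*([0-9a-f]+)/mi', $procStatus, $m)) {
        return null; // field not present (for example, not running on Linux)
    }
    // The mask is hexadecimal; the last 8 digits hold signals 1..32.
    $low = hexdec(substr($m[1], -8));
    return (($low >> ($signo - 1)) & 1) === 1;
}

// Usage: log the disposition of signal 13 (SIGPIPE) for the current process.
$status = @file_get_contents('/proc/self/status');
if ($status !== false) {
    $ignored = isSignalIgnored($status, 13);
    error_log('SIGPIPE ignored: ' . var_export($ignored, true));
}
```

Logging this at the start and end of a request could show whether something changed the disposition along the way, although it would not identify which library did so.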

It could be worth my building PHP 8.3.10 separately against an older libxml to test, but I'm not sure where to start right now. We've stuck with 2.3.1 on 8.3.9 for now until I can find time to play around with it.

mnapoli commented 2 months ago

Interesting! Adding @GrahamCampbell to this

GrahamCampbell commented 2 months ago

Odd, though I don't think this is libxml related, because Bref's PHP 8.3.9 build is using libxml 2.12.x, and that linked issue says that 2.11.x is good, but 2.12.x is not.

GrahamCampbell commented 2 months ago

We could try downgrading to 2.12.x, but we may need to re-evaluate if a security issue is later discovered in 2.12.x and not patched.

GrahamCampbell commented 2 months ago

Actually, no. That issue is specifically about PHP 8.1, and libxml 2.13 does not work on that version. We only use 2.13 on PHP 8.2+.

mnapoli commented 2 months ago

So it could be the PHP version. We could try waiting for https://github.com/brefphp/aws-lambda-layers/pull/199 and see if it fixes it?

driskell commented 2 months ago

I can try and test with https://github.com/brefphp/aws-lambda-layers/pull/199 once it is released.

mnapoli commented 2 months ago

@driskell can you test with https://github.com/brefphp/bref/releases/tag/2.3.5 ?

gmorel commented 2 months ago

Maybe this could help you. I had the same issue with https://github.com/reactphp/http and AWS Lambda:

RequestId: 5ad1b639-980b-4f3c-baf6-7ac7d40971ee Error: Runtime exited with error: exit status 141
Runtime.ExitError

It worked perfectly locally in my Docker container but crashed on the AWS infrastructure (Lambda as a Docker image).

I managed to fix it with:

        // Try to handle SIGPIPE and other potential signals gracefully
        \pcntl_async_signals(true);
        \pcntl_signal(SIGPIPE, function() {
            echo 'Caught SIGPIPE, closing socket.' . PHP_EOL; // Remove this line in prod
        });

With \pcntl_async_signals(true), you enable asynchronous signal handling for your program, allowing it to catch and handle signals. With \pcntl_signal(SIGPIPE, ...), you specify a handler function for SIGPIPE.

I don't know what you think, and I don't know the impact. But since the pcntl extension is included by default in the Bref layer, maybe we could add this directly to Bref.
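A possible variant of the snippet above, assuming the only goal is to keep the process from being terminated, is to ignore SIGPIPE outright with `SIG_IGN` rather than installing a handler. With the signal ignored, a write to a closed pipe or socket fails with an error that the calling code can handle instead of killing the process. `ignoreSigpipe` is a hypothetical name used for illustration, and this sketch has not been tested against the FPM failure reported in this issue.

```php
<?php
// Hypothetical helper (illustration only, not part of Bref).
// Sets the SIGPIPE disposition to "ignore" for the current process.
// Returns false when the pcntl extension is unavailable or the call fails.
function ignoreSigpipe(): bool
{
    if (!function_exists('pcntl_signal')) {
        return false; // pcntl not loaded in this SAPI
    }
    return pcntl_signal(SIGPIPE, SIG_IGN);
}

// Call once during bootstrap, before any sockets are opened.
ignoreSigpipe();
```

Unlike the closure-based handler, this needs no \pcntl_async_signals(true) call, because no PHP callback has to be dispatched.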

driskell commented 2 months ago

@mnapoli Interesting: no issues with that one. All working.

driskell commented 2 months ago

@mnapoli I stand corrected: I am still seeing some issues with SIGPIPE. It's just more sporadic now, and I don't know whether that is related to 2.3.5 or not. I've had to roll back to 2.3.1, and it works smoothly again.