brefphp / bref

Serverless PHP on AWS Lambda
https://bref.sh
MIT License
3.05k stars 364 forks source link

Performance issues on Bref PHP 8.1 and Later #1528

Open yh1224 opened 1 year ago

yh1224 commented 1 year ago

I am experiencing a significant performance issue with the AWS SDK for PHP when running on Bref PHP 8.1 and subsequent versions. Specifically, I have observed that accessing AWS with the SDK is approximately three times slower as compared to Bref PHP 8.0.

To further analyze this, I conducted the same processes on an EC2 instance and found no noticeable performance difference between them. The version of the AWS SDK for PHP I am using is 3.269.9.

Moreover, I conducted a test measuring the time it took to execute 100 queries on DynamoDB. Here are the results:

Environment Specification Elapsed
EC2 PHP 8.0.28 c5.large 7.5 sec
EC2 PHP 8.1.18 c5.large 7.5 sec
EC2 PHP 8.2.5 c5.large 7.5 sec
Bref PHP 8.0.28 (php-80:44) Memory: 1024 MB 4 sec
Bref PHP 8.1.18 (php-81:43) Memory: 1024 MB 11 sec
Bref PHP 8.2.5 (php-82:33) Memory: 1024 MB 11 sec
mnapoli commented 1 year ago

Thanks for the report and the details!

The one thing that jumps to mind would be OpenSSL 3, that was applied to PHP 8.1 and 8.2: https://github.com/brefphp/aws-lambda-layers/pull/74

A quick search yield these issues for example:

Would you be able to run the same test with Bref 2.0.2 (here are the layer versions: https://runtimes.bref.sh/?region=us-east-1&version=2.0.2)? (this is before OpenSSL 3)

cc @GrahamCampbell if that rings a bell to you?

yh1224 commented 1 year ago

@mnapoli

I tried using the Layer compatible with Bref 2.0.2 and there was no degradation in performance as shown below.

Environment Layer Elapsed
Bref 2.0.2 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-80:42 4 sec
Bref 2.0.2 / PHP 8.1.17 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-81:41 4 sec
Bref 2.0.2 / PHP 8.2.4 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-82:31 4 sec
Bref 2.0.4 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-80:44 4 sec
Bref 2.0.4 / PHP 8.1.18 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-81:43 11 sec
Bref 2.0.4 / PHP 8.2.5 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-82:33 11 sec

I would like to use this older version for now. Thank you very much.

GrahamCampbell commented 1 year ago

Interesting, I have not noticed any excess latency on my workloads (though my workloads are super spiky, so it's hard to notice), and I have some that send 1000s of requests to dynamo using the symfony curl http client with a parallelism of 64 (a curl parameter). I am reluctant to suggest that we rollback to openssl 1.1 because of the fact that it will be EOL very soon, leaving us without any security patches. I wonder if we should do a build with openssl 3.1, and see if it also have the same performance issues for @yh1224.

GrahamCampbell commented 1 year ago

Moreover, I conducted a test measuring the time it took to execute 100 queries on DynamoDB.

Are you re-using the client? Can you share a minimal re-producer please?

GrahamCampbell commented 1 year ago

Is the same issue present on arm64 - that is, is this just an issue for x86_64?

yh1224 commented 1 year ago

@GrahamCampbell

Are you re-using the client? Can you share a minimal re-producer please?

I create the DynamoDbClient each time.

Is the same issue present on arm64 - that is, is this just an issue for x86_64?

Same with arm64. I feel that it is a little more slower than x86_64.

Environment Layer Elapsed
arm64 / Bref 2.0.2 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-80:43 4s
arm64 / Bref 2.0.2 / PHP 8.1.17 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-81:22 4s
arm64 / Bref 2.0.2 / PHP 8.2.4 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-82:19 4s
arm64 / Bref 2.0.4 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-80:45 4s
arm64 / Bref 2.0.4 / PHP 8.1.18 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-81:24 13s
arm64 / Bref 2.0.4 / PHP 8.2.5 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-82:21 13s

Here is the CDK project to check. https://github.com/yh1224/aws-cdk-lambda-bref-test

The source is below. https://github.com/yh1224/aws-cdk-lambda-bref-test/tree/main/src/TestFunc

GrahamCampbell commented 1 year ago

I create the DynamoDbClient each time.

This is probably the issue. You should avoid doing that. I think what's happening here is that the TLS handshake has to happen over and over. That would also explain why I haven't noticed this. Not reusing connections is very expensive.

GrahamCampbell commented 1 year ago

https://github.com/openssl/openssl/issues/17064

I think this issue is unrelated. There is only one thread here.

mnapoli commented 1 year ago

This is probably the issue. You should avoid doing that. I think what's happening here is that the TLS handshake has to happen over and over. That would also explain why I haven't noticed this. Not reusing connections is very expensive.

I would argue that:

Fact is that OpenSSL 3 has an issue. Reusing connections is just a workaround, and it's not good for most PHP apps.

GrahamCampbell commented 1 year ago

I strongly disagree that it is a workaround. Setting up a new TLS connection is an expensive thing to do, and application developers should always reuse connections, in the same way that it is standard practice to reuse database connections.

GrahamCampbell commented 1 year ago

Most PHP apps that I have seen do re-use connections. Anyone using the Laravel Guzzle wrapper will have connections re-used. Similarly, anyone using any Laravel native AWS implementations like SES or SQS will also have connections re-used.

GrahamCampbell commented 1 year ago

Not re-using connections and not cleaning up can lead to serious issues like exceeding the max open sockets limit on Lambda (I have done this before :trollface:).

yh1224 commented 1 year ago

@GrahamCampbell

I understand that this is not an issue with PHP/Bref, but is due to OpenSSL 3.

I'm aware that reusing the client is inefficient, but due to the legacy code of our current production being migrated to Lambda, it seems we can't address it immediately. As the difference in execution time directly affects the cost of using Lambda, I still want to hold out with the old version at this time.

If the performance degradation with OpenSSL is due to some "problem", I would like to wait for its resolution.

Thank you very much.

mnapoli commented 1 year ago

@GrahamCampbell maybe we are not talking about the same thing. Maybe you are talking about reusing the client in the same request? I am talking about reusing the client between Lambda invocations.

Most of the HTTP requests served with Bref are with FPM. That means each request has to do the SSL connection again (because the client cannot be kept alive). This is why I'm saying OpenSSL 3 introduces a performance regression that impacts a majority of Bref users.

GrahamCampbell commented 1 year ago

Maybe you are talking about reusing the client in the same request?

Yes.

I am talking about reusing the client between Lambda invocations.

I see. I also do that, because I don't use php-fpm, but I appreciate many people don't.

This is why I'm saying OpenSSL 3 introduces a performance regression that impacts a majority of Bref users.

That is not so clear. We'd have to do some more investigation, maybe profiling with blackfire.

M1ke commented 6 months ago

In case it's useful, here's a Linux (Ubuntu) serverful example of this openssl 1.1 vs 3 issue when it comes to performance of PHP scripts making web requests: https://github.com/php/php-src/issues/10753#issuecomment-1852335196

mnapoli commented 6 months ago

@M1ke thanks for sharing. We have hopes that OpenSSL 3.2 might fix things, but that needs to be tested.

atrope commented 5 months ago

Hey @mnapoli I've just found out that the performance issues i told you in the other issue is because we are making https calls.

When i can make the https in plain http the performance is the same(or better) as before. Unfortunately there are some calls that we can't perform in http.

How can we test the new OpenSSL3.2? Maybe it will fix those issues. I was seeing on average 400ms with https calls and with plain http it is now 120ms on average.

With the fix i have hopes to make it under 100ms as it was before.

Thanks

GrahamCampbell commented 5 months ago

We are waiting for the next release of postgres 15.x. The current release does not support OpenSSL 3.2.

atrope commented 5 months ago

@GrahamCampbell is it possible to generate a layer without postgres if we don't use it? Maybe we could have an specific version without postgres because 4x the pricing for using https is a lot.

atrope commented 5 months ago

Or do you know which version of php-82-fpm has openssl 3.2? That way i can test in a stressfull env anc check if this is the actual problerm

GrahamCampbell commented 5 months ago

You can compile a layer yourself. Otherwise, it'll probably be around a month from now when OpenSSL 3.2.1 gets put into Bref, in combination with Postgres 15.6. It'll likely be PHP 8.1, 8.2 and 8.3. PHP 8.0 will remain on OpenSSL 1.1.1.

atrope commented 5 months ago

Okay, i've now tested in 1M requests the version 31 of the layer which has OpenSSL1.1.1 - arn:aws:lambda:us-east-1:534081306603:layer:php-82-fpm:31

Only by changing the newest one (58) to the old one I see the performance increase of 50%. That said I really think we should not keep the OpenSSL3 in the production version of the Layer.

As we all know a function running on average in 140ms is 100% more expensive than one running at 70ms.

Screenshot 2024-01-29 at 16 54 56

GrahamCampbell commented 5 months ago

The problem with OpenSSL 1.1.1 is that it has unpatched security vulnerabilities.

mnapoli commented 5 months ago

Hey @atrope! I've seen your other message, thanks for bumping this issue. Yeah, that is tempting to go back to OpenSSL 1.1.1 😞

it'll probably be around a month from now when OpenSSL 3.2.1 gets put into Bref, in combination with Postgres 15.6

Sounds like the best course of action right now unfortunately

GrahamCampbell commented 5 months ago

And that time has arrived. 🎉

  1. https://github.com/brefphp/aws-lambda-layers/pull/161
  2. https://github.com/brefphp/aws-lambda-layers/pull/162

Probably we can wait till next week to release this stuff (assuming those PRs are accepted) so that we can also include cURL 8.6.1 and the next round of PHP patch releases, both are due to land next week.

mnapoli commented 4 months ago

It's out in Bref 2.1.15 https://github.com/brefphp/bref/releases/tag/2.1.15 🎉

I'd love to hear if things are working better with this new version (which uses OpenSSL 3.2).

atrope commented 4 months ago

Hi!

Just tested it out the latest version.

We had an average of 75ms with layer 31 and with layer 61(php-82-fpm) it increased to ~85ms. Tested in the last 15minutes period.

Will leave it for the next days and check how it will perform.

GrahamCampbell commented 4 months ago

What about the performance of layer 60 vs 61? Also, how does P99.9 compare - for me this was the big improvement, assuming you are reusing connections between invokes.