Support Customized PHP Runners

deminy commented 1 year ago

Hello! I have created a draft patch along with a few examples to enhance the support for asynchronous extensions/frameworks such as Swoole and OpenSwoole. You can find the patch at https://github.com/deminy/customized-runner-for-bref .

As the patch involves modifications to the bootstrap script or certain internal classes, I have committed it to a personal repository rather than submitting a pull request directly. If you have any questions or suggestions for improvement, please don't hesitate to reach out. I am more than willing to assist in any way I can.

Thanks

mnapoli commented 1 year ago

Hi @deminy, thanks a lot for all these details. I have a few ideas that we can discuss regarding the bootstrap/runner/runtime, but before that I think it's better to discuss about the 4 examples first.

Example 4 shows the clear benefit of concurrent IO operations -> all good.

For example 3, async operations are not "awaited" before they are finished, and the response is returned to Lambda beforehand. When that happens, Lambda freezes the environment. The coroutines will (unless I am mistaken) never run in the background, unless the same Lambda instance is called by a consecutive request. This is not guaranteed (Lambda instances can be terminated at any time), so if the IO operations are important (e.g. writing to a database, etc.), they might not run successfully at all.

I think the Swoole runner should make sure there are no pending coroutines at the end of an invocation. This is what the NodeJS runtime does:

https://docs.aws.amazon.com/lambda/latest/dg/nodejs-handler.html#nodejs-handler-callback

They do have the option though, but it's not enabled by default as I guess this can be very confusing for users:

https://docs.aws.amazon.com/lambda/latest/dg/nodejs-context.html

deminy commented 1 year ago

Hi @mnapoli , thanks for your feedback. I didn't provide enough clarity in example 3, and I can see how some of the concerns you've mentioned are indeed valid. To address these issues, I'll update the README.md file for better clarity and understanding.

1

For example 3, async operations are not "awaited" before they are finished, and the response is returned to Lambda beforehand. When that happens, Lambda freezes the environment.

Yes, example 3 functions similarly to setting the context.callbackWaitsForEmptyEventLoop property to false in a Node.js runtime. However, it's important to note that Lambda doesn't immediately freeze the environment upon completing the processing of an invocation. ChatGPT offers the following explanation:

After a Lambda function is invoked, AWS may decide to keep the runtime container "warm" for some time, which helps reduce the latency of subsequent invocations. However, there is no fixed duration for how long the runtime container will stay warm after an invocation is processed. AWS manages the lifecycle of the container based on factors like resource utilization and demand.

The duration a runtime container stays warm can vary depending on factors like how frequently your Lambda function is being invoked and the overall resource usage of your account. Generally, the container may be kept warm for a few minutes up to several hours.

2

The coroutines will (unless I am mistaken) never run in the background, unless the same Lambda instance is called by a consecutive request.

Not exactly. The coroutine will continue running in the background until it either completes, encounters a Lambda timeout, or the Lambda execution environment is terminated due to a lack of invocations. That's why in the Lambda response I had a field count included to count total number of active coroutines within the execution environment.

By invoking the Lambda function multiple times in quick succession, you'll notice the counter increases by 5 with each invocation. This indicates that coroutines from previous invocations are still actively running within the same environment.

3

This is not guaranteed (Lambda instances can be terminated at any time), so if the IO operations are important (e.g. writing to a database, etc.), they might not run successfully at all.

Yes. we need to be careful when using coroutines to run background tasks within a Lambda environment, mainly because of the following two reasons:

A Lambda environment automatically shuts down after a certain period of inactivity.
A Lambda environment can persist for a maximum of 15 minutes.

Thus, we should use coroutines to run short background tasks only, and we should do it only in the following scenarios:

The background tasks are non-critical, allowing them to run in the background without concern for their completion status.
Alternatively, the Lambda function experiences continuous consecutive invocations. However, this approach can be more complex, as it's necessary to ensure that background tasks aren't interrupted by Lambda's timeout. Additional adjustments may be required for optimal functioning. On the other hand, if there are too many invocations, long-running ECS tasks could be a better choice.

4

I think the Swoole runner should make sure there are no pending coroutines at the end of an invocation.

In general, the answer is yes, although not in every case. My reply in the previous section provides a more detailed explanation.

Please feel free to let me know if you have any questions. Thanks

mnapoli commented 1 year ago

1

ChatGPT is correct, but this is not what I am talking about. A "warm" container has its code completely frozen. Let me clarify:

As soon as PHP returns a response, the entire Linux environment and all processes running in the Lambda instance are frozen. Code does not run anymore.

https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html#runtimes-lifecycle-invoke

In your test, pending coroutines only run again because the next invocation thaws the environment/processes, and code runs again.

However, it's important to note that Lambda doesn't immediately freeze the environment upon completing the processing of an invocation.

It does immediately freeze the environment. What ChatGPT is talking about is that the Lambda instance is kept (frozen) in memory so that Lambda can handle further requests faster.

To verify this, try the following:

edit example 3 so that at the end of the coroutine (after the sleep), a log is written
deploy example 3
call the API with 1 request only
wait
check if the log is written in cloudwatch

The log will not be written, because the coroutine never runs.

Yes. we need to be careful when using coroutines to run background tasks within a Lambda environment, mainly because of the following two reasons:

A Lambda environment automatically shuts down after a certain period of inactivity.

A Lambda environment can persist for a maximum of 15 minutes.

"A Lambda environment can persist for a maximum of 15 minutes." -> this is not related. A single invocation can run for 15 minutes, but in the case of an HTTP API it can only run for 29 seconds. I think this is a misinterpretation of what these 15 minutes are, it's for an "INVOKE" phase. The Lambda environment can be alive for hours if it is invoked by multiple requests.

Also:

A Lambda environment is destroyed when the application is redeployed
A Lambda environment is destroyed when the PHP runtime has a hard crash
A Lambda environment is destroyed when the application hits the memory limit
A Lambda environment can be destroyed whenever AWS decides the EC2 running it is overloaded
etc.

The important conclusion here is: Lambda environments can be destroyed at any time for many reasons.

Thus, we should use coroutines to run short background tasks only, and we should do it only in the following scenarios:

The background tasks are non-critical, allowing them to run in the background without concern for their completion status.

Yes. Do you have examples of such tasks? These need to be tasks that have no guarantee of running/finishing, I'm not sure there are many practical examples?

Alternatively, the Lambda function experiences continuous consecutive invocations. However, this approach can be more complex, as it's necessary to ensure that background tasks aren't interrupted by Lambda's timeout. Additional adjustments may be required for optimal functioning. On the other hand, if there are too many invocations, long-running ECS tasks could be a better choice.

Because of what I mentioned above, there is no guarantee tasks would run at all.

It is only viable for unimportant tasks that have no guarantees of running. I don't see any real use case for this to be honest, but I'm interested if you have some in mind?

deminy commented 1 year ago

Example 3 was to illustrate the potential capabilities within the same execution environment after sending a Lambda response back, regardless of whether the next invocation occurs or not. However, it may not be the best representation of what happens when Lambda freezes an execution environment. To address this, I've created a new example (example 5), where a "Hello World" message is printed one second after the Lambda response is sent back. The messages can be found in CloudWatch logs.

I acknowledge that example 3 might not be particularly realistic for real-world scenarios, and I don't have any specific use cases for this. The Node.js community might offer some insights about asynchronous processing in AWS Lambda.

mnapoli commented 1 year ago

Understood!

To clarify, if I understand correctly, example 5 the same as example 3 with a shorter sleep (1s instead of 60s)? (and a log line)

And to clarify, can you confirm that this is what is happening:

invocation 1: the response is sent quickly (a few milliseconds), but the Hello world! ... messages are not logged to CloudWatch
invocation 2: the log messages from invocation 1 are written to CloudWatch

What I want to confirm is that:

when the response from invocation 1 is sent, coroutines are not running anymore
if the invocation 2 happens 30s after invocation 1, then log messages are written after 30s instead of 1s (sleep(1) becomes irrelevant because the code was frozen for 30s)
if invocation 2 never happens in the same Lambda instance (for any reason mentioned in https://github.com/brefphp/bref/issues/1499#issuecomment-1500092712), then the coroutine never runs and the logs are never written

Is my understanding correct?

deminy commented 1 year ago

Yes, example 5 the same as example 3 with a shorter sleep (1s instead of 60s)? (and a log line). Besides that, example 5 doesn't have environment variable BREF_LOOP_MAX specified.

For example 5, we can submit one invocation only, and the messages from invocation 1 are written to CloudWatch. The response is sent back immediately (less than 1 second) via the Lambda API /response; however, script bootstrap.php finishes execution only after all coroutines finish execution. This means the Lambda function won't be able to handle next invocation until 1 second after, even the response is sent back immediately.

So for the following statements/questions:

when the response from invocation 1 is sent, coroutines are not running anymore False.
if the invocation 2 happens 30s after invocation 1, then log messages are written after 30s instead of 1s (sleep(1) becomes irrelevant because the code was frozen for 30s) I'd like to ignore this one because invocation 2 is irrelevant to example 5.
if invocation 2 never happens in the same Lambda instance, then the coroutine never runs and the logs are never written False.

To help better understand it, I've updated example 5 a little bit to print out more debugging messages, and added a new example (example 6) for comparison. Please note that example 5 and example 6 are exactly the same, except that they always run in different execution environments.

If we invoke the Lambda function of example 5 once every couple seconds, we should see that:

All responses have same value in field rand, which means all invocations are handled by the same execution environment.
The responses are sent back immediately, but script bootstrap.php finishes execution only after all coroutines finish execution.

There is one more thing I'd like to mention. By default, Bref quits the PHP execution environment after each invocation, unless we explicitly set BREF_LOOP_MAX to an integer greater than 1. When using an asynchronous runner like Swoole, it's better to have BREF_LOOP_MAX set to a large value. Let's use example 5 as an example, and let's assume there are continuous invocations to the Lambda function:

If we ignore BREF_LOOP_MAX or set it to 1, a single execution environment can handle 1 invocation each second.
If we set BREF_LOOP_MAX to 10, a single execution environment can handle ~10 invocations each second, theoretically (In reality it won't handle 10 invocations because there are 20 additional REST API calls to Lambda, plus other time-consuming operations). I didn't run any tests about this but that's what I'd expect.

Please feel free to let me know if there are any questions. Thanks

mnapoli commented 1 year ago

Thanks a lot for digging in! That is very interesting 🤔

Just a heads up: I will be on holidays next week, so I will be able to look back into it later!

brefphp / bref

Support Customized PHP Runners #1499

1

2

3

4

1