knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

Feature Request: Implement waitUntil-like Functionality in Knative/Serving #15345

Open kahirokunn opened 1 week ago

kahirokunn commented 1 week ago

Describe the feature

Context

In Vercel Functions, the waitUntil() method is a highly valuable feature. It allows developers to enqueue asynchronous tasks to be performed during the lifecycle of a request. These tasks do not block the response but should complete before the function shuts down. This is particularly useful for tasks such as logging, sending analytics, or updating a cache, which can be done after the response is sent, ensuring the response is not delayed by these operations.

Reference

Here is a description of the waitUntil method in Vercel Functions:

The waitUntil() method enqueues an asynchronous task to be performed during the lifecycle of the request. It doesn't block the response, but should complete before shutting down the function.

It's used to run anything that can be done after the response is sent, such as logging, sending analytics, or updating a cache, without blocking the response from being sent.

The package is supported in Next.js (including Server Actions), Vercel CLI, and other frameworks, and can be used with the Node.js and Edge runtimes.

https://vercel.com/changelog/waituntil-is-now-available-for-vercel-functions
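
To make the semantics described above concrete, here is a minimal sketch in Python using asyncio. It is not Vercel's implementation and not a Knative API: `RequestContext`, `wait_until`, `drain`, and `send_analytics` are hypothetical names chosen for illustration. The idea is that the handler enqueues work and returns immediately, and the runtime waits for the enqueued work before shutting down.

```python
import asyncio


class RequestContext:
    """Hypothetical per-request context that tracks post-response work."""

    def __init__(self):
        self._pending = set()

    def wait_until(self, coro):
        # Schedule the coroutine without awaiting it, so the response is not delayed.
        task = asyncio.ensure_future(coro)
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)

    async def drain(self):
        # Called by the runtime before shutdown: wait for all enqueued work.
        if self._pending:
            await asyncio.gather(*self._pending, return_exceptions=True)


events = []


async def send_analytics():
    await asyncio.sleep(0.01)  # stand-in for a slow network call
    events.append("analytics sent")


async def handle_request(ctx):
    ctx.wait_until(send_analytics())  # enqueued, not awaited
    events.append("response sent")
    return "200 OK"


async def main():
    ctx = RequestContext()
    response = await handle_request(ctx)  # response is ready here
    await ctx.drain()                     # background work finishes afterwards
    return response


response = asyncio.run(main())
print(response, events)
```

In this sketch the response is produced before the analytics call completes, and `drain()` is the point at which a platform would have to keep the instance alive.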

Feature Request

I would like to request a similar feature in Knative/Serving. The implementation of a waitUntil-like method would enable asynchronous tasks to run during the lifecycle of a request without blocking the response. This functionality is essential for many use cases, such as logging, analytics, and cache updates, which need to be performed without delaying the response.

Use Cases

  1. Logging: Perform logging operations after the response is sent to avoid delaying the response time.
  2. Analytics: Send analytics data asynchronously to ensure it does not impact the response time.
  3. Cache Updates: Update cache entries asynchronously to enhance performance without blocking the response.

Benefits

Knative Eventing Consideration

While it is possible to achieve similar functionality using Knative Eventing, the setup and maintenance can be complex and time-consuming. Introducing a native waitUntil-like method in Knative/Serving would simplify the implementation process for developers who need asynchronous task execution without the overhead of configuring and managing Knative Eventing.

Conclusion

Implementing a waitUntil-like method in Knative/Serving would greatly enhance its capabilities and align it with the functionality provided by Vercel Functions. This feature would enable developers to perform necessary asynchronous tasks without impacting response times, thereby improving performance and user experience.

Thank you for considering this feature request.

dprotaso commented 6 days ago

waitUntil is very specific to the underlying application framework.

I'd imagine you could accomplish something similar by calling some function in your code (that doesn't block the response being sent).

You'd have to configure the revision to stay alive longer than the request, which can be done via scale-down-delay (https://knative.dev/docs/serving/autoscaling/scale-bounds/#scale-down-delay) and potentially by setting a TimeoutSeconds value.
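
As a rough illustration of that configuration, the sketch below builds a Knative Service manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts. The service name, image, and the values "60s" and 300 are placeholders, not recommendations. The annotation key `autoscaling.knative.dev/scale-down-delay` is the per-revision setting described on the linked page, and `timeoutSeconds` is how the TimeoutSeconds field is spelled in the manifest.

```python
import json


def build_service(name, image, scale_down_delay="60s", timeout_seconds=300):
    """Return a Knative Service manifest with a scale-down delay and request timeout."""
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Keep the revision scaled up for this long after traffic drops.
                        "autoscaling.knative.dev/scale-down-delay": scale_down_delay,
                    }
                },
                "spec": {
                    # Maximum time a request is allowed to take.
                    "timeoutSeconds": timeout_seconds,
                    "containers": [{"image": image}],
                },
            }
        },
    }


# Placeholder name and image, for illustration only.
manifest = build_service("hello", "example.com/hello:latest")
print(json.dumps(manifest, indent=2))
```

Note that the delay applies whether or not any background work is actually running.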

kahirokunn commented 6 days ago

@dprotaso

Using scale-down-delay keeps the revision scaled up for the full delay even when no background processing is occurring, which isn't optimal for cost-efficiency. Ideally, scale-down should be deferred only while background tasks are actually being processed.

A practical approach might be to consider a revision 'alive' based on its activity, such as outputting logs. For instance, you could require that logs be emitted at least once every N seconds during background processing.
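
A minimal sketch of that rule, assuming the platform could observe log timestamps. Nothing like this exists in Knative today: `LogActivityTracker` and its methods are hypothetical, and N is represented by `max_silence_seconds`.

```python
class LogActivityTracker:
    """Hypothetical tracker: a revision counts as alive while logs keep arriving."""

    def __init__(self, max_silence_seconds):
        self.max_silence_seconds = max_silence_seconds
        self.last_log_at = None

    def record_log(self, timestamp):
        # Remember when the most recent log line was seen.
        self.last_log_at = timestamp

    def is_alive(self, now):
        # No logs seen yet means no evidence of background activity.
        if self.last_log_at is None:
            return False
        return (now - self.last_log_at) <= self.max_silence_seconds


tracker = LogActivityTracker(max_silence_seconds=10)
tracker.record_log(timestamp=100.0)
alive_soon_after = tracker.is_alive(now=105.0)  # within the 10 s window
alive_much_later = tracker.is_alive(now=111.0)  # silence exceeded the window
```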

Alternatively, the queue-proxy could expose a new API that allows it to be notified of ongoing background tasks. This way, you could make periodic calls (e.g., every 5 seconds) using curl or similar tools to indicate that the revision is still active during the background processing.
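
On the application side, this could look roughly like the sketch below. The queue-proxy does not expose such an API today, so `notify` is a hypothetical callback standing in for whatever call would report activity (for example, an HTTP request to a new endpoint). The helper runs a task and invokes `notify` at a fixed interval until the task finishes.

```python
import threading
import time


def run_with_heartbeat(task, notify, interval_seconds=5.0):
    """Run task() while calling notify() every interval_seconds until it finishes."""
    done = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once `done` is set.
        while not done.wait(interval_seconds):
            notify()

    beater = threading.Thread(target=beat, daemon=True)
    beater.start()
    try:
        return task()
    finally:
        # Stop the heartbeat even if the task raises.
        done.set()
        beater.join()


beats = []
result = run_with_heartbeat(
    task=lambda: time.sleep(0.2) or "cache updated",  # stand-in for background work
    notify=lambda: beats.append(time.monotonic()),    # stand-in for the hypothetical API call
    interval_seconds=0.05,
)
```

With this shape the heartbeat stops as soon as the background work ends, so the revision would only be kept alive while work is in progress.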