Better fix for scheduling logic for transport close() and abort()

sanic-org / sanic

Accelerate your web app development | Build fast. Run fast.

https://sanic.dev

MIT License

18.11k stars 1.55k forks source link

Better fix for scheduling logic for transport close() and abort() #2973

Closed ashleysommer closed 4 months ago

ashleysommer commented 4 months ago

Fixes #2921 #2890 Replaces #2966

Rather than simply calling abort() at a set interval after calling close(), we implement a mechanism to detect if any data is still in the transport write buffer after close is called, and schedules an async close operation to wait until the transport has finished, or finally calls abort after a reasonable timeout.

For more information, see my recent in-depth comments on https://github.com/sanic-org/sanic/issues/2921

This does not modify any existing Protocol transport logic (apart from the ill-placed abort()), does not introduce any breaking changes, and all tests pass without modification.

ashleysommer commented 4 months ago

I think there needs to be additional communication (in the docs?) about ideally using a Sanic Streaming Response to chunk the response data when a response payload is larger than approx 16kb, rather than writing the whole payload in a single HTTP Response body. That would prevent the UVLoop UVStreamTransport from having situations where it cannot send the whole buffer in one go, so preventing this kind of error occurring that we were seeing here.

Though I know often its not possible to know the size of the response, eg, a dict passed to json() response could easily be a couple hundred KB without the user realizing it.

ahopkins commented 4 months ago

Beautiful. Thanks for taking this on @ashleysommer. I wonder if we should add a configurable option to automatically kick over to streaming responses. The benefit to the dev is obviously not having to worry about sizing.

I'll give this a look later tonight and get it merged. We can backport to the LTS also.

robd003 commented 4 months ago

Thank you so much for taking the time to debug this @ashleysommer You're a rock star!

ahopkins commented 4 months ago

I followed your logic and code thru. It is a great analysis. Thanks for taking the time to jump on this one @ashleysommer. :muscle:

ashleysommer commented 4 months ago

I agree with the addition of the GRACEFUL_TCP_CLOSE_TIMEOUT. I was going to add that change to this PR, but I didn't want this diff to grow too large and affect other parts of sanic (I know adding new tuneables in the settings can be hard to get right). Thanks for adding it.

One further thing I would suggest, on startup during sanity-checks we could verify that GRACEFUL_TCP_CLOSE_TIMEOUT is less than GRACEFUL_SHUTDOWN_TIMEOUT, and issue a warning if it is not. Or force:

GRACEFUL_TCP_CLOSE_TIMEOUT=min(GRACEFUL_SHUTDOWN_TIMEOUT, GRACEFUL_TCP_CLOSE_TIMEOUT)

ahopkins commented 4 months ago

Forcing that would be awkward because it wouldn't behave as expected. Warning or error makes it clear. But I doubt it will be a value anyone will have any need to configure, so it's very unlikely we'd need this. Startup checks are however a good idea.