sysid / sse-starlette

BSD 3-Clause "New" or "Revised" License
544 stars 37 forks source link

Fix race condition between ping and stream_response #55

Closed ejlangev closed 1 year ago

ejlangev commented 1 year ago

The bad sequence of events was as follows:

  1. Coroutine for stream_response calls send to close the request and blocks somewhere within on actually doing the sending which allows other coroutines to run. At this point the
  2. Coroutine for ping wakes up from its sleep and also calls send. This raises an exception because the stream_response call had tried to close the request but had not yet been able to return in order to cancel everything else.

This is relatively unlikely to happen but we were observing it regularly on our API which is doing a lot of SSE responses.

sysid commented 1 year ago

Thank you @ejlangev for your PR. It makes a lot of sense.

I am about merging it. However I am not sure what exactly the test_ping_concurrency is demonstrating.

May I ask you to:

  1. either add some assert statements
  2. or to add some comments to demonstrate what is actually being proven by the test

I also merged the latest PR into your branch. Hope this is ok.

Regards, sysid

ejlangev commented 1 year ago

I'll add some comments to the test, basically it would throw a WouldBlock error if it were to exhibit the race conditions and the sleep lengths are designed to have the ping task wake up while the stream_response is sending. Can definitely make it more clear.

ejlangev commented 1 year ago

Comments added explaining the sequencing and how this tests prevents it because it would raise a WouldBlock error if there were concurrent access to send.

sysid commented 1 year ago

@ejlangev , due to a new discussion 77 I tried to understand your use-case better and reproduce race conditions on my side. Unfortunately I could not.

My understanding is that the send coroutine when using FastAPI/Starlette should be safe to be called from different coroutines in parallel. Is this a wrong assumption?

ejlangev commented 1 year ago

My understanding is that the send coroutine when using FastAPI/Starlette should be safe to be called from different coroutines in parallel. Is this a wrong assumption?

I think this is true except when one of the coroutines is calling send with more_body=False to end the response. The problem this is trying to fix is that the _ping coroutine can wake up and call send while the stream_response coroutine is in the middle of sending. This is normally fine because only one coroutine can execute at once but breaks if the stream_response coroutine is closing the connection because then the _ping coroutine will try to send data on a closed connection.

I think a simple performance improvement would be to get rid of safe_send and just directly claim the lock in stream_response but only for the send call that has more_body=False. The _ping coroutine would also need to claim the lock but it would have far less contention.

sysid commented 1 year ago

@ejlangev , I am baffled by your cycle times, chapeau!