jonathan-s / django-sockpuppet

Build reactive applications with the django tooling you already know and love.
https://github.com/jonathan-s/django-sockpuppet
MIT License
450 stars 22 forks source link

Make reflex rendering thread safe #99

Open blubber opened 3 years ago

blubber commented 3 years ago

The monkey patch used to render the original view's context was not thread safe. This commit fixes the issue by patching all the get_context_data methods before accepting requests, instead of while rendering the reflex' response.

blubber commented 3 years ago

@jonathan-s this is the solution that I suggested you over email.

While working on this I noticed that the view and its context are rendered twice during the page_render call. This is quite inefficient, because it could potentially result in many duplicate database queries in the view. The page rendering logic should be streamlined further probably.

jonathan-s commented 3 years ago

Hi @blubber, Thanks for taking a look at this. However, with the current code it seems like you end up in a recursive call. You can see it happen in the integration tests. I've reproduced it locally as well. I sent you another email, if you are replying to that, we can keep the discussion in this PR.

jonathan-s commented 3 years ago

While working on this I noticed that the view and its context are rendered twice during the page_render call.

There are some cases where this is necessary. If you're using a ListView you need to execute the get request as some instance variables are set in the get part.

However if you keep all your queries in get_context_data the queries should only be triggered once as they are cached in the reflex class.

blubber commented 3 years ago

Hi,

Threading issues are always hard to reproduce or even prove. So I did some digging into Channel's code, and found this: https://github.com/django/channels/blob/ece488b31b4e20a55e52948f21622da3e38223cb/channels/consumer.py#L104

It seems that your consumer ultimately inherits from a consumer that has _sync set to True, and it's dispatch method is not awaitable. This results in the code that's doing the dispatch running the handler in async_to_sync in a thread pool (as the comment also says). So regardless of being able to produce an actual problem, the method swapping is not thread safe and the code is running in a thread. If you deploy the current version in any kind of production load it will probably break and possible leak information.

On Sat, Feb 27, 2021 at 3:53 PM Jonathan Sundqvist notifications@github.com wrote:

While working on this I noticed that the view and its context are rendered twice during the page_render call.

There are some cases where this is necessary. If you're using a ListView you need to execute the get request as some instance variables are set in the get part.

However if you keep all your queries in get_context_data the queries should only be triggered once as they are cached in the reflex class.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jonathan-s/django-sockpuppet/pull/99#issuecomment-787084120, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABJ3ZUQC2GVH3EHDOPS72DTBEBPLANCNFSM4YJ6X77Q .

--

blubber commented 3 years ago

Unfortunately I haven't had time to look into this PR. However, I did take some time today to get to the bottom of the blocking behavior. I was a little weird to me that views or consumers where actually blocking, because the websocket consumer's dispatch method is wrapper in asgiref's sync_to_async which should execute it in a thread pool. This was bothering me, since I really couldn't explain the blocking behavior, but I now found out where my brain errored.

The sync_to_async class has an argument thread_sensitive which defaults to True, which does exactly the opposite of what I expect. If True it in facct forces execution in the main thread. Setting it to False results in the expected, non-blocking, behavior.

I believe that if you wrap a view class used with sockpuppet in sync_to_async(thread_sensitive=True) that it in fact will break on the get_context_data swapping. But I haven't had time to try this out yet.

blubber commented 3 years ago

Ok, so I finally managed to pin this down. It seems that asgiref 3.3 has changed sync_to_async's behavior to run code in a singel thread (the main thread) by default, ie the thread_sensitive argument defaults to True now. This completely changes Channel's behavior as it's now running sync views and other sync code serially by default. This includes the JsonWebSocketCustomer's dispatch method, which is decorated with a derivative of sync_to_async.

If you remove asgiref>=3.3 and install asgiref<3.3 instead, the behavior reverts to its previous default of concurrency using a thread pool (for sync parts like the consumer). Again, I'm pretty sure you can break the get_context_data replacement if you're running asgiref < 3.3.

jonathan-s commented 3 years ago

Thank you @blubber for this in-depth investigation! Much appreciated. So in other words this would be an issue if you were running channels 2.4, and not be an issue if you were running channels >=3.0 (as channels installs asgiref >3.3 by default there).

blubber commented 3 years ago

Well, if you force asgiref <3.3, which I did locally, than you get the old behavior, which also implies a threading issue. Also, it's unclear at this point how Channels are going to manage this change. They might decide to force thread_sensitive=False, to explicitly re-enable the old behavior.

jonathan-s commented 3 years ago

Relates to https://github.com/django/channels/pull/1582 and https://github.com/django/channels/issues/1587 So it's worth keeping these in mind if anything will / should change here.

blubber commented 3 years ago

Yeah, I was actually kind of waiting to see where they are going with this. That PR you've linked doesn't really solve the entire problem and seems to be stranded. Probably a good idea to wait and see how they handle this.