django / channels

Developer-friendly asynchrony for Django
https://channels.readthedocs.io
BSD 3-Clause "New" or "Revised" License
6.01k stars 793 forks source link

Document Django's async ORM API #2090

Open bigfootjon opened 2 months ago

bigfootjon commented 2 months ago

The async ORM interface was added in 4.1! Since channels supports >=4.2 we can document that new API.

Consider this a first pass, I just updated the spots I noticed that looked like they needed tweaking, happy to take feedback on wordsmithing or different ideas on how to document this.

Closes: #1999

bigfootjon commented 2 months ago

Maybe I should slightly tweak this PR then: introduce an async “clean_up_db_connections” decorator and then document that

carltongibson commented 2 months ago

@bigfootjon I'm not 100% sure what we should say here.

In theory you have one connection per thread, and in the pure async case you have just one thread.

So, why not hold onto the connection? (Likely you'd need a way to recover from connection errors, and such but as a first pass...)

Trouble is, we're still relying on the sync core underneath the async ORM API, so we're still going to be wanting to clean up connections. Without building some test projects I can't say exactly when and what is going to be needed there.

I think that's what makes it interesting 😜

bigfootjon commented 2 months ago

Makes sense. I’ll noodle on this for a bit and see what I can figure out (I only really have time to work on Django in my spare time, and I’ll be a bit limited on spare time for the next few weeks)

carltongibson commented 2 months ago

@bigfootjon Slowly, slowly is the Django way. No stress! 😉

bigfootjon commented 2 months ago

I've done a bit of tinkering and a bit of thinking and I think I've come up with a solution I'm happy with (though I'm eager to hear competing opinions).

Theory

Conceptually, the purpose of cleaning up connections is a necessary piece of the machinery of Django due to how connection health works within Django:

At the beginning of each request, Django closes the connection if it has reached its maximum age. If your database terminates idle connections after some time, you should set CONN_MAX_AGE to a lower value, so that Django doesn’t attempt to use a connection that has been terminated by the database server. (This problem may only affect very low traffic sites.)

ref

Based on my understanding, there are no "timers" within Django to periodically check connection health. Connection health is ensured at the start (and end) of a connection.

So for channels, I think we want to clear connections at the beginning of a request (connect), whenever a receive event happens before we dispatch to the consumer, and when the consumer finishes (disconnect). This provides the experience that users expect: if they use the ORM (and expect a working or fresh database connection) in a consumer it will either be at connect time or in response to new input from the client (receive) or the connection will be closing and they won't care.

The other time they would need a database connection would be at send time. Some event will be triggered on the server and will wake up some logic (channel layer or otherwise) that will result in a send. In this case, we should not be trying to auto-close connections because the user might be making a series of queries and send in sequence in response to some server-side event.

As a convoluted example, there might be changes to a server-side "file system" stored in Django Models where each Folder stores a ForeignKey ref to its parent folder. In response to some server-side event, the client would need updates on all folders in a path. As an implementation detail, the server might implement this as (pseudo-python):

folder = ...
while folder is not None:
  folder = await Folder.objects.aget(pk=folder.parent_id)
  await self.asend(...)

In such a configuration closing the connection after every aget/asend pair would kill performance and not be what the user expected. They look at this block of code and think "this is all happening within the same 'request'".

To summarize, connect and receive occupy the "request started" side of the conceptual request, but send does NOT align with the "request finished" event. Only disconnect aligns with "request finished". Therefore, I think we should close connections before connect and receive and after disconnect in a new Middleware (DjangoDBMiddleware) or implicitly within the framework (and remove the current database_sync_to_async stuff).

CONN_MAX_AGE

I think there's a problem with the solution outlined above: connections usually have a bounded lifetime.

If a connection runs for forever the database connection might hit its CONN_MAX_AGE. In these events we should close the connection and make a new one. This would be handled automatically in the above life-cycle events except send.

Imagine a scenario where a websocket is set up for client-side notifications of mutations to objects stored in the ORM. The developer has configured a consumer that does nothing on connect or receive, but every once in a while it wakes up (via an MQTT notification or something) and sends down the objects that it was notified about. Let's say there's some database query that must be done before sending down the object (like "is this content visible to the current user" checks) but these MQTT notifications are rare (1 hour or more). In this case, if the CONN_MAX_AGE on the database connection is less than an hour then we've violated the expectations of users by keeping their connections open longer than they specified.

The implicit user contract in the above docs is that a connection will not be used after CONN_MAX_AGE. (ok ok I suppose that the max age could be reached during a traditional django request, but since traditional http requests are so fast I'm ignoring that).

I don't really know how to solve this. I can see two solutions:

A. Close Connections After First Send

What if we close connections after the first send in a "while" (maybe by comparing the time the last send and the current send are happening)? This could be close enough to what users expect but could have some problems.

The biggest problem is that if the connection is invalid that wouldn't help users as the first query they make would fail and an exception would be raised, which goes against user expectations ("I thought Django managed this for me"). The next biggest problem is that it has bad performance characteristics. If an user has a consumer that sends 2 send messages immediately after one another and then pauses for a "while", this solution would slow down the second send but not the first as django would tear down the connection after the first send and then build a new one before the second send. That would again go against user expectations (and lead to a lot of head scratching).

I don't love this solution tbh.

B. Document It

I think in the send case we can't provide a good general-purpose solution implicitly for all users. Instead, we should provide the next best solution: documentation.

We can define a function (in channels or django core) that closes connections from async code and then document the contract that users need to uphold for themselves. The thesis of this documentation can be: call close_connection as early as possible in the code paths that eventually call send, particularly in long-lived connections.

Summary

I'm vastly in favor of (B) and the implicit cleanup in connect/receive/disconnect, and would be interested in putting in the work here to make it so.

carltongibson commented 2 months ago

Hey @bigfootjon -- thanks for putting your brain to it. Makes sense.

I too lean toward B. If I've got a very long lived connection, I'm happy that it's my job to run a periodic task to clean up the connection object. Makes sense to me that way.

bigfootjon commented 2 months ago

Alright I've made a pass at integrating my ideas into Channels and documenting them!

carltongibson commented 2 months ago

Hey @bigfootjon. I managed to get a look at this. Seems sound. I want to play with it a bit more but generally positive.

Thinking currently about the migration path here. Say I'm currently using the existing wrapper, I guess we'll want to say "migrate, but the extra calls to close old connections won't hurt"? 🤔

bigfootjon commented 2 months ago

I guess we'll want to say "migrate, but the extra calls to close old connections won't hurt"? 🤔

Yeah I think that’s reasonable. My reading of the close_old_connections code suggests it’s pretty cheap to call if it does nothing.

In any case, using the async versions of the ORM methods will be slightly faster in certain cases due to the ORM being able to sometimes hit caches before calling in to the sync DB internals, so there’s multiple incentives to use the newer method.

carltongibson commented 1 month ago

Hey @bigfootjon -- I'm thinking this is a point release, so just working my way round to that. Not forgotten 😉

carltongibson commented 1 month ago

Hey @bigfootjon — I'm looking at #1091 in relation to this... — problem there is that code wrapped in database_sync_to_async closes DB connections in test runs.

(Adam has a mocking work around, making it a no-op)

Fancy reading over that? Is it likely something exacerbated by what we're proposing here? It seems so... 🤔 I've had it in my notifications for a couple of weeks, but realise I should mention it to you, rather than sitting there not actually getting to look at it. 🤹

bigfootjon commented 1 month ago

Yeah I can take a look! (Sometime over the next few days)

carltongibson commented 1 month ago

No stress! 😅

I think we may need to tighten up the story there as part of this progress. Have a look, let me know what you think.

bigfootjon commented 1 month ago

Is it likely something exacerbated by what we're proposing here? It seems so... 🤔

Yeah I think I agree. I think #1091 will need to be resolved before we can move forward with this. My thinking is that since this PR introduces calls to close_old_connections in more code paths we'll be converting the current rare edge case into a much more common occurence.

I think the best path forward is https://code.djangoproject.com/ticket/30448 (if I understand the ticket correctly), but the channels_db.close_old_connections = no_op seems like a fine thing to use to short-circuit calling that function at all.

So I see the rough work ahead as (ordered):

  1. channels_db.close_old_connections = no_op (with a test case adapted from @adamchainz's minimal repro) a. https://code.djangoproject.com/ticket/30448 (nice to have)
  2. https://github.com/django/channels/pull/2090 (this issue)
  3. https://github.com/django/channels/pull/2092
  4. https://github.com/django/channels/pull/2093 (and any other async-related work)

And we can close:

  1. https://github.com/django/channels/issues/1091
  2. https://github.com/django/channels/pull/1290 (and maybe others)

Then we could cut a release called something like "4.2: Better Async Django Support" once Django 5.1 is released (which would be needed before releasing #2093 in case anything changes in that API at the last minute before release)

How does this plan sound?

carltongibson commented 1 month ago

Hey @bigfootjon — yes, that sounds about right!

(I'm head down lining up for DjangoCon Europe, but should have a bit more headspace after that. Do press on! 🎁)

bigfootjon commented 1 month ago

I've put up https://github.com/django/channels/pull/2101 to tackle (1). I don't really have the bandwidth to pick up the upstream django ticket right now, but I'd like to see it done eventually.

carltongibson commented 4 days ago

Hey @bigfootjon — right, I've come to the conclusion I'm just holding you up here with my lack of bandwidth to push this forwards. 🤹

As such I'm wondering if you'd accept joining as a Channels maintainer, so you can resolve issues, and we can push towards a release? I would be honoured to have you onboard.

bigfootjon commented 2 days ago

Hey @carltongibson, sorry for the slow reply (my American Independence Day celebrations ran a bit long). I'm definitely willing to become a maintainer, but some more info would be helpful to help me commit. I've emailed you at your GitHub account's public email address so we can discuss more directly and not dirty up this GitHub Issue with our discussion!