Open aaugustin opened 8 years ago
Here's a custom Queue
class that implements the approach I described above.
from django.db import transaction
from rq.queue import Queue
DEFAULT = object()
class OnCommitQueue(Queue):
"""
RQ Queue that enqueues jobs only after the database transaction commits.
As a consequence, the enqueue function doesn't return the job.
See also https://github.com/ui/django-rq/issues/187.
"""
def __init__(self, *args, **kwargs):
# Drop autocommit parameter -- this class takes a different approach
# to the same issue.
kwargs.pop('autocommit', None)
return super().__init__(*args, **kwargs)
def enqueue(self, *args, **kwargs):
on_commit = kwargs.pop('on_commit', DEFAULT)
# Explicitly passing on_commit=None forces enqueuing the task
# immediately. This escape hatch should be rarely needed.
if on_commit is None:
super().enqueue(*args, **kwargs)
# Not passing on_commit enqueues the task after the current
# transaction on the default database commits.
if on_commit is DEFAULT:
on_commit = None
# super() without arguments must be resolved within this function.
enqueue = super().enqueue
transaction.on_commit(
lambda: enqueue(*args, **kwargs),
using=on_commit,
)
EDIT -- fixed code sample after testing.
Note that this approach makes it impossible to return the job synchronously, since it isn't created until the database transaction commits and won't be created if the transaction fails.
I built a test case to show what @aaugustin described. I hope the comments in the code make it explicative so that the test cases can be used as a base for a correct on_commit hook implementation. Let me recall the @aaugustin worlds:
This is better than hooking to request-related signals because it deals properly with partial rollbacks in case of nested atomic blocks, etc. It works regardless of whether the project uses ATOMIC_REQUESTS and also outside of the request-response cycle.
Sorry for the late response. I think this is a good idea, though I'm unlikely to have any time to work on this anytime soon. I've been super busy since my daughter was born a few weeks ago.
If someone wants to work on this, I'll be more than happy to review and merge the resulting PR.
P.S: thanks for overhauling Django's transaction management features a few releases ago @aaugustin 😃
Congratulations :-)
The custom Queue
class works just fine for now, no hurry!
Congratulations :-)
Thanks 😁
Sent from my phone
On Sep 12, 2016, at 9:14 PM, Marco De Paoli notifications@github.com wrote:
Congratulations :-)
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.
Django's on_commit hook seems perfectly appropriate for delaying
enqueue()
until the database transaction commits.Django 1.9+ maintains its own per-database-connection (database connections are thread-local) queue of functions to run when the current database transaction commits. You don't have to roll your own.
This is better than hooking to request-related signals because it deals properly with partial rollbacks in case of nested atomic blocks, etc. It works regardless of whether the project uses
ATOMIC_REQUESTS
and also outside of the request-response cycle.I think it would be good to make it the default. This would be technically backwards-incompatible but it's likely a better default in the long run, given that many web developers are uncomfortable with database transactions.
The only difficulty is the interaction with multi-database setups. You might have to add a kwarg to tell which database this task depends on. Setting that kwarg to
None
would enqueue the task immediately. The kwarg would default to'default'
.Please let me know if you'd like me to elaborate on why I think this is the best way to hook with Django's transaction management.
(Although I'm not the author of
on_commit
— that's @carljm — I'm responsible for most of the current transaction management features in Django and I'm interested in helping third party projects make the best use of them.)