Suggestion: a more robust implementation for the AUTOCOMMIT option & make it the default on Django ≥ 1.9

aaugustin commented 8 years ago

Django's on_commit hook seems perfectly appropriate for delaying enqueue() until the database transaction commits.

Django 1.9+ maintains its own per-database-connection (database connections are thread-local) queue of functions to run when the current database transaction commits. You don't have to roll your own.

This is better than hooking to request-related signals because it deals properly with partial rollbacks in case of nested atomic blocks, etc. It works regardless of whether the project uses ATOMIC_REQUESTS and also outside of the request-response cycle.

I think it would be good to make it the default. This would be technically backwards-incompatible but it's likely a better default in the long run, given that many web developers are uncomfortable with database transactions.

The only difficulty is the interaction with multi-database setups. You might have to add a kwarg to tell which database this task depends on. Setting that kwarg to None would enqueue the task immediately. The kwarg would default to 'default'.

Please let me know if you'd like me to elaborate on why I think this is the best way to hook with Django's transaction management.

(Although I'm not the author of on_commit — that's @carljm — I'm responsible for most of the current transaction management features in Django and I'm interested in helping third party projects make the best use of them.)

aaugustin commented 8 years ago

Here's a custom Queue class that implements the approach I described above.

from django.db import transaction
from rq.queue import Queue

DEFAULT = object()

class OnCommitQueue(Queue):
    """
    RQ Queue that enqueues jobs only after the database transaction commits.

    As a consequence, the enqueue function doesn't return the job.

    See also https://github.com/ui/django-rq/issues/187.

    """
    def __init__(self, *args, **kwargs):
        # Drop autocommit parameter -- this class takes a different approach
        # to the same issue.
        kwargs.pop('autocommit', None)
        return super().__init__(*args, **kwargs)

    def enqueue(self, *args, **kwargs):
        on_commit = kwargs.pop('on_commit', DEFAULT)

        # Explicitly passing on_commit=None forces enqueuing the task
        # immediately. This escape hatch should be rarely needed.
        if on_commit is None:
            super().enqueue(*args, **kwargs)

        # Not passing on_commit enqueues the task after the current
        # transaction on the default database commits.
        if on_commit is DEFAULT:
            on_commit = None

        # super() without arguments must be resolved within this function.
        enqueue = super().enqueue

        transaction.on_commit(
            lambda: enqueue(*args, **kwargs),
            using=on_commit,
        )

EDIT -- fixed code sample after testing.

aaugustin commented 8 years ago

Note that this approach makes it impossible to return the job synchronously, since it isn't created until the database transaction commits and won't be created if the transaction fails.

depaolim commented 8 years ago

I built a test case to show what @aaugustin described. I hope the comments in the code make it explicative so that the test cases can be used as a base for a correct on_commit hook implementation. Let me recall the @aaugustin worlds:

This is better than hooking to request-related signals because it deals properly with partial rollbacks in case of nested atomic blocks, etc. It works regardless of whether the project uses ATOMIC_REQUESTS and also outside of the request-response cycle.

selwin commented 8 years ago

Sorry for the late response. I think this is a good idea, though I'm unlikely to have any time to work on this anytime soon. I've been super busy since my daughter was born a few weeks ago.

If someone wants to work on this, I'll be more than happy to review and merge the resulting PR.

P.S: thanks for overhauling Django's transaction management features a few releases ago @aaugustin 😃

aaugustin commented 8 years ago

Congratulations :-)

The custom Queue class works just fine for now, no hurry!

depaolim commented 8 years ago

Congratulations :-)

selwin commented 8 years ago

Thanks 😁

Sent from my phone

On Sep 12, 2016, at 9:14 PM, Marco De Paoli notifications@github.com wrote:

Congratulations :-)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

rq / django-rq

Suggestion: a more robust implementation for the AUTOCOMMIT option & make it the default on Django ≥ 1.9 #187