shiftgig / celery-unique

Python factory for limiting Celery tasks by configuration
Apache License 2.0

Possible race condition when enqueueing the same task twice #7

Open WhyNotHugo opened 7 years ago

WhyNotHugo commented 7 years ago

If the same task+args combination is enqueued twice concurrently, there is a possible race condition, since the task is first queued and only afterwards recorded in the backend (Redis, for now).

Events can happen in this order:

1. Process A queues its task.
2. Process B queues the same task.
3. Process B records its task_id in the backend.
4. Process A records its task_id, overwriting B's.

Both copies end up queued, and the recorded task_id no longer matches the task that was queued last.
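The flawed "queue first, record second" ordering can be simulated deterministically; this is a minimal sketch with hypothetical names (`enqueue`, `record`), not celery-unique's actual code. Both enqueues land before either backend write, so the second record clobbers the first:

```python
queue = []    # stands in for the Celery broker queue
backend = {}  # stands in for the Redis uniqueness backend

def enqueue(task_id):
    queue.append(task_id)      # step 1: the task is queued

def record(key, task_id):
    backend[key] = task_id     # step 2: the backend is updated

# Interleaving: both enqueues happen before either backend write.
enqueue("task-A")
enqueue("task-B")
record("unique-key", "task-B")
record("unique-key", "task-A")  # A's write clobbers B's
```

After this interleaving, two copies of the task sit in the queue while the backend only knows about `task-A`, so the "unique" guarantee is silently broken.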

A better solution would be to slightly rewrite apply_async, something like this (note: pseudocode-ish):

key = get_key_for_task(task)
expire_if_extant(key)            # drop any stale key left behind by a finished task
reservation = reserve_key(key)   # atomically adds the key to Redis with a placeholder task_id
if not reservation:
    return None                  # another concurrent enqueue already holds the reservation
rv = super().apply_async(*args, **kwargs)
insert_key(key, rv.task_id)      # replaces the placeholder above with the real task_id
return rv
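The reserve-before-enqueue idea above can be sketched as runnable code. This is an illustration only: `FakeBackend`, `apply_async_unique`, and the `enqueue` callable are hypothetical stand-ins, and in a real deployment the atomic reservation would be redis-py's `SET key value NX` rather than an in-memory dict:

```python
import threading

PLACEHOLDER = "__reserved__"

class FakeBackend:
    """In-memory stand-in for Redis (assumption: real code uses SET ... NX)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set_nx(self, key, value):
        # Atomically set `key` only if absent; True on success (like SETNX).
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def replace(self, key, value):
        with self._lock:
            self._data[key] = value

def apply_async_unique(backend, key, enqueue):
    """Reserve the uniqueness key *before* enqueueing, per the proposal.

    `enqueue` stands in for super().apply_async(); it returns an object
    with a `task_id` attribute.
    """
    if not backend.set_nx(key, PLACEHOLDER):
        return None                       # a concurrent enqueue won the race
    result = enqueue()                    # actually queue the task
    backend.replace(key, result.task_id)  # swap placeholder for the real id
    return result
```

Because the reservation happens first and is atomic, the second concurrent caller sees the placeholder and backs off, so at most one copy of the task is ever queued.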
WhyNotHugo commented 7 years ago

After our discussion on Slack, and wanting to guarantee last-come-persists semantics, I think we can use redis-lock to hold a blocking lock around our task-creation block.
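A sketch of the blocking-lock approach. Assumptions are loud here: in production the lock would be a distributed one held in Redis (e.g. via the redis-lock library) so it spans processes; a `threading.Lock` below only illustrates the shape within one process, and `enqueue`/`revoke` are hypothetical stand-ins for apply_async and task revocation, not celery-unique's actual API:

```python
import threading

creation_lock = threading.Lock()  # stand-in for a distributed redis-lock
backend = {}                      # unique-key -> task_id, standing in for Redis

def enqueue_unique(key, enqueue, revoke):
    """Serialize the whole create-or-replace block under one lock.

    `enqueue` returns a task_id; `revoke` cancels a previously queued
    task by id. The lock blocks, so callers run strictly one at a time.
    """
    with creation_lock:
        previous = backend.get(key)
        if previous is not None:
            revoke(previous)       # drop the earlier duplicate
        task_id = enqueue()
        backend[key] = task_id     # the last caller's task_id persists
        return task_id
```

Since the read-revoke-enqueue-record sequence runs atomically under the lock, whichever caller enters last is the one whose task_id survives, which is exactly the last-come-persists guarantee.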

It's still possible, under some unusual race conditions, that the task with the wrong ETA gets queued (though basically, that would mean the attempts to queue them were made in the wrong order to begin with).