Bogdanp / dramatiq

A fast and reliable background task processing library for Python 3.
https://dramatiq.io
GNU Lesser General Public License v3.0

Add an actor decorator argument to compute backoff time using a function #580

Open arseniiarsenii opened 12 months ago

arseniiarsenii commented 12 months ago

What version of Dramatiq are you using?

1.14.2

Description

Right now we can provide per-actor settings for the retry policy using a predicate in the actor decorator's argument retry_when. However, there is no such option to override backoff times used for retrying tasks.
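
For context, a `retry_when` predicate receives the number of retries so far and the raised exception, and returns whether the message should be retried. A minimal sketch, assuming hypothetical names `TransientError` and `should_retry` that are not part of Dramatiq:

```python
class TransientError(Exception):
    """Hypothetical error type that is worth retrying."""


def should_retry(retries_so_far: int, exception: Exception) -> bool:
    # Retry only transient failures, and at most three times.
    return isinstance(exception, TransientError) and retries_so_far < 3


# With Dramatiq installed, the predicate would be passed to the decorator:
#   @dramatiq.actor(retry_when=should_retry)
```

The predicate decides only whether to retry; it has no say in how long to wait, which is the gap this issue is about.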

The case: I am using an actor to send a webhook to an external service. We have agreed that a webhook should be retried up to three times if the first attempt fails: the second attempt after 1 minute, the third after 5 minutes, and the fourth after 1 hour. This logic is not easy to implement in dramatiq at the moment.

After spending quite some time reading the docs, issues, and source, I came up with this workaround using the CurrentMessage middleware:

import dramatiq

# Note: CurrentMessage must be registered on the broker for
# get_current_message() to return the message being processed.

@dramatiq.actor(queue_name="send_partner_webhook", max_retries=3)
def send_partner_webhook_actor(webhook: PartnerWebhook) -> None:
    try:
        send_partner_webhook(webhook)
    except PartnerWebhookRequestFailedError as e:
        message: dramatiq.Message = dramatiq.middleware.CurrentMessage.get_current_message()
        retries_so_far: int = message.options.get("retries", 0)
        # Delay in milliseconds, keyed by the number of retries so far.
        backoff = {
            0: 60_000,  # backoff before first retry - 1 min
            1: 300_000,  # backoff before second retry - 5 min
            2: 3_600_000,  # backoff before third retry - 1 hour
        }
        if retries_so_far not in backoff:
            # Schedule exhausted: re-raise the original error unchanged.
            raise
        raise dramatiq.errors.Retry(str(e), delay=backoff[retries_so_far]) from e

I think it should be easier to achieve this behavior using an argument similar to retry_when:

def backoff_factory(retries_so_far: int) -> int:
    backoff = {
        0: 60_000,  # backoff before first retry - 1 min
        1: 300_000,  # backoff before second retry - 5 min
        2: 3_600_000,  # backoff before third retry - 1 hour
    }
    return backoff.get(retries_so_far, 3_600_000)

@dramatiq.actor(queue_name="send_partner_webhook", max_retries=3, backoff_factory=backoff_factory)
def send_partner_webhook_actor(webhook: PartnerWebhook) -> None:
    send_partner_webhook(webhook)

Thank you for your work, I hope that this feature request makes it into a future release!

spumer commented 12 months ago

I think it also depends on the exception type. Passing (retries, exception) to backoff_factory would also be consistent with retry_when, which already takes these parameters.
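
A rough sketch of what such an exception-aware function could look like. `backoff_factory` is still only the proposed argument, not an existing Dramatiq option, and `RateLimitedError` with its `retry_after_ms` attribute is a hypothetical error type used purely for illustration:

```python
class RateLimitedError(Exception):
    """Hypothetical error carrying a server-suggested wait in milliseconds."""

    def __init__(self, retry_after_ms: int) -> None:
        super().__init__(f"rate limited, retry after {retry_after_ms} ms")
        self.retry_after_ms = retry_after_ms


# Delays in milliseconds before the first, second, and third retry.
SCHEDULE_MS = (60_000, 300_000, 3_600_000)


def backoff_factory(retries_so_far: int, exception: Exception) -> int:
    # Honor an explicit wait hint when the exception provides one.
    if isinstance(exception, RateLimitedError):
        return exception.retry_after_ms
    # Otherwise follow the fixed schedule, clamping to its last entry.
    index = min(retries_so_far, len(SCHEDULE_MS) - 1)
    return SCHEDULE_MS[index]
```

Using the same (retries, exception) argument order as retry_when would let one mental model cover both "should this be retried" and "how long to wait".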

arseniiarsenii commented 12 months ago

I agree

bvidovic1 commented 7 months ago

Hey @arseniiarsenii, I have a somewhat similar problem (https://github.com/Bogdanp/dramatiq/issues/605) that the implementation you suggested could resolve, partially or fully.

Have you noticed in your tests that retries are performed more times than the specified limit?