sublinks / sublinks-federation

Federation service for Sublinks
MIT License

Determine and implement a way to mark messages that need to be retried, and to retry them (could be another queue in RabbitMQ or a DB table) #34

Open jgrim opened 10 months ago

devanbenz commented 10 months ago

https://www.rabbitmq.com/dlx.html We could probably use a dead letter queue for this. In other services that use a queue system I've always used SQS, so I'm not very familiar with RabbitMQ, but in SQS we generally have a dead letter queue for messages that fail processing.

lazyguru commented 10 months ago

The thing to keep in mind is that the messages that need to be retried could be for different servers. Some of those will succeed on retry because the server was only temporarily unavailable. Others will fail again and need to be retried multiple times. We will want to support exponential backoff in our retries.
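As a rough illustration of the exponential backoff mentioned above, here is a minimal sketch. The function name `backoff_delay`, the 60-second base, the one-day cap, and the optional jitter are all assumptions for illustration, not values agreed in this thread.

```python
import random


def backoff_delay(attempt: int, base: float = 60.0, cap: float = 86400.0,
                  jitter: float = 0.0) -> float:
    """Return the seconds to wait before retry number `attempt` (1-based).

    The delay doubles with every attempt and is limited to `cap`.
    `jitter` is an optional fraction (e.g. 0.1 for up to +10%) that spreads
    retries out so many servers are not retried at the same instant.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Limit the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt - 1, 32)
    delay = base * (2 ** exponent)
    if jitter:
        delay *= 1 + random.uniform(0, jitter)
    return min(cap, delay)


# Delays for the first five retries with the assumed defaults.
schedule = [backoff_delay(n) for n in range(1, 6)]
```

A server that is only briefly unavailable is retried after about a minute, while one that keeps failing is retried less and less often, up to the cap.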

devanbenz commented 10 months ago

@lazyguru good point, that makes sense. I'm thinking we have two options here:

  1. Implement a "delay queue" whose messages carry a TTL that grows with each attempt, giving us exponential backoff. Once the TTL expires, we consume them like normal messages. https://blog.rabbitmq.com/posts/2015/04/scheduling-messages-with-rabbitmq/
  2. Have an in-application delay for messages. The caveat is that the function processing the message will be blocked for the whole delay. (I think)

If we implemented the delay queue together with a dead letter queue, all of this logic would live in one place (RabbitMQ), which would be pretty nice.
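A minimal sketch of what option 1 could look like, assuming the common TTL plus dead-letter approach: a retry queue dead-letters expired messages back to the normal work exchange, and each message gets a per-message TTL through the AMQP `expiration` property (a string in milliseconds). The helper names, the `x-retry-attempt` header, and the exchange and routing key values are hypothetical; the dicts would be passed to whatever AMQP client is in use.

```python
def retry_queue_arguments(work_exchange: str, work_routing_key: str) -> dict:
    """Arguments for declaring the retry (delay) queue.

    Messages that expire in this queue are dead-lettered to the normal
    work exchange, so consumers see them as ordinary messages again.
    """
    return {
        "x-dead-letter-exchange": work_exchange,
        "x-dead-letter-routing-key": work_routing_key,
    }


def retry_message_properties(attempt: int, base_ms: int = 60_000,
                             cap_ms: int = 86_400_000) -> dict:
    """Per-message properties for retry number `attempt` (1-based)."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    ttl_ms = min(cap_ms, base_ms * 2 ** min(attempt - 1, 32))
    return {
        # RabbitMQ expects the per-message TTL as a string of milliseconds.
        "expiration": str(ttl_ms),
        # Hypothetical header so the consumer knows which attempt this is.
        "headers": {"x-retry-attempt": attempt},
    }


# Hypothetical names, for illustration only.
queue_args = retry_queue_arguments("federation", "outbound")
props = retry_message_properties(attempt=3)
```

One caveat with this approach: RabbitMQ only expires messages once they reach the head of the queue, so a short-TTL message can be stuck behind a longer one. A separate queue per delay level is one way around that.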

lazyguru commented 10 months ago

We already have to store the messages in a DB table to make them immutable and available via HTTP requests. We can keep it simple (K.I.S.S.) in the beginning and just have a table that tracks the last message_id sent to each server (like Lemmy does). A DLQ won't work because a message might be sent successfully to serverA but not to serverB or serverC on the first attempt. Later, serverB comes back online and is reachable, but serverC never returns.
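A minimal sketch of the per-server tracking table, using an in-memory SQLite database. The table and column names (`messages`, `delivery_state`, `last_message_id`) and the helper functions are assumptions for illustration, not the project's actual schema.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the two illustrative tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE messages (
            id   INTEGER PRIMARY KEY,
            body TEXT NOT NULL
        );
        CREATE TABLE delivery_state (
            server          TEXT PRIMARY KEY,
            last_message_id INTEGER NOT NULL DEFAULT 0
        );
    """)
    return db


def pending(db: sqlite3.Connection, server: str) -> list:
    """Messages this server has not received yet, oldest first."""
    row = db.execute(
        "SELECT last_message_id FROM delivery_state WHERE server = ?",
        (server,),
    ).fetchone()
    last_id = row[0] if row else 0  # unknown server: nothing delivered yet
    return db.execute(
        "SELECT id, body FROM messages WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()


def mark_delivered(db: sqlite3.Connection, server: str, message_id: int) -> None:
    """Record the newest message that was successfully sent to `server`."""
    db.execute(
        "INSERT OR REPLACE INTO delivery_state (server, last_message_id) "
        "VALUES (?, ?)",
        (server, message_id),
    )


db = open_db()
db.executemany("INSERT INTO messages (body) VALUES (?)", [("m1",), ("m2",)])
mark_delivered(db, "serverA", 2)  # serverA received everything
mark_delivered(db, "serverB", 1)  # serverB went down after the first message
# serverC was never reachable, so it has no row at all.
```

Each server's backlog is independent here: `pending(db, "serverB")` returns only what serverB missed, and serverC's backlog can keep growing without affecting the others.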