Particular / docs.particular.net

All content for ParticularDocs
https://docs.particular.net

Users might not be aware of the need to backup broker storage #5795

Open mauroservienti opened 6 years ago

mauroservienti commented 6 years ago

With more and more transports supporting native timeouts, we’re exposing on-premises customers to message loss if they don’t regularly back up their brokers’ storage.

Right now the only non-cloud native transport that supports native delayed delivery is RabbitMQ; SQL Server might be the next one. When using the timeout manager, timeouts are stored in persistent storage, typically a user-managed database that operations people own and regularly back up. With native timeouts, a broker outage can cause delayed messages to be lost, as they are in-flight messages.

It’s probably good enough to provide some guidance/documentation on the topic.

andreasohlund commented 6 years ago

> SQL Server might be the next one

SQLT already has native delayed delivery: https://docs.particular.net/transports/sql/native-delayed-delivery

SzymonPobiega commented 6 years ago

To be precise, if they don't do backups they are already exposed to message loss. Native timeouts make the problem worse because the number of messages stored by the broker at any given point in time is bigger. This affects not only backups but also regular storage, i.e. when installing RabbitMQ customers need to take into account the storage space consumed by delayed messages.
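As a rough illustration of the storage point above, here is a back-of-the-envelope sketch in Python. It assumes a steady workload and applies Little's law (messages held = arrival rate × average time held). The function name and the workload numbers are hypothetical and not taken from any Particular or RabbitMQ tooling; real brokers add per-message overhead that this ignores.

```python
from typing import Tuple


def delayed_message_backlog(
    messages_per_second: float,
    average_delay_seconds: float,
    average_message_bytes: int,
) -> Tuple[float, float]:
    """Estimate (messages held, bytes held) by the broker at steady state.

    Uses Little's law: items in the system = arrival rate * time in system.
    Broker-specific overhead (headers, indexes, replication) is not included.
    """
    in_flight = messages_per_second * average_delay_seconds
    return in_flight, in_flight * average_message_bytes


# Hypothetical workload: 5 delayed messages per second,
# each delayed for one hour, 2 KiB per message.
count, size_bytes = delayed_message_backlog(5, 3600, 2048)
print(f"{count:.0f} delayed messages, {size_bytes / 1024**2:.1f} MiB")
```

The same number of messages is what would be lost if the broker's storage were destroyed without a backup, so the estimate is useful for both capacity planning and risk assessment.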

Also, the SQL transport supports native delayed delivery starting from version 3.1, but I guess SQL Server is a somewhat different case, as it usually has a backup plan.

I guess one of the solutions could be storing the timeouts in the saga store as proposed by @andreasohlund and myself. Then in the case of broker data loss we would be able to re-generate the timeout messages (treated now only as alerts) based on the state of the saga data.
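To make the proposal a bit more concrete, here is a minimal sketch in Python of what regenerating timeouts from saga state could look like. Everything in it is an assumption for illustration: `PendingTimeout`, `TimeoutMessage`, and `regenerate_timeouts` are hypothetical names, not part of NServiceBus, and the sketch assumes the saga store records each requested timeout with its absolute due time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass
class PendingTimeout:
    """Hypothetical record stored alongside the saga data."""
    saga_id: str
    timeout_type: str
    due_at: datetime


@dataclass
class TimeoutMessage:
    """Hypothetical delayed message to hand back to the broker."""
    saga_id: str
    timeout_type: str
    delay: timedelta


def regenerate_timeouts(
    pending: Iterable[PendingTimeout], now: datetime
) -> List[TimeoutMessage]:
    """Rebuild delayed messages after broker data loss.

    Timeouts that are already overdue get a zero delay so they fire at once.
    """
    messages = []
    for timeout in pending:
        delay = max(timeout.due_at - now, timedelta(0))
        messages.append(
            TimeoutMessage(timeout.saga_id, timeout.timeout_type, delay)
        )
    return messages


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stored = [
    PendingTimeout("saga-1", "PaymentExpired", now - timedelta(minutes=5)),
    PendingTimeout("saga-2", "ReminderDue", now + timedelta(hours=2)),
]
rebuilt = regenerate_timeouts(stored, now)
```

Because the saga store stays the source of truth in this sketch, a regenerated message may duplicate one that survived in the broker, so the saga would need to treat timeout messages as alerts and check its own state before acting.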

lailabougria commented 3 years ago

Personally, I'm all for:

> I guess one of the solutions could be storing the timeouts in the saga store as proposed by @andreasohlund and myself. Then in the case of broker data loss we would be able to re-generate the timeout messages (treated now only as alerts) based on the state of the saga data.

That would then also unlock the ability to reschedule or cancel timeouts.

@mauroservienti Do you agree? Should we reframe the issue to that end?

mauroservienti commented 3 years ago

> Do you agree? Should we reframe the issue to that end?

No, I think this should just be documentation and guidance. A new issue could be framed to do what Szymon and Andreas propose, which I'm all in for. If we were to reframe this issue, we would never create documentation and guidance for all the existing customers that won't be using the new feature. This can probably be tackled easily with a documentation update.