rebus-org / Rebus.MySql

:bus: MySQL integration for Rebus
https://mookid.dk/category/rebus
Other
7 stars 6 forks source link

Add support for automatic message replay if MySQL crashes #16

Closed kendallb closed 3 years ago

kendallb commented 3 years ago

Add support for automatically replaying non-leased messages after a default of 10 seconds if they handlers do not finish. Will fix the problem of MySQL crashing out and not allowing messages to be ACK'ed, so they will be replayed when it comes back up.


Rebus is MIT-licensed. The code submitted in this pull request needs to carry the MIT license too. By leaving this text in, I hereby acknowledge that the code submitted in the pull request has the MIT license and can be merged with the Rebus codebase.

mookid8000 commented 3 years ago

I'm wondering.... does it even make sense to have a non-leased transport with MySQL?

As far as I can tell, you're now approaching the exact same functionality as provided by the lease-based transport.... or am I missing something?

kendallb commented 3 years ago

Yeah I had the same thought myself. The leased transport layer does have more functionality on it and is more geared towards message handlers that need much longer timeouts before they are considered dead, as the default lease I think is 5 minutes. This is more like google pub/sub in that the default is 10s before it gets replayed.

So they are different but similar, yeah. Performance testing shows the extra date fields in the leased transport does make it slower. We could use the leased model with much shorter timeouts I suppose.

I was also wondering how long running handlers are supposed to work with rabbit or azure, as it was not clear if the message gets ACK’ed when the message is removed or when the handler is done. If it’s ACK’ed when the handler is done, then it would have problems with messages that take longer than the default ACK delay, assuming it’s similar to google pub/sub. I don’t see any leased transport layers for those, so wondering how it’s handled there?

mookid8000 commented 3 years ago

I was also wondering how long running handlers are supposed to work with rabbit or azure, as it was not clear if the message gets ACK’ed when the message is removed or when the handler is done. If it’s ACK’ed when the handler is done, then it would have problems with messages that take longer than the default ACK delay, assuming it’s similar to google pub/sub. I don’t see any leased transport layers for those, so wondering how it’s handled there?

With RabbitMQ, a message is received by the connection until it is either ACKed og NACKed. If it's neither ACKed or NACKed, then the message is left in the "unacked state", because it's still received at the transport level. It does become visible again though if the connection is dropped, so in cases where a consumer dies, the received-but-unacked messages will pop up in the queue again.

With Azure Service Bus, a built-in lease ensures that unacked messages become visible again (after 5 minutes I think). If message handling can take longer than that, Rebus has the ability to automatically renew peek locks in the background by going

services.AddRebus(configure => configure
    .Transport(t => t.UseAzureServiceBus(connectionString, "inputqueue")
        .AutomaticallyRenewPeekLock())
);

When this is done, Rebus will keep a list in the background of lease IDs and required renewal times, and then it will ensure that all leases are periodically renewed as necessary.

mookid8000 commented 3 years ago

oh sorry, didn't mean to close it 🙂

kendallb commented 3 years ago

I think you are right. We should ditch the non leased mode for MySQL. Should we kill off the non leased API, or kill off the leased API and rename it to as the default? Depends whether it should be clear it’s a lease model transport? Or maybe the default settings would be good for most use cases so it can be renamed as the only transport model?

mookid8000 commented 3 years ago

I think you should do what you think is easier.

And I think we should ditch the "lease-based" in the naming – using a lease-based protocol would just be how it works

kendallb commented 3 years ago

Ok, code is changed so the lease transport is the default and uses the original naming. Cleaned up some issues with configuring the leased transport when I moved it to be the main transport, and also added in code to upgrade the schema automatically if the fields are missing on startup. So this should be good to go now.

kendallb commented 3 years ago

Worked out a clean way to support message ordering via an ordering key header, so have added that to this PR as well.

kendallb commented 3 years ago

Changed to use a left join on the ordering keys, which is much faster when there are lots of messages in the table with the same ordering key.

mookid8000 commented 3 years ago

Thanks! 🙂 It's out now as Rebus.MySql 3.0.0-b5 -sorry for taking so long to process this 😐

kendallb commented 3 years ago

Awesome, thanks!