jcabotc / hare

Some AMQP abstractions

Improve reconnection mechanism #13

Closed: take-five closed this 6 years ago

take-five commented 7 years ago

Previously, hare could not gracefully handle a disrupted RabbitMQ connection. For example, Hare.Consumer called Hare.Core.Conn.open_channel/2 when the connection was disrupted, and if the connection could not be established within 5 seconds, the consumer process exited abnormally because of the GenServer.call timeout. Connection failures should be expected, and the application should react to them properly.

This pull request changes the way Hare.Actor opens a channel on reconnection. By default, Hare.Actor has the status :not_connected. After initialization it opens a channel synchronously, as before, and sets its internal status to :connected. When the connection is disrupted, the Hare.Actor process sets its internal status to :not_connected, invokes the disconnected/2 callback, and requests a channel from the RabbitMQ connection asynchronously. The process can handle other messages while the channel is being opened (which can take minutes or even hours during an outage), and this avoids cascading GenServer timeouts. Once the channel is opened, Hare.Actor sets its internal status to :connected and invokes the connected/2 callback.
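The status transitions described above can be sketched as a small standalone GenServer. This is only an illustration, not the code in this PR: the module name `ReconnectSketch`, the `open_fun` argument (a stand-in for the channel request), the `:connection_lost` and `:channel_opened` messages, and the 50 ms retry delay are all assumptions, and the connected/2 and disconnected/2 callbacks are left out.

```elixir
defmodule ReconnectSketch do
  use GenServer

  # `open_fun` is a hypothetical zero-arity function standing in for the
  # channel request. It returns {:ok, channel} or {:error, reason}.
  def start_link(open_fun), do: GenServer.start_link(__MODULE__, open_fun)

  def status(pid), do: GenServer.call(pid, :status)

  @impl true
  def init(open_fun) do
    # The first channel is opened synchronously.
    case open_fun.() do
      {:ok, chan} -> {:ok, %{status: :connected, chan: chan, open_fun: open_fun}}
      {:error, reason} -> {:stop, reason}
    end
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state.status, state}

  @impl true
  def handle_info(:connection_lost, state) do
    # Mark the process as disconnected and request a new channel in a
    # separate process, so this process keeps serving its mailbox.
    request_channel(state.open_fun)
    {:noreply, %{state | status: :not_connected, chan: nil}}
  end

  def handle_info({:channel_opened, chan}, state) do
    {:noreply, %{state | status: :connected, chan: chan}}
  end

  defp request_channel(open_fun) do
    owner = self()

    spawn_link(fn ->
      chan = retry(open_fun)
      send(owner, {:channel_opened, chan})
    end)
  end

  # Retries until the channel can be opened (assumed fixed 50 ms delay).
  defp retry(open_fun) do
    case open_fun.() do
      {:ok, chan} ->
        chan

      {:error, _reason} ->
        Process.sleep(50)
        retry(open_fun)
    end
  end
end
```

While the helper process is retrying, `ReconnectSketch.status/1` still answers immediately with `:not_connected`, which is the property that prevents callers from timing out.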

Hare.Publisher, Hare.Consumer, Hare.RPC.Client and Hare.RPC.Server react to the connected/disconnected callbacks and change their internal state accordingly. For example, when a user tries to make a request via a disconnected Hare.RPC.Client, they receive an {:error, :not_connected} response immediately, instead of the calling process crashing because of a timeout.
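The fail-fast behaviour for a disconnected client can be illustrated with a minimal sketch. The module `ClientSketch`, its `request/2` function, and the `:connected` / `:disconnected` messages are hypothetical stand-ins for the callbacks, not Hare.RPC.Client's actual API, and the connected branch simply echoes the payload in place of a real RPC round trip.

```elixir
defmodule ClientSketch do
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, :ok)

  def request(pid, payload), do: GenServer.call(pid, {:request, payload})

  @impl true
  def init(:ok), do: {:ok, %{status: :not_connected}}

  @impl true
  def handle_call({:request, _payload}, _from, %{status: :not_connected} = state) do
    # Reply at once instead of letting the caller hit a GenServer.call timeout.
    {:reply, {:error, :not_connected}, state}
  end

  def handle_call({:request, payload}, _from, state) do
    # Placeholder for the real request: echo the payload back.
    {:reply, {:ok, payload}, state}
  end

  # These messages stand in for the connected/disconnected callbacks.
  @impl true
  def handle_info(:connected, state), do: {:noreply, %{state | status: :connected}}
  def handle_info(:disconnected, state), do: {:noreply, %{state | status: :not_connected}}
end
```

With this shape, the caller can pattern match on `{:error, :not_connected}` and decide whether to retry, queue, or give up.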

take-five commented 7 years ago

@jcabotc I removed the graceful shutdown feature from this PR.

archseer commented 7 years ago

LGTM from me. I will probably pull the PR into our codebase and give it a bit of testing 👍

take-five commented 7 years ago

@archSeer we started maintaining our own fork: https://github.com/salemove/hare

archseer commented 7 years ago

@take-five ah nice, I'd move to that, but I still need this fix: https://github.com/archSeer/hare/commit/c6e622f1e68042ab88ff21fb29ab9c749983a0f2. I explained the reasoning for that Process.trap here.