Esri / activemq-for-geoevent

ArcGIS GeoEvent Server sample ActiveMQ connectors for connecting to ActiveMQ message servers.
Apache License 2.0

Enhanced support for failover transport #1

Closed (brudo closed 8 years ago)

brudo commented 9 years ago

When using the failover transport for automated recovery with ActiveMQ, I found that if no broker was available when the input was first started, the input would get stuck in the STARTING state and become unresponsive: it could not be stopped or reconfigured, and it remained in that state until it finally established a connection.

The cause appears to be that connection.start() was called from setup(), before the transport thread was created. That call can block indefinitely, for example when no broker is available while the input is first starting up.

This change only initializes the connection to a ready-to-start state in setup(), and leaves it to the transport thread to call start(), create the session, etc. from inside run(). With that, the problem seems to be avoided, and the input remains manageable and stoppable even before it reaches the STARTED state.
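To illustrate the idea, here is a minimal sketch of deferring a blocking start() call to the worker thread. It is not the code from this commit: `BlockingConnection`, `Transport`, `State` and `stopsWhileBrokerDown` are hypothetical names, and the fake connection simply blocks on a latch to imitate a broker that never becomes available.

```java
import java.util.concurrent.CountDownLatch;

public class DeferredStartSketch {

    /** Hypothetical stand-in for a connection whose start() blocks until a broker is reachable. */
    interface BlockingConnection {
        void start() throws InterruptedException;
    }

    enum State { STOPPED, STARTING, STARTED }

    /** Hypothetical input transport; not the connector's actual class. */
    static class Transport implements Runnable {
        private final BlockingConnection connection;
        private volatile State state = State.STOPPED;
        private Thread worker;

        Transport(BlockingConnection connection) {
            // Equivalent of setup(): keep the connection ready to start, but do not start it.
            this.connection = connection;
        }

        /** Returns immediately; the blocking call happens on the worker thread. */
        synchronized void start() {
            state = State.STARTING;
            worker = new Thread(this, "transport");
            worker.setDaemon(true);
            worker.start();
        }

        @Override
        public void run() {
            try {
                connection.start(); // may block for as long as no broker is available
                state = State.STARTED;
                // Session and consumer creation would follow here.
            } catch (InterruptedException e) {
                state = State.STOPPED;
            }
        }

        /** Callable at any time, because the caller's thread never blocks in start(). */
        synchronized void stop() throws InterruptedException {
            if (worker != null) {
                worker.interrupt();
                worker.join(1000);
                worker = null;
            }
            state = State.STOPPED;
        }

        State getState() {
            return state;
        }
    }

    /** Starts a transport whose connection never comes up, then checks that it can still be stopped. */
    static boolean stopsWhileBrokerDown() {
        CountDownLatch never = new CountDownLatch(1); // never counted down: "no broker available"
        Transport transport = new Transport(() -> never.await());
        transport.start();
        boolean wasStarting = transport.getState() == State.STARTING;
        try {
            transport.stop();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return wasStarting && transport.getState() == State.STOPPED;
    }

    public static void main(String[] args) {
        System.out.println("stopped while broker unavailable: " + stopsWhileBrokerDown());
    }
}
```

In this sketch the fake connection reacts to a thread interrupt. Whether a real connection can be unblocked that way, or has to be closed from another thread instead, is an assumption the sketch does not settle.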

I also added Manager-visible log messages and status indicators to report on failover events. These were not directly needed to address the issue noted above, but proved useful for monitoring in Manager. The same commit also includes an unrelated change to username / password handling, not specific to failover - sorry about that.

I don't know whether this also occurred before these changes, but occasionally, after I started an input, a message would appear saying that an error had occurred. I would then find that the input had started successfully after all and was receiving messages. I'm not sure what causes that. At any rate, no other run-time issues were noted.

brudo commented 9 years ago

I just reverted part of the proposed change, so that the running state is no longer set to STARTING while retrying during an outage. That seemed handy as a status indicator, but somehow it interfered with recovery once the outage was over! Although the connection was reestablished, messages were no longer actively consumed from the queue or topic.