Open arm4b opened 5 years ago
`kombu` supports that configuration, and we'll need to expose it in `st2.conf`. However, more testing with heartbeat enabled is required to make sure the way st2 uses the RMQ client is correct.
In a quick dev/testing environment where the MQ heartbeat was enabled via the URI connection string (https://www.rabbitmq.com/uri-query-parameters.html), the components `st2actionrunner`, `st2scheduler`, `st2workflowengine` and `st2notifier` (which are all part of the common https://github.com/StackStorm/st2/tree/master/st2actions group) get disconnected by the heartbeat mechanism while they're in a running state, yet respond to heartbeats fine while they're idle. This looks like some kind of concurrency/threading/pool starvation.
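For anyone reproducing this, here is a minimal sketch of adding the heartbeat to an AMQP connection URI, using only the Python standard library. The helper name `with_heartbeat` is hypothetical and is not part of st2 or kombu. `heartbeat` is the query parameter listed on the RabbitMQ URI query parameters page linked above. Whether a given client library honors it should be verified separately.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_heartbeat(amqp_uri, seconds):
    """Return amqp_uri with its `heartbeat` query parameter set to `seconds`.

    Hypothetical helper for testing purposes; not an st2 or kombu API.
    """
    if seconds < 0:
        raise ValueError("heartbeat interval must be >= 0")

    parts = urlsplit(amqp_uri)
    # Keep every existing query parameter except a previous heartbeat value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "heartbeat"
    ]
    query.append(("heartbeat", str(seconds)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Example credentials and host are placeholders.
    print(with_heartbeat("amqp://guest:guest@127.0.0.1:5672/", 10))
```

The resulting string could then be used as the message bus URL in a test environment to observe the disconnects described above.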
It sounds like, if heartbeat is enabled and negotiated with the server, we need a thread in each st2 component that iterates over all established connections to RabbitMQ and regularly calls the `heartbeat_check()` method on each.
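The idea above could look roughly like the sketch below. It is not st2 code: the class name `HeartbeatLoop` and its parameters are hypothetical, and the connections are assumed to be objects exposing a `heartbeat_check()` method, as kombu's `Connection` does. It also assumes that calling `heartbeat_check()` from a separate thread is safe for the connection in use, which would need to be confirmed, especially since st2 services run under eventlet.

```python
import threading


class HeartbeatLoop(threading.Thread):
    """Periodically call heartbeat_check() on every known connection.

    Hypothetical sketch; `connections` is any iterable of objects that
    expose a heartbeat_check() method.
    """

    def __init__(self, connections, interval=1.0):
        super().__init__(daemon=True)
        self._connections = connections
        self._interval = interval
        self._stop_event = threading.Event()
        self.failures = []  # (connection, exception) pairs seen so far

    def run(self):
        # Event.wait() returns False on timeout, True once stop() is called.
        while not self._stop_event.wait(self._interval):
            # Copy the collection so callers may add/remove connections.
            for conn in list(self._connections):
                try:
                    conn.heartbeat_check()
                except Exception as exc:
                    # A failed check usually means the connection is dead;
                    # record it so the owner can reconnect.
                    self.failures.append((conn, exc))

    def stop(self):
        self._stop_event.set()
```

With kombu, `heartbeat_check()` takes a `rate` argument (default 2) and is meant to be called about `rate` times per heartbeat interval, so for a 10 second heartbeat an `interval` of 5 seconds would be a reasonable starting point.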
Is there any progress on this? I encountered a problem that is probably related to the heartbeat. Because there is no heartbeat between the kombu client in st2 and the MQ server, stale state may be left on the MQ server side, leading to errors such as "ResourceLocked: Queue.declare: (405) RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'st2.trigger.watch.St2Timer-ddeda79de6' in vhost '/'. It could be originally declared on another connection or the exclusive property value does not match that of the original declaration."
I’m not aware of anyone working on this.
If anyone is interested in digging into this problem and helping with the implementation, code contributions are welcome!
We're experiencing MQ issues in an ST2 HA environment where st2 services can be rescheduled/restarted/killed at random, which is normal in K8s. For example, after some time RMQ can end up reporting more consumers than expected for some queues, which is undesired.
This probably points to issues in the st2 code and in how clients interact with the message bus. It's not clear whether the duplicated clients are in some zombie state and actually consuming/wasting incoming messages or not.
To work better with MQ and to monitor connections, add support for a RabbitMQ `heartbeats` setting in `st2.conf`. This way, clients that fail to reply to a heartbeat within the set interval are forcefully disconnected by the server. See https://www.rabbitmq.com/heartbeats.html
This will improve overall ST2 HA capabilities.