sensu / sensu-transport

The Sensu transport abstraction library.
MIT License
14 stars 19 forks source link

[Feature Req]: Make RabbitMQ auto_delete feature user configurable #36

Open syepes opened 8 years ago

syepes commented 8 years ago

As mentioned in the sensu-users group, the RabbitMQ auto_delete functionality (L82, L135) should be user configurable.

Pros:

Cons:

syepes commented 8 years ago

PR #37

warmfusion commented 8 years ago

I'd propose that it becomes possible to pass additional queue options such that queues can be configured with auto-delete: false but also x-expires: 60000 (ie; dont shutdown the queue, but keep messages around for 60 seconds before deleting them)

This allows sensu-servers to be restarted concurrently, without losing any events, but for longer periods of server downtime the queues are not filled.

syepes commented 8 years ago

@warmfusion I would rather leave the RabbitMQ specific stuff managed in RabbitMQ itself, so the config is as lean and generic as possible.

The x-expires can be handled by RabbitMQ using the policies: rabbitmqctl set_policy expiry "^(results$|keepalives$)" '{"expires":1800000}' --apply-to queues

Does anyone else have an opinion on this?

warmfusion commented 8 years ago

This is true - a RMQ Policy does make more sense to control this behaviour.

I've been trying to test this on my environment,but I'm trying to setup federated queues with HA on results/keepalives and its not playing ball - the queues just keep flapping around as the downstream clusters try and setup the queues on upstream which appear to then remove them (or something else is going on).

The RMQ Policy is the right approach - as it allows for modification of the queue expiry based on name and other features - allowing for clever tricks with subscriptions that are considered 'queueable for later'

warmfusion commented 8 years ago

Assuming that the policy can work given the problem i'm having with a federated cluster ( https://groups.google.com/forum/#!topic/rabbitmq-users/TcWl_xTd9Xg )

warmfusion commented 8 years ago

I applied some simple patches to sensu (Changed auto_delete to false) on the creation of any queues, stopped all the server and api processes, restarted and watched the queues get created, bound to the exchanges, but the sensu-api service didn't start properly (Just indicated a connection failure over and over)

It doesn't feel like a 'clean' change as I'd like to specify a queue TTL so that queues for subscriptions can expire gracefully, and the rabbitmq policy set to configure these values did not function across my federated environment.

So it seems right now that, for me at least,

  1. Sensu-Transport should set auto_delete: false
  2. Sensu-Transport should allow for setting the x-expires header so queues clean up automatically

Unless there is another option available?

syepes commented 8 years ago

@warmfusion, well that is strange I have fully tested out this patch using the following stack: (https://github.com/syepes/sensu-docker-stack) and it works.

Your issues must be related with the federation.

warmfusion commented 8 years ago

I'm fairly sure it does indeed exist as a quirk of the federation; queues are created automatically on upstream clusters using the same configuration options as the source - without auto-binding to the same exchanges.

At this time I've given up trying to get Sensu servers operating on federated RabbitMQ clusters using federated queues - federating exchanges results in event duplication but this can't be helped given how federated queues don't auto-bind to exchanges (meaning the results queue never get results from the result exchange the sensu-clients send to.)

Disappointing - but nothing Sensu can do about the lack of auto-binding queues on federation upstreams.

Auto-Delete configuration would still be a useful feature; it allows for buffering service outage and with a TTL policy on queues (or perhaps even messages) you'd avoid overloading queues.

syepes commented 8 years ago

@warmfusion Thanks for the details on the federation, I will try to add to this patch the possibility of handling the TTL policy on the queues.