jasonmcintosh / rabbitmq-zabbix

Zabbix RabbitMQ Configuration
Apache License 2.0

How to monitor queues with different length? #78

Closed by AviPairupuni 6 years ago

AviPairupuni commented 6 years ago

Hi,

We monitor different queues that belong to different customers, and we need alerts at different queue lengths. (Some can be a warning at 1500; others should be critical at 200.) Is it possible to achieve this? Please help me.

Thanks, Avinasha

jasonmcintosh commented 6 years ago

I'd need to play with this, but a few things come to mind...

Easiest IF IT WORKS

Note, I've NOT tried this, but it MIGHT work. You could disable the discovery triggers and create your own on specific queues. I THINK you can create a manual trigger on an auto-discovered entry. Not sure if updates to the discovery would affect these manual triggers or re-enable the disabled triggers. BUT this would be really easy to test.

Harder/detailed solution...

If that doesn't work, you're looking at doing more modifications to allow overrides per queue. A sample implementation is below. Feel free to work on this issue; pull requests welcome ;)

Basics:

1) Modify list_queues.sh and the auto-discovery rule to set a trigger expression. Currently it's:

{Template App RabbitMQ v3:rabbitmq.queues[{#VHOSTNAME},queue_messages,{#QUEUENAME}].last(0)}>{$RABBIT_QUEUE_MESSAGES_CRIT}

It would likely need to be (note it's the # that changes this to a variable from discovery data vs. a variable defined at the node/global level):

{Template App RabbitMQ v3:rabbitmq.queues[{#VHOSTNAME},queue_messages,{#QUEUENAME}].last(0)}>{#RABBIT_QUEUE_MESSAGES_CRIT}

2) You also need to pass back the amounts to alert on. See https://github.com/jasonmcintosh/rabbitmq-zabbix/blob/master/scripts/rabbitmq/api.py#L54 for setting "#RABBIT_QUEUE_MESSAGES_CRIT". Right now, you'd have to do it PER queue. You could create a lookup map for overrides (e.g. check whether the map contains the queue name; if so, get the warn/crit values, otherwise return the defaults). Store the map in the config file, similar to the filter values, to allow custom overrides per queue.

3) Repeat this for the other triggers (e.g. the warning ones).

Note, this is ONE possible method - not necessarily the right method.
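The lookup-map idea in step 2 could be sketched roughly as below. This is NOT the actual api.py code: the function names, the default numbers, the override map, and the {#RABBIT_QUEUE_MESSAGES_WARN} macro are all made up for illustration, and it only assumes the discovery data is the usual Zabbix low-level discovery JSON with a "data" list.

```python
import json

# Hypothetical defaults and per-queue overrides. In practice these might be
# loaded from the same config file that holds the filter values.
DEFAULT_THRESHOLDS = {"warn": 1000, "crit": 2000}
QUEUE_OVERRIDES = {
    "reports": {"warn": 100, "crit": 200},
    "bulk_import": {"warn": 1500},  # crit falls back to the default
}


def thresholds_for(queue_name, overrides=QUEUE_OVERRIDES, defaults=DEFAULT_THRESHOLDS):
    """Return warn/crit for a queue: override values if present, else defaults."""
    merged = dict(defaults)
    merged.update(overrides.get(queue_name, {}))
    return merged


def discovery_entry(vhost, queue_name):
    """Build one discovery record carrying per-queue threshold macros."""
    limits = thresholds_for(queue_name)
    return {
        "{#VHOSTNAME}": vhost,
        "{#QUEUENAME}": queue_name,
        # Macro names below follow the trigger expression in step 1;
        # the WARN one is an assumed counterpart for the warning trigger.
        "{#RABBIT_QUEUE_MESSAGES_WARN}": limits["warn"],
        "{#RABBIT_QUEUE_MESSAGES_CRIT}": limits["crit"],
    }


def discovery_payload(queues):
    """Serialize (vhost, queue) pairs as a low-level discovery JSON document."""
    return json.dumps({"data": [discovery_entry(v, q) for v, q in queues]})


if __name__ == "__main__":
    print(discovery_payload([("/", "reports"), ("/", "orders")]))
```

Queues without an entry in the map simply get the defaults, so only the customers that need special limits have to be listed.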

AviPairupuni commented 6 years ago

Hi Jason,

Thank you for a very detailed answer. I would go with the harder solution. I will update if I succeed in doing this.

Regards, Avinasha

jasonmcintosh commented 6 years ago

NP - if I can offer advice, happy to do so :) I don't do a whole lot of monitoring anymore - very cloud/architecture focused these days vs. Zabbix :)

AviPairupuni commented 6 years ago

Jason,

Thank you for the useful advice you gave earlier. While I was fiddling with the scripts and trigger expressions, a colleague of mine solved it with a much simpler method. Here it is:

Steps at agent end:

• Log in on the agent as root or with sudo permissions.

• Go to “/etc/zabbix/zabbix_agentd.d/zabbix-rabbitmq.conf” and add one of the lines below:

o UserParameter=rabbitmq.<queue_name>_messages,sudo /usr/sbin/rabbitmqctl -n rabbit list_queues name messages | grep '<queue_name>' | head -1 | awk '{ print $2 }'
[If there are multiple queues whose names share a common part, e.g. inet, inetout_error (inet is the common part)]

o UserParameter=rabbitmq.<queue_name>_messages,sudo /usr/sbin/rabbitmqctl -n rabbit list_queues name messages | grep '<queue_name>' | awk '{ print $2 }'
[For a unique queue name]

[Replace <queue_name> with the queue name. For example, if the queue name is "reports", then: UserParameter=rabbitmq.reports_messages,sudo /usr/sbin/rabbitmqctl -n rabbit list_queues name messages | grep 'reports' | awk '{ print $2 }']

• Restart the agent service using “service zabbix-agent restart”.

Steps at template end:

• Create an item with the item key “rabbitmq.<queue_name>_messages”.

• Create a trigger based on your customized thresholds.
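Because grep matches substrings, a pattern like 'inet' also hits inetout_error, which is why the first variant needs head -1. A small exact-match alternative could look like the sketch below. The function name and the sample listing are illustrative only, and it assumes each data row of the rabbitmqctl output ends with the message count as its last whitespace-separated field.

```python
def queue_message_count(listing, queue_name):
    """Return the message count for exactly `queue_name`, or None if absent."""
    for line in listing.splitlines():
        # Split off the last field so the name may itself contain spaces.
        fields = line.rsplit(None, 1)
        # Skip banner/header lines: keep only rows ending in an integer.
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        name, messages = fields
        if name == queue_name:
            return int(messages)
    return None


# Illustrative listing, not captured from a real broker.
SAMPLE_LISTING = "Listing queues ...\ninet\t12\ninetout_error\t3\nreports\t250\n"

if __name__ == "__main__":
    print(queue_message_count(SAMPLE_LISTING, "inet"))
```

With exact matching, inet and inetout_error resolve to their own counts without relying on the order of the listing.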

Regards, Avi

jasonmcintosh commented 6 years ago

lol that'd definitely work ;) The biggest disadvantage is that you're adding a lot of extra work with the grep/awk statements, but at the end of the day that's relatively minor. You WILL increase the amount of work on the RabbitMQ side, but if you don't have huge numbers of queues this should also be minor.

SO yep, definitely a way to go about this :)