randywallace / zabbix-cloudwatch

An external script for getting cloudwatch metrics into Zabbix
MIT License

zabbix item "Not supported" when zabbix-cloudwatch returns "" instead of a numeric value #1

Open paulruiz opened 10 years ago

paulruiz commented 10 years ago

AWS will return an empty datapoints array for a statistic when there's no data. AWS should send a datapoint = 0.0 but doesn't. If the metricname doesn't exist, AWS also returns an empty datapoints array. zabbix-cloudwatch then exits 1 due to NoMethodError: undefined method `[]' at lib/zabbix-cloudwatch.rb:100, which is: puts ret[:datapoints][0][symbol]

At this point the variable "ret" looks like: {:datapoints=>[], :label=>"HTTPCode_ELB_5XX", :response_metadata=>{:request_id=>"e43b600a-c11f-11e3-ac5d-1b6c6fb5046d"}}
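A minimal sketch of why that line fails, using the hash reported above. The key `:sum` is an assumption about what `symbol` holds for `--statistic Sum`; it does not matter for the failure itself, since the lookup never gets that far.

```ruby
# Response hash copied from the report above: the datapoints array is empty.
ret = {
  datapoints: [],
  label: "HTTPCode_ELB_5XX",
  response_metadata: { request_id: "e43b600a-c11f-11e3-ac5d-1b6c6fb5046d" }
}
symbol = :sum # assumption: the key derived from --statistic Sum

# Indexing an empty array does not raise; it quietly yields nil.
first = ret[:datapoints][0]

# nil has no [] method, so the second lookup is what raises NoMethodError.
error = begin
  first[symbol]
  nil
rescue NoMethodError => e
  e
end

puts error.class
```

Because nothing is printed before the exception, Zabbix receives an empty string and marks the item "Not supported".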

Example command:

    zabbix-cloudwatch --namespace AWS/ELB \
      --dimension-name LoadBalancerName \
      --aws-region us-west-2 \
      --dimension-value elb-name \
      --metricname HTTPCode_ELB_5XX \
      --statistic Sum

I was going to send a pull request that checks the length of ret[:datapoints] and prints 0.0 when it is 0, but the datapoints array is also empty if the metricname doesn't exist, so "no data" and "no such metric" can't be told apart. Fucking AWS. I guess it's up to you whether to do anything with this, but I figured it would be good to at least have it documented.
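For reference, the check described above might look roughly like the following. This is a sketch, not code from zabbix-cloudwatch: `datapoint_value` is a hypothetical helper, and `:sum` is an assumed key name. It carries the same caveat as the proposal, since a nonexistent metric would also print the default.

```ruby
# Hypothetical helper, not part of zabbix-cloudwatch.
# Returns the requested statistic from the first datapoint, or a default
# when the datapoints array is empty.
def datapoint_value(ret, symbol, default = 0.0)
  datapoint = ret[:datapoints].first # nil when the array is empty
  datapoint.nil? ? default : datapoint[symbol]
end

# Illustrative inputs; the :sum key is an assumption.
empty = { datapoints: [], label: "HTTPCode_ELB_5XX" }
full  = { datapoints: [{ sum: 3.0 }], label: "HTTPCode_ELB_5XX" }

puts datapoint_value(empty, :sum) # falls back to the default
puts datapoint_value(full, :sum)  # uses the real datapoint
```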

My workaround was to wrap zabbix-cloudwatch in a shell script and echo ${VAL:-0.0} to avoid "Not supported" and crappy looking graphs in zabbix.
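The same fallback can be sketched in Ruby for anyone who prefers a wrapper in the project's own language. `value_or_default` is a hypothetical name, and this only roughly mirrors `${VAL:-0.0}`: it substitutes the default when the captured output is missing or blank.

```ruby
# Hypothetical wrapper helper, roughly mirroring the shell's ${VAL:-0.0}.
# Takes whatever the command printed and substitutes a default when
# nothing usable came back.
def value_or_default(output, default = "0.0")
  value = output.to_s.strip # nil becomes "", surrounding whitespace is dropped
  value.empty? ? default : value
end

puts value_or_default("")       # the command printed nothing
puts value_or_default("12.5\n") # the command printed a value
```

In a real wrapper the argument would be the captured standard output of the zabbix-cloudwatch call, for example via Ruby's backtick syntax.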

Thanks!

randywallace commented 9 years ago

I don't know if I can be much help on this now; I've since moved our entire infrastructure off of Zabbix to Sensu+InfluxDB (which IMHO is much easier to manage). If someone wants to submit a PR to do something about this, tho, I'll be more than happy to merge it in.

Future planning for getting cloudwatch metrics involves, at least on my dev part, writing a power-script that collects a whole bunch of cloudwatch metrics and throws them in InfluxDB for me in some sort of batch step. I've grown to thoroughly hate CloudWatch over the last couple of years, but there are some things I can unfortunately only get from it. Otherwise, I'm going to be writing scripts that just hit the APIs directly for things like SQS queue sizes and whatnot, since CloudWatch is completely unreliable with most things.