untergeek / zabbix-grab-bag

This is a collection of miscellaneous scripts for Zabbix data collection, maintenance, etc.

No data from trapper items #10

Closed mdiorio closed 8 years ago

mdiorio commented 8 years ago

Hi,

If I can get this to work, this is the best use for Zabbix and Elastic monitoring. I get Zabbix agent results back for master_node and cluster health status, but none of the Zabbix Trapper items are returning any data.

In the es_stats_zabbix.log I'm seeing the following:

2016-03-28 11:21:22,202 DEBUG es_stats_zabbix batch:131 Batch mode with named batch: thirty_seconds
2016-03-28 11:21:22,203 DEBUG es_stats_zabbix batch:137 Batch config args: {'key3': 'health[initializing_shards]', 'key2': 'health[relocating_shards]', 'key1': 'health[unassigned_shards]', 'server': 'ghq-1pmonap01.globalspec.net', 'host': 'ghq-1delasticnode01.globalspec.net', 'key4': 'health[delayed_unassigned_shards]', 'port': '10051'}
2016-03-28 11:21:22,203 DEBUG es_stats_zabbix batch:143 Batch keys: {'key3': 'health[initializing_shards]', 'key2': 'health[relocating_shards]', 'key1': 'health[unassigned_shards]', 'key4': 'health[delayed_unassigned_shards]'}
2016-03-28 11:21:22,204 DEBUG es_stats_zabbix.utils parse_key:142 API: health Node: None Key: initializing_shards
2016-03-28 11:21:22,204 DEBUG es_stats_zabbix.utils parse_key:142 API: health Node: None Key: relocating_shards
2016-03-28 11:21:22,204 DEBUG es_stats_zabbix.utils parse_key:142 API: health Node: None Key: unassigned_shards
2016-03-28 11:21:22,204 DEBUG es_stats_zabbix.utils parse_key:142 API: health Node: None Key: delayed_unassigned_shards
2016-03-28 11:21:22,204 DEBUG es_stats_zabbix batch:150 API-separated keys: {'nodeinfo': [], 'clusterstate': [], 'health': [('health', None, 'initializing_shards'), ('health', None, 'relocating_shards'), ('health', None, 'unassigned_shards'), ('health', None, 'delayed_unassigned_shards')], 'nodestats': [], 'clusterstats': []}
2016-03-28 11:21:22,205 DEBUG urllib3.util.retry from_int:155 Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2016-03-28 11:21:22,208 DEBUG urllib3.connectionpool _make_request:383 "GET /_cluster/health HTTP/1.1" 200 390
2016-03-28 11:21:22,208 INFO elasticsearch log_request_success:63 GET http://127.0.0.1:9200/_cluster/health [status:200 request:0.003s]
2016-03-28 11:21:22,209 DEBUG elasticsearch log_request_success:65 > None
2016-03-28 11:21:22,209 DEBUG elasticsearch log_request_success:66 <
{"cluster_name":"artemis_dev","status":"yellow","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":62,"active_shards":62,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":62,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}
2016-03-28 11:21:22,210 DEBUG es_stats_zabbix batch:165 Metrics: [Metric('ghq-1delasticnode01.globalspec.net', 'health[initializing_shards]', 0), Metric('ghq-1delasticnode01.globalspec.net', 'health[relocating_shards]', 0), Metric('ghq-1delasticnode01.globalspec.net', 'health[unassigned_shards]', 62), Metric('ghq-1delasticnode01.globalspec.net', 'health[delayed_unassigned_shards]', 0)]
**2016-03-28 11:21:22,214 DEBUG zbxsender send_to_zabbix:58 Got response from Zabbix: {u'info': u'processed: 0; failed: 4; total: 4; seconds spent: 0.000018', u'response': u'success'}
2016-03-28 11:21:22,214 INFO zbxsender send_to_zabbix:59 processed: 0; failed: 4; total: 4; seconds spent: 0.000018**
2016-03-28 11:21:22,214 DEBUG es_stats_zabbix batch:167 Result = True
2016-03-28 11:21:22,215 INFO es_stats_zabbix batch:170 Job completed.

In zabbix server log set to debug, I do see:

 9185:20160328:112635.725 __zbx_zbx_setproctitle() title:'trapper #4 [processing data]'
 9185:20160328:112635.725 trapper got '{
    "request":"sender data",
    "data":[
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[initializing_shards]",
            "value":0,
            "clock":1459178782.36},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[relocating_shards]",
            "value":0,
            "clock":1459178782.36},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[unassigned_shards]",
            "value":62,
            "clock":1459178782.36},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[delayed_unassigned_shards]",
            "value":0,
            "clock":1459178782.36}]
}'

Which tells me that it's getting the data, but when I view the last data for the host, there is no data there.
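
The response quoted in the log can be easy to misread: as far as the sender protocol goes, `response: success` appears to mean only that the server accepted and parsed the request, while the `info` string carries the per-value outcome. A minimal Python sketch that reads that string follows. The helper names `parse_sender_info` and `all_values_accepted` are hypothetical and are not part of zbxsend or es_stats_zabbix.

```python
def parse_sender_info(info):
    """Split a trapper 'info' string into a dict of numeric fields.

    Example input: 'processed: 0; failed: 4; total: 4; seconds spent: 0.000018'
    """
    fields = {}
    for part in info.split(";"):
        name, _, value = part.strip().partition(":")
        fields[name.strip()] = float(value)
    return fields


def all_values_accepted(response):
    """True only if the server reports success AND zero failed values."""
    stats = parse_sender_info(response["info"])
    return response.get("response") == "success" and stats["failed"] == 0


# The response copied from the es_stats_zabbix.log excerpt in this thread.
response = {
    "info": "processed: 0; failed: 4; total: 4; seconds spent: 0.000018",
    "response": "success",
}
print(all_values_accepted(response))  # False: all 4 values were rejected
```

A check like this treats the `failed` counter, not the `response` field, as the signal that nothing reached the items.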

Any ideas? I've only been using Zabbix for two weeks as I'm evaluating new monitoring solutions, and this is really my first major hurdle for it.

Thanks.

untergeek commented 8 years ago

Which versions of Elasticsearch and Zabbix are you using? I'll do some digging as soon as I can, which may not be immediately. :frowning:

mdiorio commented 8 years ago

Elastic 2.2.0, Zabbix 3.0.1

Thanks! I'm cross-posting in the Zabbix forum, and I see someone else just complained that they are not getting Trapper data in 3.0.1. Might not be you :)

mdiorio commented 8 years ago

After a bit more research, the problem may be with this script. I am able to send values successfully via zabbix_sender and pull the data in with a trapper item. I'm not sure whether the script is sending the data in the proper format.
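
To compare what a script sends against what zabbix_sender sends, it can help to build the trapper request by hand. The sketch below assumes the documented sender framing (the `ZBXD` signature, a flag byte of 1, and an 8-byte little-endian body length) and an integer `clock`. The function name `build_sender_packet` is made up for illustration, and this is not a description of how zbxsend is implemented.

```python
import json
import struct
import time


def build_sender_packet(host, values, clock=None):
    """Frame a 'sender data' request for a Zabbix trapper port.

    values maps item keys to values, e.g. {'health[unassigned_shards]': 62}.
    """
    if clock is None:
        clock = int(time.time())  # integer seconds (an assumption of this sketch)
    payload = {
        "request": "sender data",
        "data": [
            {"host": host, "key": key, "value": value, "clock": clock}
            for key, value in values.items()
        ],
    }
    body = json.dumps(payload).encode("utf-8")
    # Signature + flag byte, then the body length as a little-endian uint64.
    return b"ZBXD\x01" + struct.pack("<Q", len(body)) + body


packet = build_sender_packet(
    "ghq-1delasticnode01.globalspec.net",
    {"health[unassigned_shards]": 62},
)
print(len(packet), packet[:5])
```

The payload in the server log carries a fractional `clock` (1459178782.36); whether that matters to the server is not established in this thread, so the integer used here is only something to test against.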

mdiorio commented 8 years ago

I resolved at least part of my issue. Direct calls for nodestats were returning no data when I ran:

# ./zabbix_get -s ghq-1delasticnode01.globalspec.net -p 10050 -k nodestats[node-1,process.open_file_descriptors]

The es_stats_zabbix.log file said that no node was specified, and the following line was showing up in the log:

parse_key:142 API: nodestats Node: Key: node-1

I edited es_stats_zabbix.userparm:

UserParameter=nodestats[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/es_stats_zabbix.ini single nodestats[$1,$2]

This allows parsing both the node name and the key.
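
With a flexible `nodestats[*]` user parameter, the agent substitutes `$1` and `$2` with the first and second parameters of the requested key, so passing both through gives the script a node name and a metric key. The Python sketch below shows one way such a key could be split. It is a hypothetical parser written for illustration (`split_item_key` is not the `parse_key` function in es_stats_zabbix), and it mirrors the log format above, where a key with a single parameter has node `None`.

```python
import re

# api[params] where params is one or two comma-separated values.
KEY_PATTERN = re.compile(r"^(?P<api>\w+)\[(?P<params>.*)\]$")


def split_item_key(item_key):
    """Return (api, node, key) for keys like 'nodestats[node-1,process.cpu]'."""
    match = KEY_PATTERN.match(item_key)
    if match is None:
        raise ValueError("not an api[params] key: %r" % item_key)
    params = [p.strip() for p in match.group("params").split(",")]
    if len(params) == 1:
        # Cluster-wide APIs such as health take no node name.
        return match.group("api"), None, params[0]
    if len(params) == 2:
        return match.group("api"), params[0], params[1]
    raise ValueError("expected one or two parameters: %r" % item_key)


print(split_item_key("nodestats[node-1,process.open_file_descriptors]"))
print(split_item_key("health[initializing_shards]"))
```

If only `$1` is passed through, a parser of this shape sees a single parameter and has no way to tell a node name from a metric key, which would be consistent with the `Node: Key: node-1` log line.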

Still not sure about the batch processing at this point though.

untergeek commented 8 years ago

I fixed a bug that may be related to this, @mdiorio

Update es_stats_zabbix (it should force upgrade es_stats as well) via pip:

(sudo) pip install -U es_stats_zabbix

Let me know if this fixes the problem. Both "get" as a key/subkey, and hyphenated node names (or keys) were causing problems. Both have been fixed.

mdiorio commented 8 years ago

Looks like updating es_stats_zabbix blew up all my Elastic Monitoring

Received value [Traceback (most recent call last):
  File "/usr/bin/es_stats_zabbix", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(requires)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: elasticsearch>=1.6] is not suitable for value type [Numeric (unsigned)] and data type [Decimal]

mdiorio commented 8 years ago

[root@ bin]# pip show elasticsearch
You are using pip version 7.1.0, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Metadata-Version: 2.0
Name: elasticsearch
Version: 2.3.0
Summary: Python client for Elasticsearch
Home-page: https://github.com/elastic/elasticsearch-py
Author: Honza Král
Author-email: honza.kral@gmail.com
License: Apache License, Version 2.0
Location: /usr/lib/python2.6/site-packages
Requires: urllib3

So elasticsearch is already >= 1.6.0, so why is es_stats_zabbix/pkg_resources.py complaining that it's not found?

untergeek commented 8 years ago

What distro are you using? Looks like RHEL or CentOS 6

mdiorio commented 8 years ago

RHEL 6.7

untergeek commented 8 years ago

The reason I ask is that I literally did nothing to the es_stats_zabbix code. I only changed the upstream dependencies, some docs, and copyright information.

untergeek commented 8 years ago

It seems likely that some python thing or other has happened.

untergeek commented 8 years ago

I just spun up a CentOS 6 image in Docker, and installed pip. This is what I did next:

# pip install es_stats_zabbix
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
You are using pip version 7.1.0, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting es-stats-zabbix
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading es_stats_zabbix-0.1.4.tar.gz
Collecting elasticsearch>=1.6.0 (from es-stats-zabbix)
  Downloading elasticsearch-2.3.0-py2.py3-none-any.whl (51kB)
    100% |████████████████████████████████| 53kB 847kB/s
Collecting es-stats>=0.2.1 (from es-stats-zabbix)
  Downloading es_stats-0.2.1.tar.gz
Collecting click>=3.3 (from es-stats-zabbix)
  Downloading click-6.6.tar.gz (283kB)
    100% |████████████████████████████████| 286kB 2.0MB/s
Collecting zbxsend>=0.1.6 (from es-stats-zabbix)
  Downloading zbxsend-0.1.6.tar.gz
Collecting kaptan>=0.5.8 (from es-stats-zabbix)
  Downloading kaptan-0.5.8.tar.gz
Collecting urllib3<2.0,>=1.8 (from elasticsearch>=1.6.0->es-stats-zabbix)
  Downloading urllib3-1.16-py2.py3-none-any.whl (98kB)
    100% |████████████████████████████████| 102kB 6.5MB/s
Collecting dotmap>=1.1.2 (from es-stats>=0.2.1->es-stats-zabbix)
  Downloading dotmap-1.1.16.tar.gz
Collecting PyYAML (from kaptan>=0.5.8->es-stats-zabbix)
  Downloading PyYAML-3.11.zip (371kB)
    100% |████████████████████████████████| 372kB 1.5MB/s
Installing collected packages: urllib3, elasticsearch, dotmap, es-stats, click, zbxsend, PyYAML, kaptan, es-stats-zabbix
  Running setup.py install for dotmap
  Running setup.py install for es-stats
  Running setup.py install for click
  Running setup.py install for zbxsend
  Running setup.py install for PyYAML
  Running setup.py install for kaptan
  Running setup.py install for es-stats-zabbix
Successfully installed PyYAML-3.11 click-6.6 dotmap-1.1.16 elasticsearch-2.3.0 es-stats-0.2.1 es-stats-zabbix-0.1.4 kaptan-0.5.8 urllib3-1.16 zbxsend-0.1.6
[root@11d14b7cea85 /]# es_stats_zabbix
Usage: es_stats_zabbix [OPTIONS] COMMAND [ARGS]...

  Get Elasticsearch stats from es_stats and send them to Zabbix

Options:
  --configuration TEXT  Path to configuration file
  --debug               Debug mode
  --version             Show the version and exit.
  --help                Show this message and exit.

Commands:
  batch   Batch mode
  single  Single item mode

untergeek commented 8 years ago

You might try uninstalling and reinstalling the es_stats, es_stats_zabbix, and elasticsearch python modules.

mdiorio commented 8 years ago

No luck with an uninstall and re-install. Everything was working before I ran pip install -U es_stats_zabbix.

Removed the packages, cleared the cache dir, even told it not to use the cache dir.

pip install es_stats_zabbix --no-cache-dir

You are using pip version 7.1.0, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting es-stats-zabbix
/usr/lib/python2.6/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading es_stats_zabbix-0.1.4.tar.gz
Collecting elasticsearch>=1.6.0 (from es-stats-zabbix)
  Downloading elasticsearch-2.3.0-py2.py3-none-any.whl (51kB)
    100% |████████████████████████████████| 53kB 1.0MB/s
Collecting es-stats>=0.2.1 (from es-stats-zabbix)
  Downloading es_stats-0.2.1.tar.gz
Requirement already satisfied (use --upgrade to upgrade): click>=3.3 in /usr/lib/python2.6/site-packages (from es-stats-zabbix)
Requirement already satisfied (use --upgrade to upgrade): zbxsend>=0.1.6 in /usr/lib/python2.6/site-packages (from es-stats-zabbix)
Requirement already satisfied (use --upgrade to upgrade): kaptan>=0.5.8 in /usr/lib/python2.6/site-packages (from es-stats-zabbix)
Requirement already satisfied (use --upgrade to upgrade): urllib3<2.0,>=1.8 in /usr/lib/python2.6/site-packages (from elasticsearch>=1.6.0->es-stats-zabbix)
Requirement already satisfied (use --upgrade to upgrade): dotmap>=1.1.2 in /usr/lib/python2.6/site-packages (from es-stats>=0.2.1->es-stats-zabbix)
Requirement already satisfied (use --upgrade to upgrade): PyYAML in /usr/lib64/python2.6/site-packages (from kaptan>=0.5.8->es-stats-zabbix)
Installing collected packages: elasticsearch, es-stats, es-stats-zabbix
  Running setup.py install for es-stats
  Running setup.py install for es-stats-zabbix
Successfully installed elasticsearch-2.3.0 es-stats-0.2.1 es-stats-zabbix-0.1.4

[root@ghq-1delasticnode02 ~]# es_stats_zabbix
Traceback (most recent call last):
  File "/usr/bin/es_stats_zabbix", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(requires)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: elasticsearch>=1.6.0

untergeek commented 8 years ago

What version of setuptools is on your machine? RHEL6 is notoriously bad about shipping a very outdated version. The entry points issue is usually fixed by updating setuptools.

mdiorio commented 8 years ago

setuptools (0.6rc11)

untergeek commented 8 years ago

Upgrade that, please. It's up to 0.23.1 right now.

mdiorio commented 8 years ago

wow. Just upgraded and es_stats_zabbix runs now. Ugh. Thanks.

untergeek commented 8 years ago

You're welcome! Sorry it was a painful upgrade.

mdiorio commented 8 years ago

No problem at all. Have to add a version check for setuptools :)
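
A version check of the kind mentioned here could look roughly like the Python sketch below. It compares only the leading numeric part of a version string, so a pre-release suffix such as `rc11` is ignored. The function names and the minimum version passed in the example are placeholders chosen for illustration, not a requirement stated by es_stats_zabbix.

```python
import re


def version_tuple(version):
    """Leading numeric components only: '0.6rc11' -> (0, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def is_older_than(installed, minimum):
    """Compare two versions after padding them to the same length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


# '18.0' is a hypothetical minimum, used only to exercise the check.
if is_older_than("0.6rc11", "18.0"):
    print("setuptools looks too old; consider upgrading it first")
```

In a real installer the installed version would come from the environment rather than a literal string; that lookup is left out to keep the sketch self-contained.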

mdiorio commented 8 years ago

Hi Aaron,

Is there a logging level less than INFO for es_stats_zabbix, or a maximum log file size? It's absolutely blowing up our log path. 367MB in about 10 minutes.

We're currently collecting about 150 metrics very frequently during our development phase. I'm sure batch mode will help, and I'm going to test that out as soon as I get an opportunity.

Thanks.

Max

On Fri, Jun 24, 2016 at 4:28 PM, Aaron Mildenstein <notifications@github.com> wrote:

Closed #10 https://github.com/untergeek/zabbix-grab-bag/issues/10.


untergeek commented 8 years ago

Whoa. I'll see if I can push an update for that. Meanwhile, see if logrotate can help you survive until then.
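
Both requests (a quieter level and a size cap) map onto Python's standard `logging` module. The sketch below is a generic illustration, not the logging setup es_stats_zabbix actually uses: the logger name, the function name `build_capped_logger`, and the default sizes are all assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_capped_logger(path, max_bytes=10 * 1024 * 1024, backups=3,
                        level=logging.WARNING):
    """Return a logger that drops records below `level` and rotates by size."""
    logger = logging.getLogger("capped_example")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False
    # Replace any handlers left over from an earlier call.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_capped_logger("example.log", max_bytes=1024 * 1024)
    log.info("dropped: below WARNING")
    log.warning("kept: written to example.log")
```

With `backupCount=3`, disk use stays bounded at roughly four files of `max_bytes` each, which is the same effect logrotate would give from outside the process.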

mdiorio commented 8 years ago

Looks like the bug you squashed isn't related. After re-enabling batch mode, I am still seeing the same errors in the Zabbix server logs:


 91309:20160628:160516.175 __zbx_zbx_setproctitle() title:'trapper #5 [processing data]'
 91309:20160628:160516.175 trapper got '{
    "request":"sender data",
    "data":[
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[initializing_shards]",
            "value":0,
            "clock":1467144305.74},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[relocating_shards]",
            "value":0,
            "clock":1467144305.74},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[unassigned_shards]",
            "value":0,
            "clock":1467144305.74},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"health[delayed_unassigned_shards]",
            "value":0,
            "clock":1467144305.74},
        {
            "host":"ghq-1delasticnode01.globalspec.net",
            "key":"nodestats[process.open_file_descriptors]",
            "value":831,
            "clock":1467144305.74}]
}'
 91309:20160628:160516.175 In recv_agenthistory()
 91309:20160628:160516.175 In process_hist_data()
 91309:20160628:160516.175 End of process_hist_data():SUCCEED
 91309:20160628:160516.175 In zbx_send_response()
 91309:20160628:160516.175 zbx_send_response() '{"response":"success","info":"processed: 0; failed: 5; total: 5; seconds spent: 0.000021"}'
 91309:20160628:160516.175 End of zbx_send_response():SUCCEED
 91309:20160628:160516.175 End of recv_agenthistory()
 91309:20160628:160516.175 __zbx_zbx_setproctitle() title:'trapper #5 [processed data in 0.000242 sec, waiting for connection]'

And the same issue in the zabbix_es_log.log file

2016-06-28 15:57:44,144 INFO processed: 0; failed: 5; total: 5; seconds spent: 0.000021

I'll have to check in with Zabbix to see if they fixed any bugs on their end.

untergeek commented 8 years ago

That's odd. And the keys & host names match up properly?

untergeek commented 8 years ago

I will try to reproduce on my end.

mdiorio commented 8 years ago

Yes - this is all based on templated configurations. Host and keys do match.