untergeek / zabbix-grab-bag

This is a collection of miscellaneous scripts for Zabbix data collection, maintenance, etc.

Batch mode problems #8

Closed 4erpak closed 8 years ago

4erpak commented 9 years ago

Houston, we have a problem :-) When I disable only this record in the userparameter file:

UserParameter=es_stats_batch[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini batch --name $1

no new data (health, clusterstats, nodeinfo, nodestats) appears in Zabbix, except for the zabbix-agent check "health[status]". I don't want to monitor these batch commands. I lost a whole day trying to figure out what's going on, because I don't see any useful info in the logs:

2015-11-05 06:53:22,421 INFO elasticsearch log_request_success:63 GET http://127.0.0.1:9200/ [status:200 request:0.002s]
2015-11-05 06:53:22,421 DEBUG elasticsearch log_request_success:65 > None
2015-11-05 06:53:22,421 DEBUG elasticsearch log_request_success:66 < {
  "status" : 200,
  "name" : "logs",
  "version" : {
    "number" : "1.1.2", (Yeah we use so old elastic :) )
    "build_hash" : "e511f7b28b77c4d99175905fac65bffbf4c80cf7",
    "build_timestamp" : "2014-05-22T12:27:39Z",
    "build_snapshot" : false,
    "lucene_version" : "4.7"
  },
  "tagline" : "You Know, for Search"
}

2015-11-05 06:53:22,421 DEBUG es_stats_zabbix.utils check_version:77 Detected Elasticsearch version 1.1.2
2015-11-05 06:53:22,422 INFO es_stats_zabbix single:114 Single key: health[status]
2015-11-05 06:53:22,422 DEBUG es_stats_zabbix.utils parse_key:142 API: health Node: None Key: status
2015-11-05 06:53:22,423 DEBUG urllib3.util.retry from_int:155 Converted retries value: False -> Retry(total=False, connect=None, read=None, redirect=0)
2015-11-05 06:53:22,424 DEBUG urllib3.connectionpool _make_request:385 "GET /_cluster/health HTTP/1.1" 200 223

Where is zbxsender? )) ^^

Userparameter output:
UserParameter=health[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single health[$1]
UserParameter=clusterstats[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single clusterstats[$1]
UserParameter=clusterstate[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single clusterstate[$1]
UserParameter=nodestats[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single nodestats[$1]
UserParameter=nodeinfo[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single nodeinfo[$1]

UserParameter=es_stats_batch[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini batch --name $1

es_stats_ini output:
[logging]
;debug setting is overridden by the command-line arg, if present
debug = True
loglevel = DEBUG

;logfile = /path/to/logfile
logfile = /tmp/es_out.log

; can be default or logstash (JSON logging)
logformat = default

But when batch mode is enabled, only the items in these sections [30 seconds, 60 seconds, and five minutes] work. Other items do not work.

Userparameter output:
UserParameter=health[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single health[$1]
UserParameter=clusterstats[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single clusterstats[$1]
UserParameter=clusterstate[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single clusterstate[$1]
UserParameter=nodestats[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single nodestats[$1]
UserParameter=nodeinfo[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini single nodeinfo[$1]
UserParameter=es_stats_batch[*],/usr/bin/es_stats_zabbix --configuration /etc/zabbix/scripts/es_stats_zabbix.ini batch --name $1

And es_ini output:
[thirty_seconds]
server = my_zabbix_server
port = 10051
host = my_monitored_host

key1 = health[unassigned_shards]
key2 = health[relocating_shards]
key3 = health[initializing_shards]
key4 = health[delayed_unassigned_shards]

[sixty_seconds]
server = my_zabbix_server
port = 10051
host = my_monitored_host

key1 = clusterstats[indices.docs.count]
key2 = clusterstats[indices.fielddata.evictions]
key3 = clusterstats[indices.fielddata.memory_size_in_bytes]
key4 = clusterstats[indices.filter_cache.evictions]
key5 = clusterstats[indices.filter_cache.memory_size_in_bytes]

[five_minutes]
server = my_zabbix_server
port = 10051
host = my_monitored_host

key1 = health[number_of_nodes]
key2 = health[active_primary_shards]
key3 = health[active_shards]
key4 = health[number_of_data_nodes]
key5 = clusterstats[indices.store.size_in_bytes]
key6 = clusterstats[indices.count]

When these batch settings are commented out, nothing works again:
;server = my_zabbix_server
;port = 10051
;host = my_monitored_host

logs output:
cli:70 Job starting.
2015-11-05 07:29:54,449 DEBUG es_stats_zabbix cli:71 Logging config args: {'debug': 'False', 'logformat': 'default', 'logfile': '/tmp/es_out.log', 'loglevel': 'DEBUG'}
2015-11-05 07:29:54,449 DEBUG es_stats_zabbix cli:71 Logging config args: {'debug': 'False', 'logformat': 'default', 'logfile': '/tmp/es_out.log', 'loglevel': 'DEBUG'}
2015-11-05 07:29:54,449 DEBUG es_stats_zabbix cli:74 Elasticsearch config args: {'http_auth': None, 'certificate': None, 'host': '127.0.0.1', 'timeout': 10, 'use_ssl': False, 'master_only': False, 'port': 9200, 'ssl_no_validate': False}
2015-11-05 07:29:54,449 DEBUG es_stats_zabbix.utils get_client:99 kwargs = {'http_auth': None, 'certificate': None, 'host': '127.0.0.1', 'timeout': 10, 'use_ssl': False, 'master_only': False, 'port': 9200, 'ssl_no_validate': False}
2015-11-05 07:29:54,450 INFO es_stats_zabbix.utils get_client:100 Initializing Elasticsearch client.
2015-11-05 07:29:54,453 DEBUG es_stats_zabbix.utils check_version:77 Detected Elasticsearch version 1.1.2
2015-11-05 07:29:54,453 DEBUG es_stats_zabbix batch:128 Batch mode with named batch: thirty_seconds [the same for 60 sec and 5min]
2015-11-05 07:29:54,453 DEBUG es_stats_zabbix batch:134 Batch config args: {'key3': 'health[initializing_shards]', 'key1': 'health[unassigned_shards]', 'key2': 'health[relocating_shards]', 'key4': 'health[delayed_unassigned_shards]'}
2015-11-05 07:30:02,401 INFO es_stats_zabbix cli:70 Job starting.

Where are the other metrics (health, clusterstats)? It seems this section is not working:
[batch]
; Zabbix server address
server = my_zabbix_server_address

; Zabbix server port
port = 10051

; Zabbix host (where the items will go)
host = my_monitored_host_address

; Keys can be of any label other than the above
key1 = health[status]
key2 = health[number_of_nodes]
key3 = health[unassigned_shards]
key4 = health[active_primary_shards]
key5 = health[relocating_shards]
key6 = health[active_shards]
key7 = health[initializing_shards]
key8 = health[number_of_data_nodes]
key9 = health[delayed_unassigned_shards]

key10 = clusterstats[indices.store.size_in_bytes]
key11 = clusterstats[indices.count]
key12 = clusterstats[indices.docs.count]
key13 = clusterstats[indices.fielddata.evictions]
key14 = clusterstats[indices.fielddata.memory_size_in_bytes]
key15 = clusterstats[indices.filter_cache.evictions]
key16 = clusterstats[indices.filter_cache.memory_size_in_bytes]

key17 = clusterstate[master_node]

key18 = nodeinfo[logs,jvm.mem.heap_init_in_bytes]
key19 = nodeinfo[logs,process.max_file_descriptors]

key20 = nodestats[logs,process.open_file_descriptors]
key21 = nodestats[logs,jvm.mem.heap_max_in_bytes]
key22 = nodestats[logs,jvm.mem.heap_used_in_bytes]
key23 = nodestats[logs,jvm.mem.heap_used_percent]
key24 = nodestats[logs,jvm.gc.collectors.old.collection_count]
key25 = nodestats[logs,jvm.gc.collectors.old.collection_time_in_millis]

untergeek commented 9 years ago

If you look at the template, only the health[status] and clusterstate[master_node] keys are Zabbix agent items. Everything else is a Zabbix trapper item. These two items do NOT use the zbxsender, as they are agent items and just return their value at the command line for the UserParameter.
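To make the agent/trapper distinction concrete, here is a minimal Python sketch. It is not the es_stats_zabbix code: the names `single_mode`, `batch_mode`, `SAMPLE_HEALTH`, and the `send` callable are all hypothetical, and the sample values are made up. The point is only that an agent item's value is whatever the UserParameter command prints to stdout, while trapper items receive values that are pushed to the server.

```python
import io
import sys

# Hypothetical sample of what a cluster health call might return.
SAMPLE_HEALTH = {"status": "green", "number_of_nodes": 3, "unassigned_shards": 0}


def single_mode(key, stats, out=sys.stdout):
    """Agent-item style: print one value; the Zabbix agent reads stdout."""
    out.write("{}\n".format(stats[key]))


def batch_mode(keys, stats, send):
    """Trapper style: push (item key, value) pairs through `send`.

    `send` stands in for whatever delivers data to the Zabbix server
    (an assumption for this sketch); nothing is printed for the agent.
    """
    payload = [("health[{}]".format(k), stats[k]) for k in keys]
    send(payload)
    return len(payload)


if __name__ == "__main__":
    # Agent item: the value appears on stdout.
    single_mode("status", SAMPLE_HEALTH)
    # Trapper items: here `print` is a stand-in for a real sender.
    batch_mode(["number_of_nodes", "unassigned_shards"], SAMPLE_HEALTH, print)
```

Under this model, a key that exists in the template only as a trapper item never gets a value from a `single` call, because nothing polls that command's output for it.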

You will have to build/change the template to make each of these items Zabbix agent items, if that's the way you want to go. It should be a simple change. Of course, you'll also want to disable or delete the es_stats_batch items, too.

The reason I went with a batch mode is that it's fewer calls to Elasticsearch. It's actually more performant than a separate call for each statistic. If you want to keep the benefits of batch mode, but return fewer keys, you can comment them out in both the ini file and the template.
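The "fewer calls" argument can be sketched roughly as follows. This is an illustration under stated assumptions, not how es_stats_zabbix is implemented: `parse_key`, `group_by_api`, `collect`, `lookup`, and the `fetch` callable are hypothetical names, and the sketch handles only cluster-level keys such as health[...] and clusterstats[...] (node-scoped keys like nodestats[logs,...] are left out). Keys are grouped by API so that each API is fetched once, however many keys it serves.

```python
import re
from collections import defaultdict

# Matches item keys of the form api[args], e.g. health[status].
KEY_RE = re.compile(r"^(\w+)\[(.*)\]$")


def parse_key(item):
    """Split 'api[args]' into (api, args)."""
    match = KEY_RE.match(item)
    if match is None:
        raise ValueError("not an item key: {!r}".format(item))
    return match.group(1), match.group(2)


def group_by_api(items):
    """Map each API name to the list of key paths requested from it."""
    grouped = defaultdict(list)
    for item in items:
        api, args = parse_key(item)
        grouped[api].append(args)
    return dict(grouped)


def lookup(data, dotted):
    """Walk a nested dict using a dotted path such as 'indices.docs.count'."""
    for part in dotted.split("."):
        data = data[part]
    return data


def collect(items, fetch):
    """Call fetch(api) once per API, then resolve every key from that result."""
    results = {}
    for api, arg_list in group_by_api(items).items():
        data = fetch(api)  # one request per API, regardless of key count
        for args in arg_list:
            results["{}[{}]".format(api, args)] = lookup(data, args)
    return results
```

With sixteen health and clusterstats keys, a per-item approach would issue sixteen requests, while grouping like this issues two. Commenting a key out of the ini file simply removes it from `items`.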