ganglia / gmond_python_modules

Repository of user-contributed Gmond Python DSO metric modules
http://sourceforge.net/apps/trac/ganglia/wiki/ganglia_gmond_python_modules

ganglia metrics disappear when ES cluster is busy #161

Open apple-corps opened 10 years ago

apple-corps commented 10 years ago

I'm not sure whether this is specific to the ES module or applies to gmond as a whole. When my Elasticsearch cluster is running a deep search over a large data set, the nodes themselves stay up, but Elasticsearch stops responding to its APIs. During these periods the machines also appear down in Ganglia, even though I can still ssh into the hosts and check load and so on. The hosts and their network interfaces aren't overwhelmed, so I assume that the way gmond_python_modules in general, or the elasticsearch module in particular, is written causes gmond to hang. That is, gmond will not report any other metrics until the ES module can get the ES stats. Is this assumption correct? If so, any suggestions on how to go about fixing this?
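To make the suspected failure mode concrete: if the module fetches stats over HTTP without a timeout, a busy ES node can hold that call open indefinitely. Below is a minimal sketch of a bounded fetch, not the module's actual code. The function name `fetch_es_stats`, the default URL, and the 2-second timeout are assumptions for illustration, and it is written for Python 3 (a gmond embedding Python 2 would need `urllib2` instead).

```python
import json
import urllib.request


def fetch_es_stats(url="http://localhost:9200/_nodes/_local/stats", timeout=2.0):
    """Return parsed ES stats, or None if ES does not answer in time.

    The URL and the timeout are illustrative assumptions. The timeout bounds
    each blocking socket operation (connect, read) rather than the whole call.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections, URLError and socket timeouts;
        # ValueError covers a truncated or non-JSON response body.
        return None
```

A caller can then treat `None` as "no fresh data" and keep reporting the previous values instead of waiting on ES.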

vvuksan commented 9 years ago

This is certainly possible. Which Elasticsearch metric module are you using? It may not be threaded, so it may be blocking.
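For reference, here is a rough sketch of what a threaded module could look like: a background thread polls ES and caches the result, so the metric callback only reads the cache and returns immediately. This is a sketch under assumptions, not the code of any existing module. `CachedPoller`, `fetch_stats`, the `refresh_rate` parameter, and the `es_heap_used` metric are hypothetical names. `metric_init`, `metric_cleanup`, and the descriptor keys follow the usual gmond Python module interface.

```python
import threading


class CachedPoller:
    """Runs fetch() in a background thread and caches the last good result."""

    def __init__(self, fetch, interval=10.0):
        self._fetch = fetch
        self._interval = interval
        self._values = {}
        self._lock = threading.Lock()
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_event.set()

    def _run(self):
        while not self._stop_event.is_set():
            try:
                fresh = self._fetch()
            except Exception:
                fresh = None  # keep the previous values if the fetch fails
            if fresh is not None:
                with self._lock:
                    self._values = dict(fresh)
            self._stop_event.wait(self._interval)

    def get(self, name):
        # Never touches the network, so it cannot block on a busy ES node.
        with self._lock:
            return self._values.get(name, 0)


_poller = None


def fetch_stats():
    # Hypothetical placeholder: a real module would query ES here,
    # with a timeout, and return a flat {metric_name: value} dict.
    return {"es_heap_used": 0}


def metric_handler(name):
    return _poller.get(name)


def metric_init(params):
    global _poller
    interval = float(params.get("refresh_rate", 10))  # assumed parameter name
    _poller = CachedPoller(fetch_stats, interval=interval)
    _poller.start()
    return [{
        "name": "es_heap_used",
        "call_back": metric_handler,
        "time_max": 60,
        "value_type": "uint",
        "units": "bytes",
        "slope": "both",
        "format": "%u",
        "description": "ES heap used (illustrative metric)",
        "groups": "elasticsearch",
    }]


def metric_cleanup():
    if _poller is not None:
        _poller.stop()
```

With this layout a slow or unresponsive ES only delays the background thread. The callbacks keep returning the last cached value (or 0 before the first successful fetch).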