Dear vvuksan,
I recently tried to deploy Ganglia in my production environment, but something went wrong.
env:
  selinux: disabled
  firewall: disabled
  os: RedHat 6.6 x64
  transfer method: unicast
  ganglia pkgs (from EPEL):
    ganglia-gmetad-3.7.1-1.el6.x86_64
    ganglia-web-3.6.2-1.el6.x86_64
    ganglia-gmond-3.7.1-1.el6.x86_64
    ganglia-3.7.1-1.el6.x86_64
hosts:
  X86-Manage1:
    ip: 172.20.31.131
    components: gmetad, gmond, gweb
    gridname: DP_Ganglia
    ds:
      data_source "DP_Ganglia" 10 X86-Manage2 # the lower-level cluster's gmetad
  X86-Manage2:
    ip: 172.20.31.132
    components: gmetad, gmond
    gridname: DP_Ganglia
    cluster: NewBilling
    ds:
      data_source "NewBilling" 10 X86-Manage2 # itself
  PMC_WEB_SRV3/4:
    ip: 172.20.31.35/36
    components: gmond
    cluster: NewBilling
Topology:
  X86-Manage1: the topmost gmetad, with gweb; collects data from X86-Manage2's gmetad.
  X86-Manage2: the gmetad node for cluster 'NewBilling'; handles all gmond data for that cluster.
  PMC_WEB_SRV3/4: the hosts to be monitored.
configuration sample:
PMC_WEB_SRV3/4: gmond.conf
  cluster {
    name = "NewBilling"
  }
  udp_send_channel {
    host = X86-Manage2
    port = 8649
    ttl = 1
  }
  udp_recv_channel {
    port = 8649
  }
  tcp_accept_channel {
    port = 8649
    gzip_output = no
  }
All gmetad.conf files are configured as described in the host definitions above.
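For reference, the gmetad.conf entries implied by the host definitions would look roughly like the sketch below. Only gridname and data_source come from those definitions; every other setting is assumed to be left at the package defaults. Neither data_source line names a port, so gmetad would fall back to its default of 8649.

```
# gmetad.conf on X86-Manage1 (topmost gmetad, with gweb)
gridname "DP_Ganglia"
data_source "DP_Ganglia" 10 X86-Manage2

# gmetad.conf on X86-Manage2 (gmetad for cluster NewBilling)
gridname "DP_Ganglia"
data_source "NewBilling" 10 X86-Manage2
```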
problem:
Everything works well while all daemons (gmetad and gmond) are running. If I stop the gmond daemon on host PMC_WEB_SRV3 (service gmond stop),
gstat on X86-Manage2 shows:
CLUSTER INFORMATION
Name: NewBilling
Hosts: 2
Gexec Hosts: 0
Dead Hosts: 1
Localtime: Fri Jun 26 16:40:12 2015
There are no hosts running gexec at this time
So gmetad detected the lost connection to the gmond I just stopped.
However, in the gweb GUI, no matter which host I select, none of them shows any graphs.
When PMC_WEB_SRV3's gmond recovers, gweb returns to normal.
Note: while PMC_WEB_SRV3's gmond is stopped, all metrics from the other hosts in cluster NewBilling are still received by X86-Manage2.
Please help me analyze what causes this symptom. Is it a bug, or did I configure something wrong? Many thanks!