ganglia / monitor-core

Ganglia Monitoring core
BSD 3-Clause "New" or "Revised" License

segmentation fault in Ganglia 3.7.2 #293

Open ysagon opened 6 years ago

ysagon commented 6 years ago

I'm using Ganglia 3.7.2 with a cluster of around 200 nodes.

gmetad is crashing with a segmentation fault. The kernel log shows:

gmetad[12911]: segfault at 0 ip 00002b94630b05dc sp 00002b9468741bf0 error 4 in libganglia.so.0.0.0[2b94630a5000+14000]
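
For anyone triaging this, the kernel line above can be decoded into an offset inside libganglia and a fault type. The following is only a rough sketch: it assumes the usual Linux x86 segfault log format (hex fields, page-fault error bits 0/1/2 meaning present/write/user), and the helper name `decode_segfault` is made up for illustration.

```python
import re

# Assumed layout of a Linux x86 "segfault at ..." kernel log line (all fields hex).
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: pull the useful numbers out of a segfault log line."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),  # 0 means a NULL pointer was dereferenced
        "offset": ip - base,                  # instruction offset within the mapping
        "page_present": bool(err & 1),        # bit 0: fault on a present page
        "access": "write" if err & 2 else "read",  # bit 1: write vs read
        "user_mode": bool(err & 4),           # bit 2: fault raised in user mode
    }


line = (
    "gmetad[12911]: segfault at 0 ip 00002b94630b05dc sp 00002b9468741bf0 "
    "error 4 in libganglia.so.0.0.0[2b94630a5000+14000]"
)
info = decode_segfault(line)
print(info["object"], hex(info["offset"]), info["access"], info["fault_address"])
```

Under those assumptions the crash is a user-mode read through a NULL pointer at offset 0xb5dc in the mapped libganglia.so.0.0.0. If the library was built with debug symbols, and if the mapping starts at file offset 0 (not guaranteed), that offset could be passed to `addr2line -e <path to libganglia.so.0.0.0> 0xb5dc` to find the source line.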

According to the debug log, it's crashing here:

Writing Root Summary data for metric mem_shared
Writing Root Summary data for metric proc_run
Cleanup thread running...
Cleanup deleting host "node016.cluster"
Cleanup deleting host "node016.cluster"
Segmentation fault

It always seems to happen for a node that was previously available and no longer is (or whose state has otherwise changed, for example after a reinstallation).