python-diamond / Diamond

Diamond is a Python daemon that collects system metrics and publishes them to Graphite (and others). It is capable of collecting CPU, memory, network, I/O, load and disk metrics. Additionally, it features an API for implementing custom collectors for gathering metrics from almost any source.
http://diamond.readthedocs.org/
MIT License

OpenTSDB connections in hung state #514

Open josephfrancis opened 8 years ago

josephfrancis commented 8 years ago

When using Diamond to send metrics to OpenTSDB, we have found that the Diamond process occasionally stops sending metrics altogether. This coincides with the times when one or more of the OpenTSDB nodes behind the load balancer (AWS ELB) drop out of the cluster, leaving the Diamond socket in the CLOSE_WAIT state. I have tried a local patch that adds TCP keepalive and a reconnection interval to work around the issue. Happy to submit it as a patch here.
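
For readers who hit the same symptom, here is a minimal sketch of the two ideas mentioned above: TCP keepalive plus a periodic reconnect. It is not the local patch itself and not Diamond's handler code. The names `enable_keepalive`, `ReconnectingSender`, `reconnect_interval` and the default values are all hypothetical, chosen only for illustration. The keepalive tuning options are platform-specific, so the sketch applies them only where Python exposes them.

```python
import socket
import time


def enable_keepalive(sock, idle=60, interval=10, count=3):
    """Turn on TCP keepalive so a dead peer is eventually detected."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below exist on Linux but not on every platform,
    # so each one is set only if this Python build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


class ReconnectingSender(object):
    """Hypothetical sender that reopens its socket periodically.

    Assumption: replacing the connection every `reconnect_interval`
    seconds is acceptable for the metrics being sent.
    """

    def __init__(self, host, port, reconnect_interval=300, timeout=5,
                 clock=time.time):
        self.host = host
        self.port = port
        self.reconnect_interval = reconnect_interval
        self.timeout = timeout
        self.clock = clock
        self.sock = None
        self.connected_at = None

    def _connect(self):
        self.sock = socket.create_connection((self.host, self.port),
                                             self.timeout)
        enable_keepalive(self.sock)
        self.connected_at = self.clock()

    def _close(self):
        if self.sock is not None:
            try:
                self.sock.close()
            finally:
                self.sock = None

    def send(self, line):
        # Drop a connection that has outlived the reconnect interval.
        if (self.sock is not None and
                self.clock() - self.connected_at >= self.reconnect_interval):
            self._close()
        if self.sock is None:
            self._connect()
        try:
            self.sock.sendall(line.encode("utf-8"))
        except socket.error:
            # Discard the broken socket; the next send() reconnects.
            self._close()
            raise
```

Keepalive lets the kernel notice a peer that has gone away, while the reconnect interval bounds how long a single connection can stay pinned to a backend that has left the load balancer. Whether either is enough for a given ELB setup is something to verify in that environment.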

josegonzalez commented 7 years ago

Pull requests welcome!