mozilla-services / heka

DEPRECATED: Data collection and processing made easy.
http://hekad.readthedocs.org/

Socket File Descriptor Leak #1927

Open cspenceiv opened 8 years ago

cspenceiv commented 8 years ago

I am running into an issue on multiple systems that use heka as a log collector. Heka is using the TCP Input plugin with TLS to receive logs forwarded by rsyslog.

From what I was able to gather from the logs and the state of the daemon on multiple servers, whenever a client connection is dropped, the file descriptor for its socket is never closed. The leaked descriptors show up in /proc/${PID}/fd and in ss -ant output, where the connections were still listed as "established" days after the clients had dropped. On any server where enough connections dropped to consume all allowed file descriptors (ulimit soft limit of 1024), heka began spamming /var/log/heka.log with "too many open files" errors until the /var/log/ partition ran out of space (in our case, roughly 100GB was filled in 4 hours).
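For anyone trying to confirm the same symptom, here is a minimal Python sketch of the check described above: it counts the descriptors under /proc/${PID}/fd, counts how many of them are sockets, and reads the soft "Max open files" limit from /proc/${PID}/limits. The function names and the `proc_root` parameter are my own, made up for illustration. Nothing here is part of heka, and it only observes the leak; it does not explain or fix it.

```python
import os


def count_fds(pid, proc_root="/proc"):
    """Return (total_fds, socket_fds) for a process, read from <proc_root>/<pid>/fd."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink; sockets point at "socket:[<inode>]".
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        total += 1
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


def soft_nofile_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit from <proc_root>/<pid>/limits, or None."""
    prefix = "Max open files"
    with open(os.path.join(proc_root, str(pid), "limits")) as limits:
        for line in limits:
            if line.startswith(prefix):
                soft = line[len(prefix):].split()[0]
                return None if soft == "unlimited" else int(soft)
    return None


def fd_report(pid, proc_root="/proc"):
    """Build a one-line summary of descriptor usage against the soft limit."""
    total, sockets = count_fds(pid, proc_root)
    limit = soft_nofile_limit(pid, proc_root)
    return "pid %s: %d fds open (%d sockets), soft limit %s" % (
        pid, total, sockets, limit if limit is not None else "unlimited")
```

Calling `print(fd_report(pid))` with the hekad PID, as root or as the user running the daemon, at intervals should show the socket count climbing toward the soft limit if the leak is present.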

Restarting the heka daemon releases all of the socket file descriptors.

This issue likely overlaps with #1518.