papertrail / remote_syslog2

To install, see "Releases" tab. Self-contained daemon for reading local files and emitting remote syslog (without using local syslogd).
http://help.papertrailapp.com/
MIT License

remote_syslog keeps watching deleted files #212

Closed by iprunache 6 years ago

iprunache commented 6 years ago

Hi, I'm using v0.19 to send logs from containers running on AWS ECS to Papertrail. Some of the apps in those containers are long-running and create a couple of log files daily with randomized names.

Lately I've hit a problem where all inotify watches on the container instance are being used up by the remote_syslog instances running in the containers. This prevents remote_syslog from watching any new log files. The symptom is this error: ERROR remote_syslog.go:123 too many open files.
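For anyone wanting to measure this, here is a minimal sketch in Go that counts the inotify watches held by one process. It assumes a Linux /proc filesystem where each inotify descriptor's fdinfo file lists one "inotify wd:" line per watch. The name countWatches is hypothetical and is not part of remote_syslog.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// countWatches counts the "inotify wd:" lines in the contents of one
// /proc/<pid>/fdinfo/<fd> file. Each such line describes one watch.
// (Hypothetical helper for illustration.)
func countWatches(fdinfo string) int {
	n := 0
	for _, line := range strings.Split(fdinfo, "\n") {
		if strings.HasPrefix(line, "inotify wd:") {
			n++
		}
	}
	return n
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: watchcount <pid>")
		os.Exit(2)
	}
	dir := filepath.Join("/proc", os.Args[1], "fdinfo")
	entries, err := os.ReadDir(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	total := 0
	for _, e := range entries {
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			continue // descriptor may have been closed in the meantime
		}
		total += countWatches(string(data))
	}
	fmt.Printf("pid %s holds %d inotify watches\n", os.Args[1], total)
}
```

Running it with the PID of a remote_syslog process and comparing the total against the host's inotify limits should show how much of the budget that process consumes.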

While debugging this, I found that remote_syslog keeps watching deleted files, which wastes inotify watches:

# ps aux | grep remote
   25 root       0:01 /usr/local/bin/remote_syslog --poll -D

# lsof -p 25
25  /usr/local/bin/remote_syslog    /var/www/site/storage/logs/test.log (deleted)
25  /usr/local/bin/remote_syslog    anon_inode:inotify
25  /usr/local/bin/remote_syslog    anon_inode:[eventpoll]
25  /usr/local/bin/remote_syslog    pipe:[67860474]
25  /usr/local/bin/remote_syslog    pipe:[67860474]
25  /usr/local/bin/remote_syslog    /var/www/site/storage/logs/test2.log (deleted)
25  /usr/local/bin/remote_syslog    anon_inode:inotify
25  /usr/local/bin/remote_syslog    anon_inode:[eventpoll]
25  /usr/local/bin/remote_syslog    pipe:[67867301]
25  /usr/local/bin/remote_syslog    pipe:[67867301]
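The same check can be done without lsof. The following is a rough sketch in Go, assuming a Linux /proc filesystem where the symlinks under /proc/<pid>/fd carry a " (deleted)" suffix for unlinked files. The function names deletedTargets and fdTargets are made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// deletedSuffix is the marker Linux appends to the link target of a
// descriptor whose file has been unlinked.
const deletedSuffix = " (deleted)"

// deletedTargets returns the paths (suffix stripped) of the link targets
// that refer to unlinked files. (Hypothetical helper.)
func deletedTargets(targets []string) []string {
	var out []string
	for _, t := range targets {
		if strings.HasSuffix(t, deletedSuffix) {
			out = append(out, strings.TrimSuffix(t, deletedSuffix))
		}
	}
	return out
}

// fdTargets resolves every descriptor symlink under /proc/<pid>/fd.
func fdTargets(pid int) ([]string, error) {
	dir := filepath.Join("/proc", strconv.Itoa(pid), "fd")
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var targets []string
	for _, e := range entries {
		target, err := os.Readlink(filepath.Join(dir, e.Name()))
		if err != nil {
			continue // descriptor closed between ReadDir and Readlink
		}
		targets = append(targets, target)
	}
	return targets, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: deletedfds <pid>")
		os.Exit(2)
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "pid must be a number")
		os.Exit(2)
	}
	targets, err := fdTargets(pid)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, path := range deletedTargets(targets) {
		fmt.Println("still open but deleted:", path)
	}
}
```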

I've even tried using the --poll flag but that doesn't seem to have an effect on the number of inotify watches being used.

remote_syslog_inotify_count.txt remote_syslog_lsof.txt remote_syslog_papertrail.txt

markdascher commented 6 years ago

That's an excellent description of a bug with v0.19, where it sometimes hangs onto files after they've been deleted. The good news is we've fixed this in v0.20, which was just released moments ago! Mind giving that a shot and letting us know if the lsof output improves?

iprunache commented 6 years ago

Heh, I missed that release by a couple of hours. Great news with v0.20! I will definitely give it a try.

iprunache commented 6 years ago

That seems to do the job! v0.20 stops watching deleted files. Thanks!

Regarding the --poll flag, any idea why it has no effect? I didn't have much time to look over the code, but at first glance it looks like it's not implemented in remote_syslog.

markdascher commented 6 years ago

You're correct that the current version no longer behaves any differently with the --poll flag. The default behavior could be described as a hybrid that relies on filesystem notifications but incorporates polling when needed. The jury's still out on whether that'll work for everyone, or if we'll still need to bring explicit polling back for some edge cases. (Hence the current indecision on pulling it out of the documentation.)
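To illustrate what a polling step in such a hybrid might look like, here is a minimal sketch in Go using only the standard library. It is not remote_syslog's actual implementation. The names stalePaths and onDisk are hypothetical, and the idea of pruning on an existence check is an assumption made for the example.

```go
package main

import (
	"fmt"
	"os"
)

// stalePaths returns the tracked paths for which exists reports false.
// A poller could call this on every tick and stop following the result.
// (Hypothetical helper for illustration.)
func stalePaths(tracked []string, exists func(string) bool) []string {
	var stale []string
	for _, p := range tracked {
		if !exists(p) {
			stale = append(stale, p)
		}
	}
	return stale
}

// onDisk reports whether path is still present. Errors other than
// "does not exist" (for example permission errors) count as present,
// so a file is never dropped on an ambiguous result.
func onDisk(path string) bool {
	_, err := os.Lstat(path)
	return !os.IsNotExist(err)
}

func main() {
	// Single pass over the paths given on the command line; a real
	// poller would repeat this on a timer.
	for _, p := range stalePaths(os.Args[1:], onDisk) {
		fmt.Println("would stop following:", p)
	}
}
```

Passing the existence check in as a function keeps the pruning logic independent of the filesystem, so the same step could be driven by a notification event or by a timer.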

iprunache commented 6 years ago

Thanks for the explanation!