Closed bobtiernay-okta closed 2 years ago
@bgreenlee Let me know if you agree with the change and I'll be happy to submit a PR. Cheers!
Hm, I just hit a problem where the *.offset file contains the new inode number but the old (large) offset. For example:
# ls -li /var/log/maillog*
83922970 -rw-r----- 1 root logs 6889343 04-06 07:20 /var/log/maillog
84159524 -rw-r--r-- 1 root root 19 04-06 07:20 /var/log/maillog.offset
# maillog: crtime is when file was created on xfs filesystem
# xfs_db -r -c "inode 83922970" -c "p v3.crtime" /dev/md1
v3.crtime.sec = Fri Apr 6 05:02:03 2018
v3.crtime.nsec = 150325550
# maillog.offset: crtime as above
# xfs_db -r -c "inode 84159524" -c "p v3.crtime" /dev/md1
v3.crtime.sec = Thu Apr 5 09:38:04 2018
v3.crtime.nsec = 281641177
# So maillog.offset was created on 5 Apr, and pygtail then processed maillog
# a few times (I'm running a script that uses pygtail from cron).
# At night logrotate rotated the maillog file (moving the old one into the archive/
# subdirectory, so pygtail cannot find and handle it). A new maillog file was created by syslog.
# The script that uses pygtail kept being started from cron (every 2 minutes),
# yet the offset file contains the new inode number for maillog
# (the maillog file created on 6 Apr) BUT the offset is from
# the old processing.
# cat /var/log/maillog.offset
83922970
200558289
Now a new run of pygtail processes nothing, because pygtail cannot handle the case where the offset in *.offset is larger than the current file size (related to the new test case in https://github.com/bgreenlee/pygtail/pull/42).
The bad news is that I'm using pygtail with your proposed change included, so either the wrong inode or the wrong offset still ends up in the offset file.
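The stale state described above (a saved offset larger than the current file) can be detected before reading. The following is a minimal sketch, not pygtail's actual code: the helper names read_offset_file and safe_start_offset are hypothetical, and the only thing assumed about the offset file is the two-line layout shown in the cat output (inode on the first line, byte offset on the second).

```python
import os


def read_offset_file(offset_path):
    """Parse a two-line offset file: inode on line 1, byte offset on line 2."""
    with open(offset_path) as fh:
        inode = int(fh.readline().strip())
        offset = int(fh.readline().strip())
    return inode, offset


def safe_start_offset(log_path, offset_path):
    """Return the byte offset to resume from, or 0 if the saved state looks stale.

    The saved state is treated as stale when the inode differs from the
    current log file, or when the saved offset lies past the end of the
    file (which suggests the file was replaced or truncated).
    """
    try:
        saved_inode, saved_offset = read_offset_file(offset_path)
    except (OSError, ValueError):
        return 0  # no usable offset file: start from the beginning

    st = os.stat(log_path)
    if saved_inode != st.st_ino or saved_offset > st.st_size:
        return 0
    return saved_offset
```

With the values from the listing above (file size 6889343, saved offset 200558289), the size check alone would be enough to restart from 0 even though the inode matches.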
I am running a few Python scripts under supervisord that emit STDOUT/STDERR
to separate log files with rotation (2 backups). I am using pygtail==0.7.0
to read newly written lines from these log files and send them to our internal metrics aggregator engine.
Recently, two of my deployments stopped seeing new lines. I checked the offset files and found the following:
1st deployment showed:
*-stdout.log.offset
2383
104857608
2nd deployment showed:
*-stdout.log.offset
2320
104857607
Any clue what might be wrong here? I restarted my instances, and things worked like before.
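One detail that may help narrow this down: both recorded offsets sit a few bytes above 100 MiB (104857600 bytes). The quick check below only does that arithmetic. Reading it as a sign that the logs are rotated at a 100 MiB threshold is an assumption, since the rotation settings are not given in this thread, and bytes_over is a hypothetical helper name.

```python
MIB = 1024 * 1024


def bytes_over(offset, limit_mib=100):
    """Return how many bytes `offset` lies above `limit_mib` MiB."""
    return offset - limit_mib * MIB


# Offsets taken from the two *-stdout.log.offset files above.
for offset in (104857608, 104857607):
    print(offset, bytes_over(offset))
```

If that assumption holds, an offset just above the rotation size pointing into a freshly rotated (much smaller) file would match the stale-offset symptom reported earlier in this thread.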
_update_offset_file currently uses the following assignment for inode calculation:
However, this should be:
This covers the case in next when we are not at the end of a file:
Without it, we may be processing _rotated_logfile or a renamed file and incorrectly associate the current inode.
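To illustrate why the source of the inode matters, the sketch below compares the inode looked up by path with the inode of an already-open handle after a logrotate-style rename. It is an illustration for POSIX systems only, not pygtail's code: inode_of_path, inode_of_handle, and demo_rotation are hypothetical names, and the file names are made up.

```python
import os
import tempfile


def inode_of_path(path):
    """Inode of whatever file currently lives at `path`."""
    return os.stat(path).st_ino


def inode_of_handle(fh):
    """Inode of the file that the open handle `fh` actually refers to."""
    return os.fstat(fh.fileno()).st_ino


def demo_rotation():
    """Rotate a log while a reader holds it open and return three inodes:
    (open handle, new file at the original path, rotated file)."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        with open(log, "w") as f:
            f.write("old line\n")

        fh = open(log)  # the reader still holds the old file
        try:
            os.rename(log, log + ".1")  # logrotate-style rename
            with open(log, "w") as f:   # the logger creates a new file
                f.write("new line\n")
            return (
                inode_of_handle(fh),
                inode_of_path(log),
                inode_of_path(log + ".1"),
            )
        finally:
            fh.close()


handle_inode, new_inode, rotated_inode = demo_rotation()
print(handle_inode == rotated_inode, handle_inode == new_inode)
```

In this sketch the open handle keeps the inode of the rotated file, while a lookup by path returns the inode of the newly created file. Pairing an offset taken from the handle with an inode taken from the path would therefore describe two different files.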