Closed nalind closed 6 years ago
Integration tests are failing because the build uses golang 1.6.4, but the "vendored" archive/tar
package references constant io.SeekCurrent
, which wasn't added until go 1.7. According to the docs, replacing it with os.SEEK_CURRENT should work.
See also: https://github.com/systemd/systemd/issues/7998, which highlights that a journalctl -f
appears to use an internal mechanism to close open file descriptors on rotation instead of the proper sd_journal_close()/sd_journal_open()
dance.
This gist appears to be a working example of how to tail the journal without leaking file descriptors or creating duplicates: https://gist.github.com/portante/ff7fb429c6f973aab377f7bb77b0ffdb
Second patch added to check that we're not filling up the logWatcher.Msg
channel, which would cause the writing goroutine to block indefinitely, making it so that it never closes its handle. This can happen when the client starts docker logs -f
and then closes its connection.
The pathological test case is to disable rate limiting by setting RateLimitBurst
to 0
and SystemMaxFileSize
to 10M
in /etc/systemd/journald.conf
, ensuring that a log-heavy container will cause rapid and frequent creation of new files to hold journal entries. Starting a container to run seq 10000000
will then produce a test case which should help to reproduce the problems we're attempting to solve.
Based on advice from other maintainers, it's actually looking like we don't need to handle SD_JOURNAL_INVALIDATE in order to close handles to deleted journal files, as sd_journal_process()
does that automatically. In that case, they'd only be held open while we're reading them in between calls to sd_journal_process()
, which can be a while if there's a lot of logged output for us to churn through when we're reading the logs in either non-follow mode, or during the "catch up" phase in follow mode. So the first patch in the series may not be providing any benefit here.
@portante It shouldn't be necessary if sd_journal_process()
closes only the ones that are gone and only opens the new ones, and it should be a little bit faster to avoid reopening in INVALIDATE cases, yeah.
FWIW, here is what we proposed for rsyslog client: https://github.com/rsyslog/rsyslog/pull/2437
Based on https://github.com/systemd/systemd/pull/8032/files#diff-787108d33a7a2bd28d22e0a606901037R2651, if somebody does any one of:
$ docker logs 997acb8 | less
$ docker logs -f 87ab356 | less
Where the log output pauses for a "long time", won't that prevent both the sd_journal_wait()
and sd_journal_process()
calls from occurring? If so, I think we can still leak file descriptors.
As such, I believe we need a thread that periodically listens for inotify events and calls sd_journal_process()
.
Alright, just updated to have drainJournal
always call sd_journal_process()
, and to be more careful about making sure that cursor values it returns always point to the last journal entry that it sent, which should fix the occasional single-message-drop I'd been seeing.
Rebased.
LGTM
- What I did This is taking a stab at fixing the open-file-handles-for-deleted-journal-files bug at https://bugzilla.redhat.com/show_bug.cgi?id=1534827.
- How I did it We weren't handling SD_JOURNAL_INVALIDATE results coming back from sd_journal_process() before, so I added logic to close and reopen the journal handle when we get that result.
- How to verify it Run
docker logs -f
to tail the journaled output from a container that was started with--log-driver journald
, and consult the list of open descriptors in the directory under/proc/PID/fd
which corresponds to the docker process to see which journal files it has open. Whiledocker logs -f
is running, usejournalctl --vacuum-size
orjournalctl --vacuum-files
to force the removal of one or more older journal files whichdockerd
had open, then consult itsfd
directory again: it should no longer have open descriptors for files which were removed (i.e., the kernel should not be noting any journal files which it has open as(deleted)
).- Description for the changelog
When following a journal in the journald log reader, correctly handle SD_JOURNAL_INVALIDATE results from
sd_journal_process()
by closing and reopening the journal handle, which closes all open journal files (including deleted ones) and opening all currently-available journal files.