Open breed808 opened 4 years ago
Additionally, I've run the exporter with a debugger and the CollectFromLogLine function is not reached.
I believe postfix_exporter currently only reads the log once, and stops on first 0-length read.
I co-maintain the Debian package prometheus-postfix-exporter, and have also just recently discovered that systemd journal support inexplicably broke somewhere between v0.2.0 and v0.3.0. This is somewhat disappointing, since v0.2.0 was working quite reliably.
From the minimal debugging that I've done so far, it appears to bail out of the SystemdLogSource.Read() function with io.EOF when s.journal.Next() returns zero, and never actually calls s.journal.GetEntry():
    func (s *SystemdLogSource) Read(ctx context.Context) (string, error) {
    	c, err := s.journal.Next()
    	if err != nil {
    		return "", err
    	}
    	if c == 0 {
    		return "", io.EOF
    	}
    	e, err := s.journal.GetEntry()
    	...
That subsequently causes the for-loop in PostfixExporter.StartMetricCollection() to bail out, and that's pretty much game over.
By commenting out the "Start at end of journal" seek in logsource_systemd.go, I can get the exporter to "replay" historical systemd journal entries, and it appears to produce the expected metrics. However, when it reaches the end of the events, the Read() function still bails out with io.EOF. This seems to be the main issue: it doesn't wait for further events, and if it is allowed to seek to the end of the journal (i.e., unmodified code from the 0.3.0 tag), it will immediately bail out as there are no events to read.
This might also be an issue with recent versions of go-systemd: https://github.com/coreos/go-systemd/issues/392
I did some digging today because this bothered me a bit:
- The journal is positioned with SeekRealtimeUsec and then Read() runs into an EOF if nothing else is written to the journal in the meantime. Removing the return fixes the problem and seems to work fine for me.
- GetEntry from go-systemd hangs, but I'm not sure what's wrong here so far.
When running postfix_exporter built from master, journald metrics are not available, and there is only a single path for the postfix_up metric present when querying the exporter. The exporter is printing "Reading log events from systemd" on startup.

Bisecting with git reveals that commit 26d06428312ac8cbf2dfb9d917f85ec0057035f1 introduced the issue.