Closed fifofonix closed 3 years ago
Non-working and working systemctl -version
output (required apt-get install systemd
in promtail containers)
Note +ZSTD in non-working host view.
Non-working Host View
systemd 246 (v246.6-3.fc33)
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified
Non-Working Container View
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN
Working Host View:
systemd 245 (v245.8-2.fc32)
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified
Working Container View:
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN
Are you sure /var/log/journal
is populated? By default logs are stored in /var/run/journal
unless /var/log/journal
exists.
See https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html#Description :
The journal service stores log data either persistently below /var/log/journal or in a volatile way below /run/log/journal/ (in the latter case it is lost at reboot). By default, log data is stored persistently if /var/log/journal/ exists during boot, with an implicit fallback to volatile storage otherwise. Use Storage= in journald.conf(5) to configure where log data is placed, independently of the existence of /var/log/journal/.
Yes /var/log/journal is most definitely populated, and has a current timestamp on both old working nodes, and new non-working nodes. There is a /var/run/log/journal directory but it is stale according to its timestamp on both working and non-working configurations.
I think any Fedora33 based system will no longer be able to use promtail for journald until this is addressed. @slim-bean any recommendations on how this issue should be handled, or recommendations for Fedora/promtail users? Since Fedora is upstream to Redhat and Openshift (via OKD & FedoraCoreOS) I am imagining the impact is going to broaden.
I was hoping that rebuilding promtail from source for buster, backporting libsystemd-dev which then would be on the right systemd version (246) with the correct flags for FedoraCoreOS (including especially +ZSTD) would solve things. Indeed Loki debug trace reveals correct reading of Journald log lines...but it seems changes would also be required to the Go program as it is giving a stack trace when attempting the send to Loki.
This issue continues to manifest itself on Promtail 2.0.0 with Loki 2.0.0. Original description updated to reflect this.
Good news. Rebuilding promtail from source for buster, backporting libsystemd-dev does in fact solve things providing one sets the BUILD_IN_CONTAINER switch to force rebuild of the promtail code itself against backported C modules (missing this switch was my earlier 'muppet' mistake).
It is not clear to me exactly how this should be fixed in the project itself. It would seem that the right end goal should be an additional tag available in dockerhub like promtail:2.0.0-systemd-246?
For the record the diff of the changes made on the v2.0.0 tag to the promtail/dockerfile are:
-FROM golang:1.14.2 as build
+FROM golang:1.14.2-buster as build
# TOUCH_PROTOS signifies if we should touch the compiled proto files and thus not regenerate them.
# This is helpful when file system timestamps can't be trusted with make
ARG TOUCH_PROTOS
COPY . /src/loki
WORKDIR /src/loki
-RUN apt-get update && apt-get install -qy libsystemd-dev
-RUN make clean && (if [ "${TOUCH_PROTOS}" ]; then make touch-protos; fi) && make BUILD_IN_CONTAINER=false promtail
+RUN echo "deb http://deb.debian.org/debian buster-backports main" >> /etc/apt/sources.list
+RUN apt-get update && apt-get install -t buster-backports -qy libsystemd-dev
+RUN make clean && (if [ "${TOUCH_PROTOS}" ]; then make touch-protos; fi) && make BUILD_IN_CONTAINER=true promtail
# Promtail requires debian as the base image to support systemd journal reading
-FROM debian:stretch-slim
+FROM debian:buster-slim
# tzdata required for the timestamp stage to work
+RUN echo "deb http://deb.debian.org/debian buster-backports main" >> /etc/apt/sources.list
RUN apt-get update && \
apt-get install -qy \
- tzdata ca-certificates libsystemd-dev && \
+ tzdata ca-certificates
+RUN apt-get install -t buster-backports -qy libsystemd-dev && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
COPY --from=build /src/loki/cmd/promtail/promtail /usr/bin/promtail
COPY cmd/promtail/promtail-docker-config.yaml /etc/promtail/config.yml
I'm a little confused following this, why is the backport necessary? Is it enough to just update the build image and container image to buster?
My understanding is that buster ships with systemd version v241 which shows the feature flags listed below. These do not include a v246 feature flag +ZSTD which is present in Fedora33 and apparently references a compression algorithm (https://facebook.github.io/zstd/).
Backporting the newer v246 to buster therefore seems necessary to ensure that the loki go project, which uses the systemd go project, which uses native systemd C apis to access journald, does so correctly. I did experiment with switching journald compression off altogether but this did not fix things, and in any case is non-ideal. Disclaimer however that I have never played with journald configuration before so it is possible I made a mistake with this.
% systemctl --version
systemd 241 (241)
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid
Fedora of course is upstream to Redhat but also FedoraCoreOS which is upstream to the Open Kubernetes Distribution (OKD) which is upstream to OpenShift. So, I guess this is of significance to a fairly wide base. FedoraCoreOS are running their 'next' stream with Fedora33 already to shake out issues like these.
Same problem for ArchLinux:
systemctl --version
systemd 246 (246.6-1.1-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid
Systemd journals created with 246+ are deliberately incompatible with earlier versions. The backport is required to ship those logs as the C linkage needs to be up to date here.
@cyriltovena do you mind looking at this, please?
Will take a look at this tomorrow and get this merged, thanks everyone, apologies for delay
This issue now manifesting itself on the 'testing' FedoraCoreOS stream with an 11/30 timetable to go to the 'stable' stream at which point the impact of this issue will become more widespread.
working on this now, should be in a release early next week
@fifofonix can you (or anyone else) test out this image to see if it works:
grafana/promtail:update-libsystem-dev-0bd2ca6
I verified it worked on our existing systems but I don't have a newer system handy to make sure if fixes the original issue.
oh hrm where you testing with the docker image or a binary?
Looks good right now. I am running the container you built on both old (FCOS Fedora 32 w/systemd 245) and new (FCOS Fedora 33 w/systemd 246 with +ZSTD compression) servers and journald logs are aggregating from both. I'm going to leave those running and continue to monitor.
I'm testing image grafana/promtail:update-libsystem-dev-0bd2ca6
on Fedora 33 Workstation and it looks good so far.
grafana/promtail:update-libsystem-dev-0bd2ca6
works fine on NixOS 20.09.
grafana/promtail:update-libsystem-dev-0bd2ca6
confirmed working for an up-to-date Arch Linux, too.
Thanks everyone for testing this!
Hey @slim-bean I'm seeing what may be a similar issue just running on docker--if the file is created after promtail starts, it says it can't be found. If the file is created before, it ingests the existing logs but doesn't tail.
I appear to have run into this issue using the grafana/promtail:2.4.2-arm
docker image on balena os.
Balena OS recently updated their systemd
systemd 244 (244.5+)
-PAM -AUDIT -SELINUX +IMA -APPARMOR -SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP -GCRYPT -GNUTLS +ACL +XZ -LZ4 -SECCOMP +BLKID -ELFUTILS +KMOD -IDN2 -IDN -PCRE2 default-hierarchy=hybrid
systemd 249 (249.3+)
-PAM -AUDIT -SELINUX -APPARMOR +IMA -SMACK +SECCOMP -GCRYPT -GNUTLS -OPENSSL +ACL +BLKID -CURL -ELFUTILS -FIDO2 -IDN2 -IDN -IPTC +KMOD -LIBCRYPTSETUP +LIBFDISK -PCRE2 -PWQUALITY -P11KIT -QRENCODE -BZIP2 -LZ4 -XZ -ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=hybrid
Is there an official ARM docker image that addresses this issue?
Upon further inspection, it looks like the related pull request #2957 only touched clients/cmd/promtail/Dockerfile
(cmd/promtail/Dockerfile
at time of merge), and the arm image appears to be built using either Dockerfile.arm32
or Dockerfile.cross
the promtail:2.4.2-arm
is the 32bit image and would be built with Dockerfile.arm32
Promtail 1.6.1 and 2.0.0 docker images have ceased forwarding Journald when running on 'next' version of FedoraCoreOS which is based on upstream Fedora 33. Issue reported via FCOS issues and suggestion is that there may now be an incompatibility in journald compression level (FedoraCoreOS systemd now with +ZSTD compression).
Steps to reproduce the behavior:
Expect to see verbose logging in the promtail logs associated with relay of journald (like we see on pre Fedora33 FedoraCoreOS servers). Instead we only see occasional debug messages regarding the local log file config (expected).
Environment:
Promtail Config:
Sample Promtail Log