projectatomic / oci-systemd-hook

OCI hook to enable running systemd in a container
GNU General Public License v3.0

oci-systemd-hook leaks logs by default to container host /var/log/journald #76

Open karniemi opened 7 years ago

karniemi commented 7 years ago

For a long time I was watching /var/log/journald grow beyond its limits, with new sub-directories piling up in that directory...

Found the reason: it's the Docker containers that are using systemd, together with RHEL dockerd and oci-systemd-hook. In CI test rounds we are running and killing tens of Docker containers per day, and the killed containers leave their journal logs hanging in /var/log/journald/ of the host system. What's worse, the host system's journal does not seem to rotate these leftover logs, nor does it count them in "journalctl --disk-usage". The contradictory disk usage reported by "du" and "journalctl --disk-usage" was driving me mad as well (and may be worth another issue report). Anyway, the killed containers leave files named "/var/log/journald//system.journal" on the host, and the journal rotation rules probably consider them open and so never clean them up, which lets the host journal grow beyond the configured limits.
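To see which sub-directories account for the gap between "du" and "journalctl --disk-usage", something like the following sketch may help. It sums the apparent size of regular files under each sub-directory of a journal root (comparable to "du --apparent-size", not to block-based "du"). The function name `journal_dir_sizes` is made up for this example, and the root path is whatever you pass in, for instance systemd's usual persistent location /var/log/journal.

```python
import os
from pathlib import Path


def journal_dir_sizes(root):
    """Return {sub-directory name: total apparent bytes of regular files in it}.

    Hypothetical helper for illustration; it only reads metadata and never
    modifies anything under `root`.
    """
    sizes = {}
    for entry in Path(root).iterdir():
        if not entry.is_dir():
            continue  # ignore stray files directly under the root
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Count regular files only; skip symlinks to avoid double counting.
                if os.path.isfile(path) and not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes
```

Sorting the result by size, for example `sorted(journal_dir_sizes("/var/log/journal").items(), key=lambda kv: kv[1], reverse=True)`, puts the largest directories first; any directory whose name is not the host's machine ID is a candidate leftover from a container.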

Versions: oci-systemd-hook 1:0.1.8-4.1.gite533efa.el7 which was brought in as a dependency of docker-1.12.6-48.git0fdc778-el7

(I'm not quite sure it has really been a wise decision to automatically "leak" journal logs from containers to the host. Yes, it's a design decision from Red Hat, and I understand it is intentional. Still, containers were supposed to be isolated environments, so I at least would have preferred the default to be not leaking the journal logs (nor anything else) from containers to the host. Why make the journal a special case in the simple host-container isolation paradigm?)

Unfortunately, the version of oci-systemd-hook shipped by Red Hat does not support '--env oci-systemd-hook=disabled' ... so the only workaround is to periodically clean up the /var/log/journal manually?
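For anyone who ends up doing the manual cleanup, here is a minimal sketch of how it could be automated. It rests on two assumptions that are not from this thread: that the host's own journal lives in the sub-directory named after the host machine ID (the content of /etc/machine-id), and that a sub-directory untouched for longer than some age threshold belongs to a dead container. The function names and the threshold are hypothetical, and deletion is a dry run unless explicitly enabled.

```python
import os
import shutil
import time
from pathlib import Path


def newest_mtime(directory):
    """Most recent modification time of the directory or any file inside it."""
    newest = os.stat(directory).st_mtime
    for dirpath, _dirs, files in os.walk(directory):
        for name in files:
            newest = max(newest, os.lstat(os.path.join(dirpath, name)).st_mtime)
    return newest


def stale_journal_dirs(root, host_machine_id, max_age_seconds, now=None):
    """List sub-directories of `root` that look like leftover container journals.

    Assumption: anything not named after the host machine ID and not written
    to within `max_age_seconds` is considered stale.
    """
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(Path(root).iterdir()):
        # Skip symlinks (e.g. journals linked from a guest) and plain files.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if entry.name == host_machine_id:
            continue  # never touch the host's own journal
        if now - newest_mtime(entry) > max_age_seconds:
            stale.append(entry)
    return stale


def remove_dirs(paths, dry_run=True):
    """Delete the given directories, or only report them when dry_run is set."""
    for path in paths:
        if dry_run:
            print(f"would remove {path}")
        else:
            shutil.rmtree(path)
```

A cautious way to use it is to run `remove_dirs(stale_journal_dirs(root, machine_id, 7 * 86400))` first, review the printed list, and only then pass `dry_run=False`. A container that is still running but quiet for longer than the threshold would be misclassified, so the threshold needs to fit the workload.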

TomSweeneyRedHat commented 7 years ago

@karniemi thanks for taking the time to fill out this issue in very good detail.

The logs being put on the host from the containers was indeed a design decision that was requested by others. You're right about the switch not yet being in place on RHEL. However the switch was put into play primarily for Atomic hosts as the file "/usr/libexec/oci/hooks.d/oci-systemd-hook" can not be removed in that environment. On RHEL you should be able to simply remove that file or move it to another directory and that will also turn off oci-systemd-hook for you too.

Hope that helps!

karniemi commented 7 years ago

@TomSweeneyRedHat thanks for that possible workaround. It might serve someone who's having a critical issue with the logs piling up. Personally, I feel uneasy about removing files installed by rpms, and can live with cleaning up the host journal manually until there's a proper fix for this issue.

As far as I understand, there are two issues:

  1. Killing a Docker container is a normal, supported operation. It should not leave any hanging files on the host.
  2. We have configured dockerd not to log to the host system journal. This is configurable in /etc/sysconfig/docker using the parameter "--log-driver". Red Hat ships with "--log-driver=journald" as the default, but we've removed that option. So in our case dockerd uses json-file as the logging driver, because that's the default when none is explicitly configured. It works properly for all the other containers, but not for the systemd ones. I would call it a flaw that the log-driver configuration is not respected by oci-systemd-hook; oci-systemd-hook should only redirect the journal from the container to the host if "--log-driver=journald". Right? Wrong? :-)
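The behaviour proposed in point 2 can be sketched as a small decision function. This is only an illustration of the proposal, not how oci-systemd-hook works today: the hook is not written in Python, and nothing in this thread suggests it reads the dockerd options. The function names are hypothetical, and the input is assumed to be the option string from /etc/sysconfig/docker.

```python
import shlex

# Docker falls back to json-file when no log driver is configured.
DEFAULT_LOG_DRIVER = "json-file"


def configured_log_driver(options):
    """Extract the --log-driver value from a dockerd option string.

    Accepts both "--log-driver=NAME" and "--log-driver NAME"; the last
    occurrence wins. Returns the default when the option is absent.
    """
    tokens = shlex.split(options)
    driver = DEFAULT_LOG_DRIVER
    for i, tok in enumerate(tokens):
        if tok.startswith("--log-driver="):
            driver = tok.split("=", 1)[1]
        elif tok == "--log-driver" and i + 1 < len(tokens):
            driver = tokens[i + 1]
    return driver


def should_link_journal(options):
    """Proposed policy: expose the container journal on the host only for journald."""
    return configured_log_driver(options) == "journald"
```

With the shipped default, `should_link_journal("--selinux-enabled --log-driver=journald")` is true; with the option removed, as in our setup, it is false and the container journal would stay inside the container.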

(That said, I now also understand and appreciate the value of oci-systemd-hook for those who want the containers' journal to the host - automatically and without the loose privileges that were earlier required without oci-systemd-hook . :-) )

rhatdan commented 7 years ago

The big difference is that when you set up a container runtime to log, it only logs messages from the stdout/stderr of the container's primary PID. All messages that are written to /dev/log or directly to the journal are dropped. If you run systemd as PID 1 inside the container, then these messages can be caught and recorded on the host system.

We have to figure out the best way to handle these logs that can get left behind when the container is removed.

rhatdan commented 7 years ago

Opened a bugzilla on this issue.

https://bugzilla.redhat.com/show_bug.cgi?id=1493135

cgwalters commented 7 years ago

nspawn has:

--link-journal=

Control whether the container's journal shall be made visible to the host system. If enabled, allows viewing the container's journal files from the host (but not vice versa). Takes one of "no", "host", "try-host", "guest", "try-guest", "auto". If "no", the journal is not linked. If "host", the journal files are stored on the host file system (beneath /var/log/journal/machine-id) and the subdirectory is bind-mounted into the container at the same location. If "guest", the journal files are stored on the guest file system (beneath /var/log/journal/machine-id) and the subdirectory is symlinked into the host at the same location. "try-host" and "try-guest" do the same but do not fail if the host does not have persistent journaling enabled. If "auto" (the default), and the right subdirectory of /var/log/journal exists, it will be bind mounted into the container. If the subdirectory does not exist, no linking is performed. Effectively, booting a container once with "guest" or "host" will link the journal persistently if further on the default of "auto" is used.

Note that --link-journal=try-guest is the default if the systemd-nspawn@.service template unit file is used.
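Read literally, the man page text above amounts to a small decision table. The sketch below is only one reading of that text, not systemd-nspawn's actual implementation: the function name, the string results, and the choice to model "fail" as an exception are all assumptions made for illustration.

```python
def link_journal_action(mode, host_persistent, subdir_exists=False):
    """Model the --link-journal modes as described in the quoted man page.

    host_persistent: whether the host has persistent journaling enabled.
    subdir_exists:   whether the matching sub-directory of /var/log/journal
                     already exists (only relevant for "auto").
    Returns "none", "bind-mount" (journal stored on the host) or "symlink"
    (journal stored in the guest).
    """
    if mode == "no":
        return "none"
    if mode == "auto":
        return "bind-mount" if subdir_exists else "none"

    tolerant = mode.startswith("try-")
    base = mode[len("try-"):] if tolerant else mode
    if base not in ("host", "guest"):
        raise ValueError(f"unknown --link-journal mode: {mode}")

    if not host_persistent:
        if tolerant:
            return "none"  # try-host / try-guest silently skip linking
        raise RuntimeError("host does not have persistent journaling enabled")
    return "bind-mount" if base == "host" else "symlink"
```

The point for this issue is the "no" and "auto" rows: nspawn gives the administrator a way to keep the guest journal off the host, and its default does nothing unless the host directory was created on purpose.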

cgwalters commented 7 years ago

I think the persistent journaling should be opt-in. Particularly given that Docker is commonly driven by e.g. Kubernetes which is really all about transient and not "pet/elephant" containers.

cgwalters commented 7 years ago

> However the switch was put into play primarily for Atomic hosts as the file "/usr/libexec/oci/hooks.d/oci-systemd-hook" can not be removed in that environment. On RHEL you should be able to simply remove that file or move it to another directory and that will also turn off oci-systemd-hook for you too.

BTW, I'm really trying to get people away from "you can't" for Atomic Host. In fact, you can: just run ostree admin unlock. That's very explicitly transient and meant for testing, though. But removing the file on a yum-based system isn't reliably persistent either: a yum update pulling in a new version of oci-systemd-hook.rpm will happily reinstate that file. Atomic Host/ostree is about enforcing best practice, not about restricting users.