gdraheim / docker-systemctl-replacement

docker systemctl replacement - allows to deploy to systemd-controlled containers without starting an actual systemd daemon (e.g. centos7, ubuntu16)
European Union Public License 1.2
1.38k stars 398 forks source link

BlockingIOError: [Errno 11] Resource temporarily unavailable #176

Open Rid opened 7 months ago

Rid commented 7 months ago

We run this on multiple webtop containers (https://github.com/linuxserver/docker-webtop).

The containers can sometimes become IO constrained due to heavy IO. This causes systemctl3.py to exit with the following error:

    sys.exit(run(command, *modules))
  File "/usr/local/bin/systemctl", line 6502, in run
    exitcode = is_not_ok(systemctl.default_system())
  File "/usr/local/bin/systemctl", line 5722, in default_system
    return self.start_system_default(init = init)
  File "/usr/local/bin/systemctl", line 5732, in start_system_default
    sig = self.init_loop_until_stop(services)
  File "/usr/local/bin/systemctl", line 6087, in init_loop_until_stop
    self.read_log_files(units)
  File "/usr/local/bin/systemctl", line 5905, in read_log_files
    os.write(1, content)
BlockingIOError: [Errno 11] Resource temporarily unavailable

Once it exits, it's child processes are moved to PID 1 (we use s6 as the PID 1 and run systemctl as an s6 service to handle systemd services for convenience).

systemctl3.py then enters an infinite loop of trying to start the services which are already running. They exit immediately due to PID file lock contention, and then systemctl3.py tries to start them again.

This infinite loop uses 100% CPU:

 753301 root      20   0  259484  97988   7180 R  99.7   0.0   4:58.09 systemctl

Is there any way we can remove the IO timeout? Having systemctl hang is much more preferable to an exit, as the contention will eventually subside and resources will become available again.

Rid commented 7 months ago

Perhaps if IO is overloaded you could just discard the output, that would be better than a hard exit, for example:

def read_log_files(self, units):
    BUFSIZE = 8192
    for unit in units:
        if unit in self._log_file:
            new_text = b""
            while True:
                buf = os.read(self._log_file[unit], BUFSIZE)
                if not buf:
                    break
                new_text += buf
            text = self._log_hold[unit] + new_text
            if not text:
                continue
            lines = text.split(b"\n")
            if not text.endswith(b"\n"):
                self._log_hold[unit] = lines[-1]
                lines = lines[:-1]
            for line in lines:
                prefix = unit.encode("utf-8")
                content = prefix + b": " + line + b"\n"
                try:
                    os.write(1, content)
                    try:
                        os.fsync(1)
                    except Exception:
                        pass
                except BlockingIOError:
                    pass 
Rid commented 7 months ago

Sorry, looking at this a bit further it seems that systemctl might be contributing to the high IO. I applied the above patch so that it's no longer exiting, but where there are failing servicehelpers it seems to be producing excessive log outputs such as the below.

Unfortunately we do not have full control over the services inside the container as it's used much like a VPS (with strict security policies).

Should there be a back-off and then eventual failure for these servicehelpers? or can we rate-limit them?

strace: Process 3231129 attached
read(3, " [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15262' (uid=0 pid=386450 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15262' (uid=0 pid=386450 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15263' (uid=0 pid=386699 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15263' (u"..., 8192) = 8192
mmap(NULL, 75100160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a5106800000
munmap(0x7a510b000000, 75091968)        = 0
read(3, "us-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15275' (uid=0 pid=389750 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15276' (uid=0 pid=389982 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15276' (uid=0 pid=389982 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested "..., 8192) = 8192
mmap(NULL, 75108352, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a510f800000
munmap(0x7a5106800000, 75100160)        = 0
read(3, "return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15289' (uid=0 pid=393387 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15289' (uid=0 pid=393387 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15290' (uid=0 pid=393652 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timed"..., 8192) = 8192
mmap(NULL, 75116544, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a510b000000
munmap(0x7a510f800000, 75108352)        = 0
read(3, "ed with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15302' (uid=0 pid=396348 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15303' (uid=0 pid=396611 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15303' (uid=0 pid=396611 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.f"..., 8192) = 8192
mmap(NULL, 75124736, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a5106800000
munmap(0x7a510b000000, 75116544)        = 0
read(3, "unch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15316' (uid=0 pid=399622 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15316' (uid=0 pid=399622 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15317' (uid=0 pid=399841 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating ser"..., 8192) = 8192
mmap(NULL, 75132928, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a510f800000
munmap(0x7a5106800000, 75124736)        = 0
read(3, "ate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15329' (uid=0 pid=402640 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15330' (uid=0 pid=402851 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15330' (uid=0 pid=402851 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system"..., 8192) = 8192
mmap(NULL, 75141120, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a510b000000
munmap(0x7a510f800000, 75132928)        = 0
read(3, "reedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15343' (uid=0 pid=405936 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15343' (uid=0 pid=405936 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15344' (uid=0 pid=406154 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemo"..., 8192) = 8192
mmap(NULL, 75149312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7a5106800000
munmap(0x7a510b000000, 75141120)        = 0
read(3, "d service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15356' (uid=0 pid=408942 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15357' (uid=0 pid=409205 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return code 1\ndbus-daemon[1068]: [system] Activating service name='org.freedesktop.timedate1' requested by ':1.15357' (uid=0 pid=409205 comm=\"/usr/lib/snapd/snapd \") (using servicehelper)\ndbus-daemon[1068]: [system] Activated service 'org.freedesktop.timedate1' failed: Launch helper exited with unknown return c"..., 8192) = 8192
gdraheim commented 7 months ago

The proposed patch is correct anyway.

But what I see is a number of services that can not start in a container anyway. There are some "igno_" lists in the python-file that excludes some enabled units from being included in the init-list of modules. Do we need to extend that list?

Additionally, try to run the container as a command-shell and check what services are included in the init-list

systemctl3.py default-services

If you see anything that looks like coming from the original systemd then it should be kicked out.

Rid commented 7 months ago

Thanks, here's the output:

root@ubtvnc:/# systemctl default-services
appbox_dbus.service
cron.service
nginx.service
panel.service
php8.1-fpm.service
snapd.aa-prompt-listener.service
snapd.apparmor.service
snapd.autoimport.service
snapd.core-fixup.service
snapd.recovery-chooser-trigger.service
snapd.seeded.service
snapd.service
ssh.service
vnstat.service
cups.service
docker.service
fail2ban.service
plymouth.service
pulseaudio-enable-autospawn.service
rsync.service
unattended-upgrades.service
uuidd.service

Is snapd already included the in ignore list?

gdraheim commented 7 months ago

I never had containerized a snapd app. Probably kill the appboxdbus service as well as I am suspicious that it runs addtional services. May be you can test a bit with adding patterns to the igno-lists?

Rid commented 7 months ago

Snapd doesn't work in docker as it requires real systemd, so we can add

    igno_always = ["network*", "dbus*", "systemd-*", "kdump*", "kmod*", "snapd*"]

However some really big journal logs have been created for example:

-rw-r--r--+ 1 root systemd-journal  92M Mar 21 09:37 snapd.service.log

It seems now like it's using 100% CPU reading these log files as I can see it reading errors which have been resolved. Sorry for my ignorance, I'm not a systemd expert, but what's the reason we're reading the whole log file?

EDIT: Testing this on a few other containers with the issue, deleting the log files reduces the CPU usage from 100% to 0.7% even when services are failing, so this seems like the real issue regarding the resource usage.

Rid commented 7 months ago

Setting the buffersize to 10MB resolves the issue:

    def read_log_files(self, units):
        BUFSIZE=10485760

Are logs ever rotated? I know on normal systemd systemd-journald rotates the journal files

EDIT: Looks like it's not implemented (https://github.com/gdraheim/docker-systemctl-replacement/issues/41) are there any work arounds for this other than running a custom service to just rotate the log files (we would need to add jitter as we run many containers per node)?

gdraheim commented 7 months ago

Logs are not rotated, you are right, and the read-log-loop is about duplicating the log lines from each application service into the container log (docker log), so that messages from the applications become visible outside the container.

The current implementation runs the service processes with the log files attached, so it is not really possible to just rotate them as systemd does.

There has been just an idea floating around to punch holes in the log files aka sparse files. It may be possible to analyse systemd-journald taking over some idea. But I did not try so far. In reality it was good enough to fix the problems on the applications themselves that led to too many output lines.

Rid commented 7 months ago

Thanks @gdraheim, so I could use fallocate to in essence delete the logs from the offset (0) to the length? I could setup a service to do this once per day and it would suffice.

gdraheim commented 7 months ago

Well, that sounds like a reasonable workaround. I hope that one day, the systemctl initloop could throw away parts beyond a certain limit.

Rid commented 7 months ago

I am thinking to just extend the code in a fork to add a function to fallocate the log files if they're over 5MB (and truncate them to 5MB), I would add a last run variable to check if it has run during the current day, and if not schedule it for a random time.

it's probably too opinionated to be merged, but I can make a PR if you're interested.