linuxserver / docker-dokuwiki

Massive disk space leaking until host runs out of disk space #32

Closed: Cydhra closed 3 years ago

Cydhra commented 3 years ago

The dokuwiki container grows in size indefinitely until the host cannot provide more disk space (and thus fails to operate normally). In my current setup I only have about 12 GB available for this specific container, and every few weeks this space is filled with 11-12 GB purely by the running container. The only way to free up this space is to delete the container and recreate it. The actual wiki content is saved on the host using a bind mount (see the command below), so it does not get lost, but caching, logging, or something else leaks space inside the container.


Expected Behavior

The container should either expose the path that leaks space to the host, or not leak at all.

Current Behavior

The container's writable overlay layer just grows until the host has no more disk space available, requiring the container to be deleted and recreated.

Steps to Reproduce

Run the container for a long period of time (the commands I use are included below).

Environment

OS: Linux 5.6.18-300.fc32.x86_64 #1 SMP Wed Jun 10 21:38:25 UTC 2020 x86_64 GNU/Linux
CPU architecture: x86_64
How docker service was installed: podman, provided by the Fedora repositories

Command used to create docker container (run/create/compose/screenshot)

podman create --name=dokuwiki -e PUID=1000 -e PGID=1000 -e TZ=Europe/Berlin -e APP_URL=/dokuwiki -p 8080:80 -v ./dokuwiki:/config:Z linuxserver/dokuwiki

podman start dokuwiki
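
For reference, a quick way to watch the writable layer grow between checks (a sketch; it assumes a podman version that supports the --size flag, with the container name from the create command above):

podman ps --size --filter name=dokuwiki
podman system df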

Docker logs

No logs relevant to this issue.

github-actions[bot] commented 3 years ago

Thanks for opening your first issue here! Be sure to follow the bug or feature issue templates!

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Cydhra commented 3 years ago

Why is an issue marked as stale that has not even been acknowledged by the repository owner?

thespad commented 3 years ago

Do you have any additional plugins installed?

It's possible this is a podman-specific issue, but I'll try to replicate it with docker.

If you currently have an affected container running, can you please provide the output of du -hxd1 / from inside the container?
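
Something along these lines should work from the host, assuming the container is named dokuwiki as in your create command (just a sketch):

podman exec -it dokuwiki du -hxd1 /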

Cydhra commented 3 years ago

Only the blockquote plugin, and I had the issue before installing it.

Cydhra commented 3 years ago

If you currently have an affected container running can you please provide the output of du -hxd1 / from inside the container.

Okay, I don't know why I haven't checked that myself yet (sorry). I traced the issue to the nginx error log (1.2 GB at this point). The entire error.log file is just nginx trying to bind 0.0.0.0:443 and 0.0.0.0:80 every three seconds, five times each, and failing with "Address in use":

2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:80 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:443 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:80 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:443 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:80 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:443 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:80 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:443 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:80 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: bind() to 0.0.0.0:443 failed (98: Address in use)
2021/03/12 02:58:18 [emerg] 8571#8571: still could not bind()

Interestingly, dokuwiki is still reachable: I forwarded port 80 to host port 8080 and I can reach the container from my network without any problem, so I am a bit confused about which server is trying to bind ports 80 and 443. Since the PID increases every time this happens (apparently nginx is restarted each time?), I listed the running processes with their PIDs and found that there is a working nginx instance, as well as a separate instance that apparently causes the issue (judging by its extremely high PID).

    PID COMMAND
 196012 ps axfo pid,args
      1 s6-svscan -t0 /var/run/s6/services
     27 s6-supervise s6-fdholderd
    317 s6-supervise nginx
 196003  \_ /bin/bash /usr/bin/with-contenv bash ./run
 196004      \_ /usr/sbin/nginx -c /config/nginx/nginx.conf
    316 s6-supervise php-fpm
    318 s6-supervise cron
    321  \_ /bin/bash /usr/bin/with-contenv bash ./run
    322      \_ /usr/sbin/crond -f -S -l 5 -c /etc/crontabs
    325 php-fpm: master process (/etc/php7/php-fpm.conf)
    349  \_ php-fpm: pool www
    348  \_ php-fpm: pool www
    350  \_ php-fpm: pool www
    328 nginx: master process /usr/sbin/nginx -c /config/nginx/nginx.conf
    347  \_ nginx: worker process
    345  \_ nginx: worker process
    344  \_ nginx: worker process
    346  \_ nginx: worker process
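
To pin down which process is actually holding ports 80 and 443, something like this should work (a sketch; it assumes a netstat with the -p option is available inside the image, as with busybox netstat in Alpine-based images):

podman exec dokuwiki netstat -lntp

Once the growth source is confirmed, truncating the offending log frees the space immediately (substitute whichever path du reported; the path below is only a placeholder):

podman exec dokuwiki truncate -s 0 /path/to/nginx/error.log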

thespad commented 3 years ago

That's weird, because the test container I spun up after posting my last message has nothing in its nginx error log, and its processes are:

    PID COMMAND
    659 bash
    680  \_ ps axfo pid,args
      1 s6-svscan -t0 /var/run/s6/services
     35 s6-supervise s6-fdholderd
    335 s6-supervise php-fpm
    338  \_ php-fpm: master process (/etc/php7/php-fpm.conf)
    364      \_ php-fpm: pool www
    363      \_ php-fpm: pool www
    333 s6-supervise nginx
    340  \_ nginx: master process /usr/sbin/nginx -c /config/nginx/nginx.conf
    361      \_ nginx: worker process
    360      \_ nginx: worker process
    362      \_ nginx: worker process
    359      \_ nginx: worker process
    336 s6-supervise cron
    341  \_ /usr/sbin/crond -f -S -l 5 -c /etc/crontabs

I'm wondering if podman is doing something weird that's breaking s6, because both nginx and php-fpm appear to be detached from their supervise processes in your container.

j0nnymoe commented 3 years ago

Podman doesn't like containers running as root. I think there is some flag you can pass which allows podman to run as root.

Cydhra commented 3 years ago

I have now deleted both the dokuwiki container and its image, then recreated them. For now, the processes no longer detach from the s6 supervisor and no additional error logs are being written. If the issue does not come up again, I'll assume that I was running an old version of the image that was faulty under podman.

However, since I redownloaded the image, I can no longer access the page, and nginx (the correct instance this time) outputs the following error:

2021/04/13 12:18:52 [error] 357#357: *1 FastCGI sent in stderr: "PHP message: PHP Fatal error:  Uncaught Error: Class 'dokuwiki\plugin\config\core\Setting\Setting' not found in /app/dokuwiki/inc/deprecated.php:61
Stack trace:
#0 /app/dokuwiki/inc/load.php(38): require_once()
#1 /app/dokuwiki/inc/init.php(200): require_once('/app/dokuwiki/i...')
#2 /app/dokuwiki/doku.php(36): require_once('/app/dokuwiki/i...')
#3 {main}
  thrown in /app/dokuwiki/inc/deprecated.php on line 61" while reading response header from upstream, client: 192.168.88.253, server: _, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "192.168.88.243:8080"

I'm assuming I have an old version of a plugin installed that is now running against the new image version, because it is stored in my bind mount on the host. What is the correct way to update the image without losing data?

thespad commented 3 years ago

There were a lot of breakages when they moved to their "Hogfather" release. I believe the best way to do it is to back up the /config/dokuwiki/conf and /config/dokuwiki/data directories, clean your /config mount, run the container once to create the folder structure, then copy your directories back.
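
Roughly, with the bind mount from the create command earlier in this thread (./dokuwiki on the host mounted at /config), the procedure looks something like this; the backup directory names are only placeholders:

podman stop dokuwiki
# back up configuration and wiki data from the bind mount
cp -a ./dokuwiki/dokuwiki/conf ./backup-conf
cp -a ./dokuwiki/dokuwiki/data ./backup-data
# clean the /config mount
rm -rf ./dokuwiki/*
# run the container once so it recreates the folder structure, then stop it again
podman start dokuwiki
podman stop dokuwiki
# restore the backed-up directories and bring the container back up
cp -a ./backup-conf/. ./dokuwiki/dokuwiki/conf/
cp -a ./backup-data/. ./dokuwiki/dokuwiki/data/
podman start dokuwiki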

You can also try rolling back the image to one of the 2018-04-22c tags, using the "upgrade" plugin to upgrade your install inside the container, and then rolling forward to the latest image.
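
For the rollback route, pinning the image to a specific tag would look roughly like this (the exact 2018-04-22c tag name has to be taken from the image's tag list; the placeholder below is not a real tag):

podman pull docker.io/linuxserver/dokuwiki:<2018-04-22c-tag>
podman create --name=dokuwiki [same options as before] docker.io/linuxserver/dokuwiki:<2018-04-22c-tag>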