openaustralia / morph

Take the hassle out of web scraping
https://morph.io
GNU Affero General Public License v3.0

Add a script to clean up tmp directories in Docker AUFS layers #1117

Closed auxesis closed 7 years ago

auxesis commented 7 years ago

Per the Docker AUFS bug affecting us in #1104, disk space is slowly being eaten up by container data that docker system prune --all --force doesn't clean up.

Ideally we could just nuke all AUFS data in /var/lib/docker/aufs/diff, but the Morph box is running two types of workloads: scrapers and long-running processes. If we blindly nuke the AUFS data, we may accidentally nuke persistent data from a long-running process (like a Discourse upload, or a Postgres database 😬).

We can take advantage of the UNIX convention that data stored in /tmp is ephemeral, by only nuking /tmp directories in any image stored in /var/lib/docker/aufs/diff. Any containers working with data in /tmp should generally recover seamlessly, and this gives us a significant space saving.

By running this script once, I managed to recover ~140GB of disk space.


This script finds any tmp directories under filesystem layers in /var/lib/docker/aufs/diff with more than 1MB of usage, and deletes them.

It also prints out the number of files deleted, and the amount of space recovered by running the script.
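For readers who want a feel for the approach, here is a minimal Python sketch of the behaviour described above. It is not the script from this PR. The function names (dir_usage, clean_tmp_dirs), the dry_run flag, and the choice to look only at a top-level tmp directory in each layer are assumptions made for illustration; the path and the 1MB threshold come from the description.

```python
import os
import shutil
from pathlib import Path

# Path and threshold taken from the PR description; everything else is illustrative.
AUFS_DIFF_ROOT = Path("/var/lib/docker/aufs/diff")
MIN_BYTES = 1024 * 1024  # 1MB


def dir_usage(path):
    """Return (file_count, total_bytes) for files under path, not following symlinks."""
    count, total = 0, 0
    for root, _dirs, files in os.walk(path, followlinks=False):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
                count += 1
            except OSError:
                pass  # file disappeared while we were walking
    return count, total


def clean_tmp_dirs(diff_root=AUFS_DIFF_ROOT, min_bytes=MIN_BYTES, dry_run=True):
    """Delete <layer>/tmp directories using more than min_bytes.

    Returns (files_deleted, bytes_recovered). With dry_run=True (the default)
    nothing is removed and the totals show what would have been deleted.
    Assumption: only a top-level tmp directory in each layer is considered.
    """
    files_deleted, bytes_recovered = 0, 0
    diff_root = Path(diff_root)
    if not diff_root.is_dir():
        return 0, 0
    for layer in sorted(diff_root.iterdir()):
        tmp = layer / "tmp"
        # Skip layers without a real tmp directory (and never follow a symlink).
        if tmp.is_symlink() or not tmp.is_dir():
            continue
        count, size = dir_usage(tmp)
        if size <= min_bytes:
            continue
        if not dry_run:
            shutil.rmtree(tmp)
        files_deleted += count
        bytes_recovered += size
    return files_deleted, bytes_recovered
```

Calling clean_tmp_dirs() with the defaults only reports what would be removed; passing dry_run=False performs the deletion, which needs root on a real Docker host. Defaulting to a dry run is a deliberate choice here, since a mistake in the root path would be destructive.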

This PR installs the script as a cron job that runs once a day.

auxesis commented 7 years ago

I'm closing this, because it turns out: