containerd / zfs

ZFS snapshotter plugin for containerd
Apache License 2.0

ZFS datasets not cleaned up #70

Closed Jeroen0494 closed 1 year ago

Jeroen0494 commented 1 year ago

Hi,

It seems ZFS datasets from old containers aren't cleaned up by the snapshotter. Every time I reboot my server, new datasets are created for the new containers, and the old ones aren't cleaned up.

Right now I run the following script every now and then to clean up old datasets:

# Get currently mounted ZFS filesystems (sorted, because comm requires sorted input)
mount | grep -oP "rpool/containerd/[0-9]+" | sort -u > zfs_mounts.txt

# Get ZFS datasets (sorted the same way)
zfs list -H -r rpool/containerd -o name | grep -oP "rpool/containerd/[0-9]+" | sort -u > zfs_datasets.txt

# Compare: keep datasets that exist but are not mounted
comm -13 zfs_mounts.txt zfs_datasets.txt > zfs_to_delete.txt

# Perform cleanup (xargs -r: do nothing if the list is empty)
xargs -r -I {} zfs destroy -r {} < zfs_to_delete.txt

Also, the number used for the datasets is ever increasing; old numbers aren't reused. Because of this bug, my dataset numbers are now well into the 90,000s.
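The script above compares two lists with `comm`, which expects sorted input, and it destroys datasets without a review step. A minimal dry-run sketch follows, assuming the same `rpool/containerd` layout; the `stale_datasets` function name and the file names in the comments are hypothetical. It uses an exact-line `grep` filter instead of `comm`, so input order does not matter.

```shell
# Dry-run sketch: print datasets that exist but are not currently mounted.
# ROOT is an assumption taken from the script above; adjust it to your pool.
ROOT="${ROOT:-rpool/containerd}"

# $1: file listing mounted dataset names, one per line
# $2: file listing all dataset names, one per line
# Prints every name from $2 that has no exact match in $1.
stale_datasets() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    sort -u "$2" | grep -vxFf "$1" || true
}

# Example wiring (not run automatically):
#   mount | grep -oP "$ROOT/[0-9]+" > mounted.txt
#   zfs list -H -r "$ROOT" -o name | grep -oP "$ROOT/[0-9]+" > all.txt
#   stale_datasets mounted.txt all.txt      # review this list first
#   stale_datasets mounted.txt all.txt | xargs -r -n1 zfs destroy -r
```

Printing the candidates before piping them into `zfs destroy -r` gives a chance to spot datasets that are unmounted but still in use as clone origins.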

AkihiroSuda commented 1 year ago

Which version of containerd?

Jeroen0494 commented 1 year ago

Hi,

The latest version supported by Ubuntu 22.04:

jeroen@mediaserver:~$ containerd --version
containerd github.com/containerd/containerd 1.5.9-0ubuntu3.1 
jeroen@mediaserver:~$ cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Jeroen0494 commented 1 year ago

I set up the Docker repository and am now running containerd version 1.6.14-1. It seems to work with Kubernetes v1.25.5+k3s1, but let's see how this one goes over a couple of reboots.

Jeroen0494 commented 1 year ago

After using containerd 1.6.14-1 for a while now, I can confirm this bug is still present. It takes a long time to boot my system because grub evaluates all the datasets. The total number of datasets keeps growing.

$ containerd --version
containerd containerd.io 1.6.14 9ba4b250366a5ddde94bb7c9d1def331423aa323

morganchristiansson commented 1 year ago

I have 37,000+ snapshots and it takes ~20 minutes to boot my server. Containers take forever to start, seemingly because zfs clone is slow.

I'm using containerd with k3s Kubernetes. Not many containers run on this node, which is mainly for storage.

Nuking /var/lib/containerd and letting k3s re-create all containers made things fast again!

Jeroen0494 commented 1 year ago

It seems containerd 1.6.14 9ba4b250366a5ddde94bb7c9d1def331423aa323 fixes my issues. Running a manual container image pruning and removing NotReady containers cleans up most of the datasets. It just takes a while before this plugin starts cleaning up old datasets.
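The manual pruning described above could look roughly like the sketch below, using `crictl`. This is an assumption about how the cleanup was done, not the commands actually used: it reads "NotReady containers" as NotReady pod sandboxes, and the `CRICTL` variable and `prune_cri` function are hypothetical names. On k3s the bundled client is usually invoked as `k3s crictl`.

```shell
# Sketch: prune unused images and remove NotReady pod sandboxes via the CRI client.
# CRICTL is an assumption; set CRICTL="k3s crictl" on a k3s node.
CRICTL="${CRICTL:-crictl}"

prune_cri() {
    # Remove images that no container references.
    $CRICTL rmi --prune
    # Remove each pod sandbox currently reported as not ready.
    for pod in $($CRICTL pods --state notready --quiet); do
        $CRICTL rmp "$pod"
    done
}

# Example (not run automatically):
#   CRICTL="k3s crictl" prune_cri
```

Once the images and sandboxes are gone, containerd's garbage collector can release the corresponding snapshots, which may take a while, as noted above.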

Jeroen0494 commented 1 year ago

@morganchristiansson Do you still experience this issue with containerd 1.6.14? Otherwise I'll close this issue.

morganchristiansson commented 1 year ago

I only notice it on reboot, which is infrequent, and I moved /var/lib/containerd to a faster SSD drive.

I'm using Ubuntu, where the latest packaged version is containerd 1.6.12. Is this enough? https://launchpad.net/ubuntu/+source/containerd

I guess I'll need to switch to the Docker repository: https://docs.docker.com/engine/install/ubuntu/

I'll get back to you...

morganchristiansson commented 1 year ago

I am running sanoid, which has created snapshots of all datasets, and this is causing deletion to fail. I have configured sanoid to not create snapshots and will see how it goes.

Mar 20 01:32:30 morgan-server containerd[1984171]: time="2023-03-20T01:32:30.956166202+01:00" level=warning msg="snapshot garbage collection failed" error="exit status 1: \"/usr/sbin/zfs fs destroy puma/containerd/301\" => cannot destroy 'puma/containerd/301': filesystem has children\nuse '-r' to destroy the following datasets:\npuma/containerd/301@autosnap_2023-03-16_08:17:55_hourly\npuma/containerd/301@autosnap_2023-03-15_16:18:09_hourly\npuma/containerd/301@autosnap_2023-03-14_22:33:04_daily\npuma/containerd/301@autosnap_2023-03-07_22:17:10_daily\npuma/containerd/301@autosnap_2023-03-15_23:03:44_hourly\npuma/containerd/301@autosnap_2023-03-16_12:07:45_hourly\npuma/containerd/301@autosnap_2023-03-07_22:17:10_monthly\npuma/containerd/301@autosnap_2023-03-14_04:21:14_daily\npuma/containerd/301@autosnap_2023-03-14_14:49:23_daily\npuma/containerd/301@autosnap_2023-03-16_16:06:51_hourly\npuma/containerd/301@autosnap_2023-03-13_10:28:41_daily\npuma/containerd/301@autosnap_2023-03-12_12:19:49_daily\npuma/containerd/301@autosnap_2023-03-14_20:41:54_daily\npuma/containerd/301@autosnap_2023-03-13_11:48:18_monthly\npuma/containerd/301@autosnap_2023-03-15_18:03:45_hourly\npuma/containerd/301@autosnap_2023-03-16_17:04:45_hourly\npuma/containerd/301@autosnap_2023-03-10_00:17:30_daily\npuma/containerd/301@autosnap_2023-03-16_01:33:56_monthly\npuma/containerd/301@autosnap_2023-03-14_11:42:29_monthly\npuma/containerd/301@autosnap_2023-03-16_03:38:15_monthly\npuma/containerd/301@autosnap_2023-03-10_10:15:25_monthly\npuma/containerd/301@autosnap_2023-03-10_22:31:32_daily\npuma/containerd/301@autosnap_2023-03-16_00:06:44_hourly\npuma/containerd/301@autosnap_2023-03-12_12:19:49_monthly\npuma/containerd/301@autosnap_2023-03-16_15:04:13_hourly\npuma/containerd/301@autosnap_2023-03-16_02:05:52_hourly\npuma/containerd/301@autosnap_2023-03-16_03:03:27_hourly\npuma/containerd/301@autosnap_2023-03-15_00:03:00_daily\npuma/containerd/301@autosnap_2023-03-13_10:28:41_monthly\npuma/containerd/301@autosnap_2023-03-01_00:03:27_monthly\npuma/containerd/301@autosnap_2023-03-16_19:01:45_hourly\npuma/containerd/301@autosnap_2023-03-16_03:38:15_hourly\npuma/containerd/301@autosnap_2023-03-15_16:18:09_daily\npuma/containerd/301@autosnap_2023-03-16_01:04:54_hourly\npuma/containerd/301@autosnap_2023-03-13_00:00:58_daily\npuma/containerd/301@autosnap_2023-03-11_03:30:28_daily\npuma/containerd/301@autosnap_2023-03-16_14:00:57_hourly\npuma/containerd/301@autosnap_2023-03-16_18:18:21_hourly\npuma/containerd/301@autosnap_2023-03-16_12:15:16_hourly\npuma/containerd/301@autosnap_2023-03-12_14:20:08_daily\npuma/containerd/301@autosnap_2023-03-11_00:02:10_daily\npuma/containerd/301@autosnap_2023-03-16_01:33:56_daily\npuma/containerd/301@autosnap_2023-03-11_03:30:28_monthly\npuma/containerd/301@autosnap_2023-03-14_22:16:22_daily\npuma/containerd/301@autosnap_2023-03-14_20:41:54_monthly\npuma/containerd/301@autosnap_2023-03-12_22:36:42_monthly\npuma/containerd/301@autosnap_2023-03-09_08:18:13_monthly\npuma/containerd/301@autosnap_2023-03-15_17:01:50_hourly\npuma/containerd/301@autosnap_2023-03-16_13:15:30_daily\npuma/containerd/301@autosnap_2023-03-15_16:18:09_monthly\npuma/containerd/301@autosnap_2023-03-14_11:42:29_daily\npuma/containerd/301@autosnap_2023-03-11_18:17:43_monthly\npuma/containerd/301@autosnap_2023-03-15_20:05:39_hourly\npuma/containerd/301@autosnap_2023-03-14_04:21:14_monthly\npuma/containerd/301@autosnap_2023-03-14_22:33:04_monthly\npuma/containerd/301@autosnap_2023-03-16_09:01:11_hourly\npuma/containerd/301@autosnap_2023-03-16_11:01:59_hourly\npuma/containerd/301@autosnap_2023-03-11_10:30:33_daily\npuma/containerd/301@autosnap_2023-03-12_00:01:14_daily\npuma/containerd/301@autosnap_2023-03-15_12:45:12_daily\npuma/containerd/301@autosnap_2023-03-16_04:03:20_hourly\npuma/containerd/301@autosnap_2023-03-14_22:16:22_monthly\npuma/containerd/301@autosnap_2023-03-10_00:17:30_monthly\npuma/containerd/301@autosnap_2023-03-17_02:00:17_hourly\npuma/containerd/301@autosnap_2023-03-16_03:38:15_daily\npuma/containerd/301@autosnap_2023-03-10_22:31:32_monthly\npuma/containerd/301@autosnap_2023-03-11_18:17:43_daily\npuma/containerd/301@autosnap_2023-03-15_21:01:58_hourly\npuma/containerd/301@autosnap_2023-03-16_05:05:22_hourly\npuma/containerd/301@autosnap_2023-03-12_14:20:08_monthly\npuma/containerd/301@autosnap_2023-03-16_11:01:59_daily\npuma/containerd/301@autosnap_2023-03-16_00:06:44_daily\npuma/containerd/301@autosnap_2023-03-15_16:01:28_hourly\npuma/containerd/301@autosnap_2023-03-16_23:01:53_hourly\npuma/containerd/301@autosnap_2023-03-16_09:01:11_monthly\npuma/containerd/301@autosnap_2023-03-10_18:34:03_monthly\npuma/containerd/301@autosnap_2023-03-13_11:48:18_daily\npuma/containerd/301@autosnap_2023-03-16_13:15:30_hourly\npuma/containerd/301@autosnap_2023-03-17_00:02:08_daily\npuma/containerd/301@autosnap_2023-03-16_22:00:34_hourly\npuma/containerd/301@autosnap_2023-03-10_14:15:44_monthly\npuma/containerd/301@autosnap_2023-03-16_08:02:58_hourly\npuma/containerd/301@autosnap_2023-03-16_21:07:36_hourly\npuma/containerd/301@autosnap_2023-03-10_00:02:33_daily\npuma/containerd/301@autosnap_2023-03-16_09:01:11_daily\npuma/containerd/301@autosnap_2023-03-15_12:45:12_monthly\npuma/containerd/301@autosnap_2023-03-08_00:02:38_daily\npuma/containerd/301@autosnap_2023-03-16_11:01:59_monthly\npuma/containerd/301@autosnap_2023-03-09_08:18:13_daily\npuma/containerd/301@autosnap_2023-03-16_20:01:57_hourly\npuma/containerd/301@autosnap_2023-03-09_00:05:04_daily\npuma/containerd/301@autosnap_2023-03-15_19:01:09_hourly\npuma/containerd/301@autosnap_2023-03-16_06:04:26_hourly\npuma/containerd/301@autosnap_2023-03-17_01:02:45_hourly\npuma/containerd/301@autosnap_2023-03-14_14:49:23_monthly\npuma/containerd/301@autosnap_2023-03-16_07:01:29_hourly\npuma/containerd/301@autosnap_2023-03-14_00:09:44_daily\npuma/containerd/301@autosnap_2023-03-17_00:02:08_hourly\npuma/containerd/301@autosnap_2023-03-10_14:15:44_daily\npuma/containerd/301@autosnap_2023-03-11_10:30:33_monthly\npuma/containerd/301@autosnap_2023-03-16_10:02:52_hourly\npuma/containerd/301@autosnap_2023-03-15_22:04:24_hourly\npuma/containerd/301@autosnap_2023-03-16_01:33:56_hourly\npuma/containerd/301@autosnap_2023-03-10_10:15:25_daily\npuma/containerd/301@autosnap_2023-03-10_09:35:26_daily\npuma/containerd/301@autosnap_2023-03-16_18:00:42_hourly\npuma/containerd/301@autosnap_2023-03-16_13:15:30_monthly\npuma/containerd/301@autosnap_2023-03-12_22:36:42_daily\npuma/containerd/301@autosnap_2023-03-10_18:34:03_daily\npuma/containerd/301@autosnap_2023-03-10_09:35:26_monthly\n" snapshotter=zfs
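The log above shows the garbage collector running `zfs destroy` without `-r`, which fails as long as the dataset has snapshots of its own. A small sketch for finding the sanoid-created snapshots that are in the way follows; the `filter_autosnaps` name is hypothetical, and the `@autosnap_` prefix and the `puma/containerd` dataset are taken from the log line.

```shell
# Sketch: pick out sanoid-style snapshots from a list of snapshot names.
# Reads names on stdin and prints only those containing "@autosnap_";
# any other snapshots are left out of the result.
filter_autosnaps() {
    grep -F '@autosnap_' || true
}

# Example (not run automatically; review the list before destroying anything):
#   zfs list -H -t snapshot -o name -r puma/containerd | filter_autosnaps
#   zfs list -H -t snapshot -o name -r puma/containerd | filter_autosnaps \
#       | xargs -r -n1 zfs destroy
```

Filtering on the `@autosnap_` prefix keeps the cleanup limited to snapshots that sanoid created, so that anything the snapshotter itself manages is not touched.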

morganchristiansson commented 1 year ago

Seems to be OK now. It has shrunk to just over 200 datasets under /var/lib/container, down from 2000+.