knorrie / python-btrfs

Python Btrfs module
GNU Lesser General Public License v3.0

Subvolumes (!= '/' or '/@') break munin plugin #8

Closed linuxrrze closed 7 years ago

linuxrrze commented 7 years ago

Hi there! I just found python-btrfs (v8) and enabled the munin plugin on some test nodes.

While working fine on a few nodes, I got problems on my workstation:

linux # /etc/munin/plugins/btrfs_usage 
Traceback (most recent call last):
  File "/etc/munin/plugins/btrfs_usage", line 119, in <module>
    main()
  File "/etc/munin/plugins/btrfs_usage", line 106, in main
    for fs in btrfs.utils.mounted_filesystems():
  File "/usr/lib/python3/dist-packages/btrfs/utils.py", line 34, in mounted_filesystems
    fs = btrfs.ctree.FileSystem(path)
  File "/usr/lib/python3/dist-packages/btrfs/ctree.py", line 461, in __init__
    _fs_info = self.fs_info()
  File "/usr/lib/python3/dist-packages/btrfs/ctree.py", line 467, in fs_info
    return btrfs.ioctl.fs_info(self.fd)
  File "/usr/lib/python3/dist-packages/btrfs/ioctl.py", line 113, in fs_info
    fcntl.ioctl(fd, IOC_FS_INFO, buf)
OSError: [Errno 25] Inappropriate ioctl for device

So I took a look at the code (that's reading /proc/self/mounts to get btrfs mounted filesystems).

The output on the broken machine is this:

linux # cat /proc/self/mounts | grep " btrfs "
/dev/mapper/linux9--vg-root / btrfs rw,relatime,ssd,space_cache,subvolid=257,subvol=/@ 0 0
/dev/mapper/linux9--vg-root /home btrfs rw,relatime,ssd,space_cache,subvolid=258,subvol=/@home 0 0
/dev/sdd1 /proj.stand/speed btrfs rw,relatime,nodatasum,nodatacow,ssd,space_cache,subvolid=5,subvol=/ 0 0
/dev/sda1 /proj.stand/scratch btrfs rw,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/mapper/linux9--vg-root /var/lib/docker/btrfs btrfs rw,relatime,ssd,space_cache,subvolid=257,subvol=/@/var/lib/docker/btrfs 0 0
/dev/mapper/crypto /proj.stand/scratch/backup-mnt btrfs rw,relatime,space_cache,subvolid=5,subvol=/ 0 0

The only thing different on my workstation is the use of (non-standard) subvols. (The machine runs Ubuntu 16.04 with /home (and docker) on separate subvols.)

So I extended the applied filtering to only allow subvols / and /@ and the plugin started working.

Diff is here:

linux # diff /usr/lib/python3/dist-packages/btrfs/utils.py  /usr/lib/python3/dist-packages/btrfs/utils.py.ok
33c33,34
<     for path in [mount[1] for mount in mounts if mount[2] == 'btrfs']:
---
>     for path in [mount[1] for mount in mounts if (mount[2] == 'btrfs' and mount[3].endswith('subvol=/') or mount[3].endswith('subvol=/@') )]:

Maybe the other subvolumes just need to be handled differently (or do not provide any additional information anyway). Hope this helps to fix this behaviour.
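
A note on the filter in the diff above: in Python, `and` binds more tightly than `or`, so that condition is evaluated as `(btrfs and subvol=/) or subvol=/@`. The following is only a sketch of the same filter with the grouping made explicit; the helper name `matches_subvol_filter` is hypothetical, and the sample entries are split lines taken from the /proc/self/mounts output above.

```python
def matches_subvol_filter(mount):
    """mount is the list of whitespace-separated fields of one mounts line."""
    fstype, options = mount[2], mount[3]
    # Explicit parentheses: the filesystem type check applies to both suffixes.
    return fstype == 'btrfs' and (
        options.endswith('subvol=/') or options.endswith('subvol=/@')
    )


sample = [
    "/dev/sda1 /proj.stand/scratch btrfs rw,relatime,space_cache,subvolid=5,subvol=/ 0 0".split(),
    "/dev/mapper/linux9--vg-root /home btrfs rw,relatime,ssd,space_cache,subvolid=258,subvol=/@home 0 0".split(),
]

# Keep only the mount points whose options pass the filter.
paths = [mount[1] for mount in sample if matches_subvol_filter(mount)]
print(paths)
```

With the sample above, /home is dropped because its options end in `subvol=/@home`, which matches neither suffix.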

knorrie commented 7 years ago

Thanks for testing!

Can you put a print statement in to see which path triggers the error? I'm a bit puzzled about what's happening here, since any mount point where any btrfs is mounted (regardless of the subvolume) should answer to the fs_info ioctl.

What I can think of is a race condition scenario, where after getting the info from mounts something gets unmounted, after which the code tries to execute a btrfs ioctl on the mountpoint, which is then another filesystem.

knorrie commented 7 years ago

FYI, I have several filesystems with subvolumes mounted on different locations, on which the code runs just fine. After getting the list, the fsid uuids are compared to deduplicate them, so that you get only one graph per filesystem.
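
The deduplication idea described here can be sketched roughly as follows. This is not the library's actual code: `deduplicate_by_fsid` and the `get_fsid` callable are hypothetical names, and the fsid strings are placeholders. In practice the fsid would come from the result of the fs_info ioctl.

```python
def deduplicate_by_fsid(paths, get_fsid):
    """Return a dict mapping each fsid to the first mount point seen for it."""
    first_path_for = {}
    for path in paths:
        fsid = get_fsid(path)
        # setdefault keeps the first path and ignores later duplicates.
        first_path_for.setdefault(fsid, path)
    return first_path_for


# Placeholder lookup table standing in for a real fs_info call.
fake_fsids = {
    '/': 'fsid-A',
    '/home': 'fsid-A',  # another subvolume of the same filesystem
    '/proj.stand/scratch': 'fsid-B',
}

unique = deduplicate_by_fsid(fake_fsids, fake_fsids.get)
print(unique)
```

Two subvolumes of one filesystem share an fsid, so only one entry (and one graph) remains per filesystem.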

linuxrrze commented 7 years ago

I just added a try/except statement in mounted_filesystems() (utils.py):

According to this (only) my /home btrfs subvolume causes the error:

[Errno 25] Inappropriate ioctl for device

linux # cat /proc/self/mounts | grep /home
/dev/mapper/linux9--vg-root /home btrfs rw,relatime,ssd,space_cache,subvolid=258,subvol=/@home 0 0
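
The exact try/except that was added is not shown in this thread. A minimal sketch of the approach, assuming a hypothetical helper `find_failing_paths` that takes any callable as `probe`, could look like this:

```python
import errno


def find_failing_paths(paths, probe):
    """Call probe(path) for each path and collect those that raise OSError."""
    failures = []
    for path in paths:
        try:
            probe(path)
        except OSError as exc:
            # Record which path failed instead of aborting the whole loop.
            failures.append((path, exc.errno, exc.strerror))
    return failures


def fake_probe(path):
    """Stand-in for a real probe; fails only for /home, as in the report."""
    if path == '/home':
        raise OSError(errno.ENOTTY, 'Inappropriate ioctl for device')


failures = find_failing_paths(['/', '/home'], fake_probe)
print(failures)
```

On a real system, the constructor seen in the traceback (`btrfs.ctree.FileSystem`) could be passed as `probe` in place of the stand-in.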

linuxrrze commented 7 years ago

Just for completeness: for all btrfs filesystems listed under /sys, btrfs_usage reports valid data:

linux # ls -ld /sys/fs/btrfs/*
drwxr-xr-x 5 root root 0 Jul 23 13:50 /sys/fs/btrfs/a224770c-68ef-4829-9497-850e528a155d
drwxr-xr-x 5 root root 0 Jul 23 13:50 /sys/fs/btrfs/aca34d32-d6ad-43bd-a9bb-5b8f273d56a7
drwxr-xr-x 5 root root 0 Jul 23 13:50 /sys/fs/btrfs/bf3614ff-dd89-4ff3-a0a3-f0c25cb7d80d
drwxr-xr-x 5 root root 0 Jul 23 13:50 /sys/fs/btrfs/d196dfb9-2268-4c8d-9648-87bd5a8d54ef
drwxr-xr-x 2 root root 0 Jul 23 13:50 /sys/fs/btrfs/features

knorrie commented 7 years ago

So, apparently, opening a file descriptor to /home and then calling the fs_info ioctl on it errors, even though /home is a btrfs filesystem.

That means that the following command (eliminating all python code from the equation here) would also have problems, since it does the same.

btrfs fi usage /home

If you strace this, you should see lines like:

open("/home", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
[...]
ioctl(3, BTRFS_IOC_FS_INFO, {max_id=1, num_devices=1, fsid=[...some uuid...], nodesize=16384, sectorsize=4096, clone_alignment=4096}) = 0

If that succeeds, then the smallest possible test case in python that already triggers the fs_info ioctl would just be this:

-# python3
>>> import btrfs
>>> fs = btrfs.FileSystem('/home')

knorrie commented 7 years ago

What kernel / machine architecture is this, by the way? (Output of uname -a?)

linuxrrze commented 7 years ago

I have to apologize: Your latest comments made me notice my (rather obvious) error:

I once installed my workstation with a local /home directory (on btrfs -> @home), however this was changed to an automounter-based solution in the meantime.

So /home is still mounted during boot (btrfs @home, based on the fstab entry), however later on autofs takes over /home (and mounts something completely different there).

Mounting the subvolume to a different path and doing a btrfs fi usage works quite fine, while doing the same on the autofs /home fails miserably.

I should have noticed this much earlier - sorry again for all the (unnecessary) mess.
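
The situation described above, where another filesystem is mounted on top of a btrfs mount point, can in principle be spotted in /proc/self/mounts itself. The sketch below assumes that entries are listed in mount order, so the last entry for a mount point is the one actually visible there. The function names are hypothetical, and the autofs line in the sample is made up for illustration.

```python
def topmost_mounts(mounts_text):
    """Map each mount point to the fields of the last entry listed for it."""
    topmost = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # A later entry for the same mount point replaces the earlier one.
        topmost[fields[1]] = fields
    return topmost


def visible_btrfs_mountpoints(mounts_text):
    """Return mount points whose topmost filesystem is btrfs."""
    return [path for path, fields in topmost_mounts(mounts_text).items()
            if fields[2] == 'btrfs']


sample = """\
/dev/mapper/linux9--vg-root / btrfs rw,relatime,ssd,space_cache,subvolid=257,subvol=/@ 0 0
/dev/mapper/linux9--vg-root /home btrfs rw,relatime,ssd,space_cache,subvolid=258,subvol=/@home 0 0
auto.home /home autofs rw,relatime 0 0
"""

visible = visible_btrfs_mountpoints(sample)
print(visible)
```

With this sample, /home is excluded because the last entry for it is autofs, which is why grepping the mounts file for " btrfs " alone did not reveal the problem.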