kdave / btrfs-progs

Development of userspace BTRFS tools
GNU General Public License v2.0

btrfs balance start --enqueue busy waits #746

Closed: ssiloti closed this issue 4 months ago

ssiloti commented 4 months ago

When a btrfs balance start --enqueue command is issued while another balance is in progress, the enqueued process uses 100% of a CPU while waiting for the first balance to finish. I took a look with strace and found:

     0.000007 ioctl(3, BTRFS_IOC_FS_INFO, {max_id=7, num_devices=4, fsid=61cf73bc-b0cc-4667-ad1d-a3c6ac7ca23e, nodesize=16384, sectorsize=4096, clone_alignment=4096, flags=BTRFS_FS_INFO_FLAG_GENERATION|BTRFS_FS_INFO_FLAG_METADATA_UUID, generation=346672, metadata_uuid=61cf73bc-b0cc-4667-ad1d-a3c6ac7ca23e}) = 0
     0.000008 openat(AT_FDCWD, "/sys/fs/btrfs/61cf73bc-b0cc-4667-ad1d-a3c6ac7ca23e/exclusive_operation", O_RDONLY) = 5
     0.000008 lseek(5, 0, SEEK_SET)     = 0
     0.000012 read(5, "balance\n", 32)  = 8
     0.000007 close(5)                  = 0
     0.000006 pselect6(5, NULL, NULL, [4], {tv_sec=60, tv_nsec=0}, NULL) = 1 (except [4], left {tv_sec=59, tv_nsec=999999716})
     0.000008 pselect6(5, NULL, NULL, [4], {tv_sec=29, tv_nsec=999999000}, NULL) = 1 (except [4], left {tv_sec=29, tv_nsec=999998786})
     0.000007 ioctl(3, BTRFS_IOC_FS_INFO, {max_id=7, num_devices=4, fsid=61cf73bc-b0cc-4667-ad1d-a3c6ac7ca23e, nodesize=16384, sectorsize=4096, clone_alignment=4096, flags=BTRFS_FS_INFO_FLAG_GENERATION|BTRFS_FS_INFO_FLAG_METADATA_UUID, generation=346672, metadata_uuid=61cf73bc-b0cc-4667-ad1d-a3c6ac7ca23e}) = 0
     0.000008 openat(AT_FDCWD, "/sys/fs/btrfs/61cf73bc-b0cc-4667-ad1d-a3c6ac7ca23e/exclusive_operation", O_RDONLY) = 5
     0.000007 lseek(5, 0, SEEK_SET)     = 0
     0.000008 read(5, "balance\n", 32)  = 8
     0.000009 close(5)                  = 0
     0.000006 pselect6(5, NULL, NULL, [4], {tv_sec=60, tv_nsec=0}, NULL) = 1 (except [4], left {tv_sec=59, tv_nsec=999999786})
     0.000008 pselect6(5, NULL, NULL, [4], {tv_sec=29, tv_nsec=999999000}, NULL) = 1 (except [4], left {tv_sec=29, tv_nsec=999998784})

It looks like the select call, which is supposed to block until a timeout or an event, always returns immediately.

This is on Linux 6.6.17 with btrfs-progs 6.7.

kdave commented 4 months ago

Right, the time left after returning from select is 59.999999... and the file descriptor is set in the exception set. Using exceptfds for sysfs is recommended, but the file descriptor needs to be closed and opened again before another select.

kdave commented 4 months ago

There were two problems: the fd must be reopened, and the data must be read before calling select. Now fixed in devel, thanks for the report.
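
For readers who want to see the two points above in code, here is a minimal sketch of the pattern, assuming a plain select() with the fd in the exception set. The helper name read_then_wait and its parameters are hypothetical and this is not the actual btrfs-progs change; it only illustrates opening a fresh fd, reading the current value, and then waiting.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not the btrfs-progs code).
 * Opens a fresh fd for `path`, reads the current value into `buf`
 * (NUL-terminated), then waits in select() with the fd in the
 * exception set for up to `timeout_sec` seconds.
 *
 * Returns the select() result (>0: fd flagged, 0: timeout),
 * or -1 if open() or read() failed.
 */
static int read_then_wait(const char *path, char *buf, size_t buflen,
                          int timeout_sec)
{
    fd_set exceptfds;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    ssize_t n;
    int fd, ret;

    assert(path != NULL && buf != NULL && buflen > 0);

    /* Point 1 from the thread: use a newly opened fd for every wait. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Point 2 from the thread: read the data before calling select(). */
    n = read(fd, buf, buflen - 1);
    if (n < 0) {
        close(fd);
        return -1;
    }
    buf[n] = '\0';

    /* Wait for a change notification or the timeout. */
    FD_ZERO(&exceptfds);
    FD_SET(fd, &exceptfds);
    ret = select(fd + 1, NULL, NULL, &exceptfds, &tv);

    /* Close here so the next call reopens the file. */
    close(fd);
    return ret;
}
```

A caller would loop on this helper with the exclusive_operation path shown in the strace output, stopping once the value read into buf no longer names a running operation. Each iteration then either sleeps for the full timeout or wakes on a real change, instead of spinning.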