Zygo / bees

Best-Effort Extent-Same, a btrfs dedupe agent
GNU General Public License v3.0
689 stars 56 forks source link

Getting frequent exception type std::system_error: btrfs-extent-same: BtrfsExtentSame {...} at fs.cc:181: Invalid argument #212

Closed farblos closed 2 years ago

farblos commented 2 years ago

Kernel (running on ARM 64bit):

[opc@sbptest bees-0.7]$ uname -a
Linux sbptest 5.4.17-2136.301.1.3.el7uek.aarch64 #2 SMP Mon Nov 29 21:11:59 PST 2021 aarch64 aarch64 aarch64 GNU/Linux

Bees: 0.7 running through beesd and systemd unit.

Distribution: Oracle Linux Server release 7.9

During a test done with cpio while bees is running I frequently get the following exceptions in the journal:

Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: --- BEGIN TRACE --- exception ---
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: Found matching range: BeesRangePair: 32K src[0x2000..0xa000] dst[0x2000..0xa000]
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: src = 259 /run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/hp_ux_ia64/p30432118_190000_HPUX-IA64.zip
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: dst = 286 /run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: creating brp (4K [0x2000..0x3000] fid = 5:1671 fd = 259 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/hp_ux_ia64/p30432118_190000_HPUX-IA64.zip', 4K [0x2000..0x3000] fid = 5:1689 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip')
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: Opening src bfr 4K [0x2000..0x3000] fid = 5:1671 fd = 259 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/hp_ux_ia64/p30432118_190000_HPUX-IA64.zip'
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: chase_extent_ref ino BtrfsInodeOffsetRoot { .m_inum = 1671, .m_offset = 0x2000, .m_root = 5 } bbd BeesBlockData { 4K 0x2000 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip', data[4096] }
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: ino_off_root BtrfsInodeOffsetRoot { .m_inum = 1671, .m_offset = 0x2000, .m_root = 5 }
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: for_each_extent_ref BeesBlockData { 4K 0x2000 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip', data[4096] } at 0x200da2000: 2 found
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: overlap_bfr 0 (..0x0]
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: Opening dst bfr 4K [0x2000..0x3000] fid = 5:1689 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip'
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: replace_dst dst_bfr 4K [0x2000..0x3000] fid = 5:1689 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip'
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: finding one match (out of 2) at 0x200da2000 for BeesBlockData { 4K 0x2000 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip', address = 0x33f082000, hash = 0x634e8b0fff9832bf, data[4096] }
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: resolved_addrs.size() = 1
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: Pushing hash 0x634e8b0fff9832bf addr 0x33f082000 bbd BeesBlockData { 4K 0x2000 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip', address = 0x33f082000, hash = 0x634e8b0fff9832bf, data[4096] }
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: scan bbd BeesBlockData { 4K 0x2000 fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip', address = 0x33f082000, hash = 0x634e8b0fff9832bf, data[4096] }
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: Extent { begin = 0x0, end = 0xbc5c3, physical = 0x33f080000, flags = FIEMAP_EXTENT_LAST, physical_len = 0xc0000, logical_len = 0xc0000 } block_count 189
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: scan extent Extent { begin = 0x0, end = 0xbc5c3, physical = 0x33f080000, flags = FIEMAP_EXTENT_LAST, physical_len = 0xc0000, logical_len = 0xc0000 }
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]:
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]:
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: *** EXCEPTION ***
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]:         exception type std::system_error: btrfs-extent-same: BtrfsExtentSame { .m_fd = 259 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/hp_ux_ia64/p30432118_190000_HPUX-IA64.zip', .logical_offset = 0x2000, .length = 0x8000, .dest_count = 1, .reserved1 = 0, .reserved2 = 0, .info[] = { [0] = btrfs_ioctl_same_extent_info { .fd = 286 '/run/bees/mnt/c955d7b3-5d68-4c09-b9d8-284bc27fd23d/test/mntsrc/prerelease/202111:1/sbpgi/19.0.0.0.0/aix/p30432118_190000_AIX64-5L.zip', .logical_offset = 0x2000, .bytes_deduped = 0x0, .status = -22 (Invalid argument), .reserved = 0 }, } at fs.cc:181: Invalid argument
Jan 14 20:13:21 sbptest beesd[13487]: crawl_5[13528]: ***
farblos commented 2 years ago

I have seen #58 but that has been closed some 3 years ago. Plus I'm running on a "recommended kernel" according to the readme ...

Zygo commented 2 years ago

Can you provide output from btrfs-search-metadata file for both of the filenames mentioned in the BtrfsExtentSame arguments? The clue is probably in there somewhere.

In general, there's a number of reasons why extent-same might return EINVAL, and some of them are relatively rare and require more work to detect than simply trying the extent-same call and ignoring the error. Assuming bees isn't doing something very wrong here, we could simply ignore them, or push them down to a more verbose log level.

farblos commented 2 years ago

Sry, my testbed is gone already - I have migrated to a "competitor". Not because of this issue, just because it fitted my needs better.

Thanks anyway, closing this issue.