g2p / bedup

Btrfs deduplication
http://pypi.python.org/pypi/bedup
GNU General Public License v2.0

You need to run this command as root when kernel is 5.0 #101

Open bartekuk opened 5 years ago

bartekuk commented 5 years ago

Hi

While running sudo bedup dedup --no-crossvol on FC28 (Linux 5.0.5-100.fc28.x86_64), I get the "need to run this command as root" error on every deduplication attempt. Booting a kernel <= 4.20 restores the correct behaviour, and deduplication works fine again.

My only guess is that this indicates some kernel (ioctl?) incompatibility in 5.x.

https://btrfs.wiki.kernel.org/index.php/Changelog offers no explanation (no pull requests past 4.20)

Thanks for a great utility, btw! I will update if I find out anything new.

markfinn commented 5 years ago

Same here on Ubuntu 19.04 beta.

andrwp commented 5 years ago

Same on Arch Linux with 5.0.9 kernel.

veganvelociraptor commented 5 years ago

I can confirm this using the 5.0 and 5.1 kernel. Running bedup using sudo su doesn't help.

deatheibon commented 5 years ago

It's related to the ioctl changes coming with these kernels: https://lkml.org/lkml/2019/1/28/1930

veganvelociraptor commented 5 years ago

> It's related to the ioctl changes coming with these kernels: https://lkml.org/lkml/2019/1/28/1930

That was 4 months ago. Are you saying that's the reason why it's not fixed, and that it's not fixable?

deatheibon commented 5 years ago

It's the reason why it's not working with kernel 5+. I switched back to 4.20 and it's working. I think it can be fixed, but then it may no longer work for kernels below 5.

Zygo commented 4 years ago

I tried to reproduce this on 5.4.42, and got the following:

00.01 Scanned 2214 retained 0
Deduplicating volume /tester/
Deduplicating volume /tester/current
00.67 Size group 1/18 (68821971) sampled 2 hashed 0 freed 0
Traceback (most recent call last):
  File "/home/tester/bin/bedup", line 10, in <module>
    sys.exit(script_main())
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/__main__.py", line 497, in script_main
    sys.exit(main(sys.argv))
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/__main__.py", line 486, in main
    return args.action(args)
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/__main__.py", line 192, in vol_cmd
    dedup_tracked(sess, [vol], tt, defrag=args.defrag)
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/tracking.py", line 405, in dedup_tracked
    dedup_tracked1(ds, comm1)
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/tracking.py", line 573, in dedup_tracked1
    if fd in immutability.fds_in_write_use:
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/dedup.py", line 256, in fds_in_write_use
    self.__require_use_info()
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/dedup.py", line 242, in __require_use_info
    for (fd, use_info) in find_inodes_in_write_use(self.__fds):
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/dedup.py", line 115, in find_inodes_in_write_use
    for (fd, use_info) in find_inodes_in_use(fds):
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/dedup.py", line 161, in find_inodes_in_use
    for proc_path, st_id in st_id_candidates(glob.glob('/proc/[1-9]*/fd/*')):
  File "/home/tester/.local/lib/python3.7/site-packages/bedup/dedup.py", line 145, in st_id_candidates
    st = os.stat(proc_path)
PermissionError: [Errno 13] Permission denied: '/proc/21955/fd/255'

The call stack find_inodes_in_use > find_inodes_in_write_use > ... > fds_in_write_use is trying to see if the inodes are in use, and I dunno, maybe in kernel 5.0 there were some new rules about access to /proc/*/fd/* especially if namespaces are involved (/proc/21955/fd/255 is in a different container)--but the real issue here is that bedup should not be looking in /proc at all.

Modern (i.e. after 2013) dedupe on btrfs does not have to know or care whether files are in use. There is no need to fix any of these functions--just delete them (or at least ignore errors that occur in them), and stop using the CLONE_RANGE ioctl. This also applies to #96.
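For the "at least ignore errors" option, here is a minimal sketch of what a tolerant /proc scan could look like. The function name stat_ids_ignoring_errors is hypothetical and is not bedup's actual st_id_candidates; it only illustrates skipping entries that cannot be stat'ed instead of aborting the whole run.

```python
import glob
import os


def stat_ids_ignoring_errors(paths):
    """Yield (path, (st_dev, st_ino)) for every path that can be stat'ed.

    Hypothetical helper, not part of bedup. Entries that raise OSError
    (PermissionError and FileNotFoundError are both subclasses) are skipped,
    so a single unreadable /proc/<pid>/fd/<n> no longer kills the scan.
    """
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            # The fd may belong to another container, or the process may
            # have exited between glob() and stat(); either way, move on.
            continue
        yield path, (st.st_dev, st.st_ino)


def scan_proc_fds():
    # Same glob pattern as in the traceback above.
    return list(stat_ids_ignoring_errors(glob.glob('/proc/[1-9]*/fd/*')))
```

This only hides the symptom; it does not remove the dependence on /proc.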

If you have root, you can open any file in read-only mode and call BTRFS_IOC_FILE_EXTENT_SAME at any time, without data corruption risk due to concurrent file updates. All bedup branches on GitHub except one still use the CLONE_RANGE ioctl instead. EXTENT_SAME was introduced in kernel 3.13. Maybe it's time to start using it.
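As a rough illustration of the EXTENT_SAME call from Python, the sketch below packs the request by hand and issues it with fcntl.ioctl. It assumes the generic Linux ioctl number encoding (as on x86 and ARM; some architectures differ) and the struct layout from linux/fs.h, where the same ioctl is exposed as FIDEDUPERANGE. The helper names are hypothetical and this is not code from bedup or its wip/dedup-syscall branch.

```python
import fcntl
import struct

# struct file_dedupe_range header: src_offset, src_length, dest_count,
# reserved1, reserved2 (24 bytes), followed by one
# struct file_dedupe_range_info: dest_fd, dest_offset, bytes_deduped,
# status, reserved (32 bytes).
_HEADER = struct.Struct("=QQHHI")
_INFO = struct.Struct("=qQQiI")

# _IOWR(0x94, 54, struct file_dedupe_range), assuming the generic encoding.
# BTRFS_IOC_FILE_EXTENT_SAME uses the same magic, number and struct size.
FIDEDUPERANGE = (3 << 30) | (_HEADER.size << 16) | (0x94 << 8) | 54

FILE_DEDUPE_RANGE_SAME = 0
FILE_DEDUPE_RANGE_DIFFERS = 1


def pack_dedupe_request(src_offset, length, dest_fd, dest_offset):
    """Build the ioctl argument for a single destination range."""
    buf = bytearray(_HEADER.size + _INFO.size)
    _HEADER.pack_into(buf, 0, src_offset, length, 1, 0, 0)
    _INFO.pack_into(buf, _HEADER.size, dest_fd, dest_offset, 0, 0, 0)
    return buf


def unpack_dedupe_result(buf):
    """Return (bytes_deduped, status) as filled in by the kernel."""
    _fd, _off, bytes_deduped, status, _ = _INFO.unpack_from(buf, _HEADER.size)
    return bytes_deduped, status


def dedupe_range(src_fd, src_offset, length, dest_fd, dest_offset):
    """Ask the kernel to share one extent range if the contents match.

    The kernel compares the data itself, so no check for concurrent
    writers is done here. A negative status is a negated errno value.
    """
    buf = pack_dedupe_request(src_offset, length, dest_fd, dest_offset)
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)  # mutates buf in place
    return unpack_dedupe_result(buf)
```

A caller would open both files with os.open(path, os.O_RDONLY), call dedupe_range on block-aligned offsets, and treat a status of FILE_DEDUPE_RANGE_DIFFERS as "contents changed, skip".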

There is a branch wip/dedup-syscall in bedup which does use BTRFS_IOC_FILE_EXTENT_SAME; however, this branch no longer cleanly merges with the master branch.

> https://lkml.org/lkml/2019/1/28/1930

This is a patch that introduces zstd compression level support. I don't see any connection with either bedup or kernel permission checking rules.