g2p / bedup

Btrfs deduplication
http://pypi.python.org/pypi/bedup
GNU General Public License v2.0

IOError: [Errno 22] Invalid argument #61

Closed tobiasstein closed 9 years ago

tobiasstein commented 9 years ago

bedup crashes with:

root@infinitas /.snapshots # python -m bedup dedup --defrag /media/btrfs_space
Skipped 188 frozen volumes in filesystem <btrfs_space>
Not scanning /media/btrfs_space, generation is still 27007
Not scanning /media/btrfs_space/steam, generation is still 27007
Not scanning /media/btrfs_space/libvirt, generation is still 27000
Not scanning /media/btrfs_space/libvirt/.snapshots, generation is still 27003
Not scanning /media/btrfs_space/srv, generation is still 27007
Not scanning /media/btrfs_space/srv/.snapshots, generation is still 27003
Scanning volume /media/btrfs_space/home generations from 27022 to 27023, with size cutoff 8388608
00.00 Scanned  retained 0/usr/local/lib/python2.7/dist-packages/cffi/vengine_cpy.py:188: UserWarning: reimporting '_cffi__xaa446cfbxe94aa841' might overwrite older definitions
  % (self.verifier.get_module_name()))
00.02 Scanned 5212 retained 0
Not scanning /media/btrfs_space/home/.snapshots, generation is still 27003
Not scanning /media/btrfs_space/log, generation is still 27015
Not scanning /media/btrfs_space/log/.snapshots, generation is still 27003
Deduplicating filesystem <btrfs_space>
51.79 Size group 44/1367 sampled 105 hashed 25 freed 0
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/bedup/__main__.py", line 487, in <module>
    script_main()
  File "/usr/local/lib/python2.7/dist-packages/bedup/__main__.py", line 483, in script_main
    sys.exit(main(sys.argv))
  File "/usr/local/lib/python2.7/dist-packages/bedup/__main__.py", line 472, in main
    return args.action(args)
  File "/usr/local/lib/python2.7/dist-packages/bedup/__main__.py", line 198, in vol_cmd
    dedup_tracked(sess, volset, tt, defrag=args.defrag)
  File "/usr/local/lib/python2.7/dist-packages/bedup/tracking.py", line 394, in dedup_tracked
    dedup_tracked1(ds, ofile_reserved, query)
  File "/usr/local/lib/python2.7/dist-packages/bedup/tracking.py", line 585, in dedup_tracked1
    dedup_fileset(ds, fileset, fd_names, fd_inodes, size)
  File "/usr/local/lib/python2.7/dist-packages/bedup/tracking.py", line 608, in dedup_fileset
    if clone_data(dest=dfd, src=sfd, check_first=True):
  File "/usr/local/lib/python2.7/dist-packages/bedup/platform/btrfs.py", line 601, in clone_data
    ioctl_pybug(dest, lib.BTRFS_IOC_CLONE, src)
  File "/usr/local/lib/python2.7/dist-packages/bedup/platform/btrfs.py", line 360, in ioctl_pybug
    return fcntl.ioctl(fd, ioc, arg)
IOError: [Errno 22] Invalid argument
python -m bedup dedup --defrag /media/btrfs_space  48,46s user 3,05s system 97% cpu 52,957 total

olifre commented 9 years ago

Looks similar to #15. Apparently this can also happen if one of the files passed to the clone ioctl (BTRFS_IOC_CLONE in the traceback) is marked as nodatacow (lsattr shows C). I think bedup should skip such files during deduplication instead of erroring out. Could this be fixed, @g2p?
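To make the "skip instead of erroring out" idea concrete, here is a minimal sketch in Python 3. It is not bedup's actual code: clone_or_skip and its parameters are hypothetical names, and clone stands in for whatever callable issues the clone ioctl. EINVAL can have causes other than nodatacow, so treating it as "skip this pair" is an assumption.

```python
import errno


def clone_or_skip(clone, dest, src, skipped):
    """Call clone(dest, src); on EINVAL record the pair and carry on.

    `clone` is a hypothetical stand-in for the function that issues the
    clone ioctl. Any error other than EINVAL is re-raised unchanged.
    """
    try:
        clone(dest, src)
        return True
    except OSError as exc:  # IOError is an alias of OSError in Python 3
        if exc.errno != errno.EINVAL:
            raise
        # Assumption: EINVAL means this pair cannot be cloned (for example
        # because one file is nodatacow), so remember it and move on.
        skipped.append((dest, src))
        return False
```

The caller would then report the skipped pairs at the end of the run rather than aborting the whole deduplication pass.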

You could try: strace -e openat bedup dedup |& grep -v AT_FDCWD to see which files are causing the problem, and then run lsattr on them to confirm my theory.
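The lsattr check can also be done from Python 3 by reading the inode flags with the FS_IOC_GETFLAGS ioctl and testing the FS_NOCOW_FL bit (the C attribute). This is only a sketch: the helper names are made up for illustration, and the ioctl number is computed with the encoding used on x86 and ARM, which does not hold on every architecture.

```python
import fcntl
import os
import struct

FS_NOCOW_FL = 0x00800000  # from <linux/fs.h>; lsattr displays it as 'C'


def _ior(type_char, nr, size):
    # _IOR() encoding as used on x86/ARM (assumption: not valid everywhere).
    return (2 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Declared in the kernel headers as _IOR('f', 1, long).
FS_IOC_GETFLAGS = _ior('f', 1, struct.calcsize('l'))


def has_nocow(flags):
    """Return True if the inode flag word has the nodatacow bit set."""
    return bool(flags & FS_NOCOW_FL)


def get_inode_flags(path):
    """Return the inode flags of path, or None if they cannot be read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(struct.calcsize('l'))
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)  # fills buf in place
        # The kernel stores the flags as a 32-bit int at the buffer start.
        return struct.unpack_from('I', buf)[0]
    except OSError:
        # Filesystem does not support the ioctl.
        return None
    finally:
        os.close(fd)


def is_nodatacow(path):
    """Hypothetical helper: True only if the flags are readable and NOCOW is set."""
    flags = get_inode_flags(path)
    return flags is not None and has_nocow(flags)
```

Running is_nodatacow on each path reported by strace should agree with what lsattr shows for that file.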

For me this happens on many machines since I have /tmp as nodatacow to make file deletion there a bit faster.