unbrice / shake

Shake is a defragmenter that runs in userspace
Other
42 stars 5 forks source link

Improve performance and add reflink support #14

Closed kakra closed 2 years ago

kakra commented 7 years ago

This branch adds a few cleanups first to fix indentation and an issue reported by the compiler. It then adds a series of patches which try to improve performance.

The most important patch is support for the FICLONE ioctl which yields a performance boost of 50% on filesystems supporting it. No thorough testing has been done, it may eat your kitten. Please do your own checks. I tested this against some big files multiple times comparing md5sums before and after and it reported no data damage, so it seems to work.

This builds upon my integration branch.


This change is Reviewable

kakra commented 7 years ago

This should be rebased when FIEMAP support and/or fs hinting was merged... Leaving as-is until @unbrice decided how to do the fs hinting.

unbrice commented 7 years ago

Reviewed 5 of 5 files at r1. Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


executive.c, line 49 at r2 (raw file):

  if (-1 == lseek (in_fd, (off_t) 0, SEEK_SET)
      || -1 == lseek (out_fd, (off_t) 0, SEEK_SET)
      || -1 == ftruncate (out_fd, (off_t) 0))

I suggest you leave this: cost is low but it reduces the preconditions for fcopy() (otherwise callee has to know that out_fd is supposed to be of size 0).


executive.c, line 232 at r2 (raw file):

shake_reg_backup_phase (struct accused *a, struct law *l)
{
  static int can_reflink = 1;

I suggest not to cache things: many people have multiple filesytems and won't be able to take advantage of your work. Calling the ioctl to get -1 shouldn't be so expensive.


executive.c, line 235 at r2 (raw file):

  /* truncate the backup file first, also required for FICLONE to be effective
   */
  int res = ftruncate (l->tmpfd, 0);

Can you add a TODO to use openat(TMP_FILE) instead?


executive.c, line 242 at r2 (raw file):

       * performance
       */
      if (can_reflink && (0 == ioctl (l->tmpfd, FICLONE, a->fd)))

Could you move that FICLONE logic in a separate function in linux.c?


executive.c, line 255 at r2 (raw file):

          /* give the filesystem a relief
           */
          fdatasync (l->tmpfd);

Are you sure this is useful?


executive.c, line 268 at r2 (raw file):

              can_reflink = 0;
              error (0, errno,
                     "%s: FICLONE failed, falling back to normal copy",

I don't think we should show anything to the user, as they are not really in a position to act on it anyways.


executive.c, line 328 at r2 (raw file):

    error (1, errno, "%s: restore failed ! file have been saved at %s",
           a->name, l->tmpname);
  posix_fadvise (a->fd, (off_t) 0, (off_t) 0, POSIX_FADV_DONTNEED);

This was merged already, does the push need to be rebased?


judge.c, line 259 at r2 (raw file):

              }
            else if (z && z->start && labs (z->atime - y->atime) < MAGICTIME)
              y->ideal = (z->start - y->size - MAGICLEAP);

Ouch. Thanks.


linux.c, line 234 at r2 (raw file):

  extmap->fm_length = a->size;
  extmap->fm_flags = FIEMAP_FLAG_SYNC;
  if (-1 != ioctl (a->fd, FS_IOC_FIEMAP, extmap))

I suggest making a function like count_mapped_extents_for_fiemap() that performs the first call with a first struct fiemap on the stack.


linux.c, line 365 at r2 (raw file):

      }
  }
out:

I suggest you create two helper functions, one for fiemap, one for fibmap rather than one huge one with gotos.


linux.h, line 33 at r2 (raw file):


# ifndef FICLONE
#  define FICLONE _IOW(0x94, 9, int) // linux/fs.h since kernel 4.5

AFAIK, this isn't needed in other modules, wouldn't it be better in the .c file?


Comments from Reviewable

kakra commented 7 years ago

I'd like to rework and merge this last, after we resolved some of the other open issues. I'll take some of your suggestions to prepare a few separate PRs and rework some of the open PRs.

kakra commented 6 years ago

Review status: all files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


executive.c, line 255 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
Are you sure this is useful?

Adding this improves general file system performance while shake is stressing the system. At least on btrfs this feels more responsive. Of course it could be tested tho I'm not sure. Judging file system responsiveness is always subjective. In a non-interactive system it probably reduces speed.


Comments from Reviewable

kakra commented 6 years ago

Review status: all files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


executive.c, line 49 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
I suggest you leave this: cost is low but it reduces the preconditions for fcopy() (otherwise callee has to know that out_fd is supposed to be of size 0).

I'll refactor it. I had moved it to the callee to have a single point of ftruncate() in preparation of further file copy/clone operations.


executive.c, line 232 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
I suggest not to cache things: many people have multiple filesytems and won't be able to take advantage of your work. Calling the ioctl to get -1 shouldn't be so expensive.

Going to remove it.


executive.c, line 235 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
Can you add a TODO to use openat(TMP_FILE) instead?

Will do.


executive.c, line 242 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
Could you move that FICLONE logic in a separate function in linux.c?

Will do, it'll then more resemble what fcopy() does which makes sense.


executive.c, line 268 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
I don't think we should show anything to the user, as they are not really in a position to act on it anyways.

Going to remove it.


executive.c, line 328 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
This was merged already, does the push need to be rebased?

I'll do a rebase after all discussions are resolved.


Comments from Reviewable

kakra commented 6 years ago

Review status: 3 of 5 files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


judge.c, line 259 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
Ouch. Thanks.

I don't see the difference here...


linux.h, line 33 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
AFAIK, this isn't needed in other modules, wouldn't it be better in the .c file?

I'm moving it over to executive.c, it's the only user.


Comments from Reviewable

kakra commented 6 years ago

linux.c, line 234 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
I suggest making a function like `count_mapped_extents_for_fiemap()` that performs the first call with a first `struct fiemap` on the stack.

I seem to not get the purpose of your request: This is how this IOCTL works: You call it twice. First call gets the number of extent maps. If you don't want to know more, you're done. If you need the extent maps (and we do), you realloc, tell it how many extent maps you want and run IOCTL again. The counting is thus the first invocation of the IOCTL already. The rest is just walking the allocation mapping on disk.

Of course I could walk the extents one by one and request only one map per loop but this doesn't gain much except possibly a very high number of IOCTL calls. Also, the extents may change while we scan it that way. I don't think that memory is an issue here, even with thousands of extents, the allocation should be comparably small.

The loop would be much smaller if I didn't try to find adjacent extents but due to how btrfs organizes compression, this is needed. Btrfs breaks compressed files into extents of 128k. Big compressed files usually end up in number of extents in the 100k's although disk allocation is actually continuous. If this were different, we would only do one single IOCTL and have the number of "fragments" which is all we need to know. But well, btrfs compression... As a side effect I skip some types of extents which I don't want to count for fragment calculation, i.e. reflinked extents (we don't want to purposely break reflinks) or inlined/tailed extents (makes no sense to defragment).

Maybe I should instead add some comments to explain the above circumstances?


Comments from Reviewable

kakra commented 6 years ago

linux.c, line 365 at r2 (raw file):

Previously, unbrice (Brice Arnould) wrote…
I suggest you create two helper functions, one for fiemap, one for fibmap rather than one huge one with gotos.

That's a good idea but I defer it until later when I have a little more time.


Comments from Reviewable

unbrice commented 2 years ago

Hey :) I'm doing a spring cleanup of old pull requests, please do reopen if you're later interested in resolving the open threads.