unbrice / shake

Shake is a defragmenter that runs in userspace

Provide better filesystem hinting #6

Closed kakra closed 7 years ago

kakra commented 7 years ago

The hinting currently doesn't handle the backup and restore phases differently, which can be improved.

This patch tells the kernel that the data read in the copy function won't be reused, so it can be discarded from the page cache once it has been read.

Next, it tells the kernel that both the accused file and the tmp file will be needed for reading. The first call to WILLNEED should be moved further up the call chain to also cover reading the allocation map, but this would involve tracking DONTNEED differently, so it is not implemented here.

In the end, the page caches will be discarded explicitly by telling the kernel that the data is no longer needed.
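The sequence of hints described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the helper names `copy_read_once`, `hint_will_read` and `hint_done` are hypothetical, and calling `fdatasync` before DONTNEED is my assumption, based on the Linux behavior that only clean pages are dropped.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper names; the real functions in executive.c differ. */

/* Backup phase: copy in_fd to out_fd, hinting that the source is read once. */
int copy_read_once(int in_fd, int out_fd)
{
    char buf[64 * 1024];
    ssize_t got;

    /* Offset 0 with len 0 covers the whole file. The return value is
     * ignored on purpose: the hint is advisory and the copy works without it. */
    (void)posix_fadvise(in_fd, 0, 0, POSIX_FADV_NOREUSE);

    while ((got = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < got) {            /* handle short writes */
            ssize_t put = write(out_fd, buf + done, (size_t)(got - done));
            if (put < 0)
                return -1;
            done += put;
        }
    }
    return got < 0 ? -1 : 0;
}

/* Restore phase: both files are about to be read again. */
void hint_will_read(int accused_fd, int tmp_fd)
{
    (void)posix_fadvise(accused_fd, 0, 0, POSIX_FADV_WILLNEED);
    (void)posix_fadvise(tmp_fd, 0, 0, POSIX_FADV_WILLNEED);
}

/* Cleanup: flush dirty pages first (assumption: DONTNEED leaves unwritten
 * pages alone on Linux), then say the cached data is no longer needed. */
void hint_done(int fd)
{
    (void)fdatasync(fd);
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

All three hints are advisory, so a platform that ignores one of them (as may be the case for NOREUSE) still produces a correct copy.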

Testing shows that the active page cache is now released after each file operation and that system responsiveness improves during shaking.

This is also part of my integration branch.



kakra commented 7 years ago

This also introduces an important patch to preallocate file space so we don't run out of space during rewrite.
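One way to preallocate is to reserve the full size of the original file in the temporary file before writing any data. The sketch below assumes `posix_fallocate` is used for this, which the comment does not confirm. The helper name `open_reserved_tmp` and its `tmp_path` parameter are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create tmp_path and reserve room for a full copy of
 * src_fd before any data is written. Returns the new fd, or -1 with errno
 * set (ENOSPC when the filesystem cannot hold the copy). */
int open_reserved_tmp(int src_fd, const char *tmp_path)
{
    struct stat st;
    int tmp_fd, err;

    if (fstat(src_fd, &st) != 0)
        return -1;

    tmp_fd = open(tmp_path, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (tmp_fd < 0)
        return -1;

    /* posix_fallocate rejects a zero length, so skip empty files.
     * It returns the error number directly instead of setting errno. */
    err = st.st_size > 0 ? posix_fallocate(tmp_fd, 0, st.st_size) : 0;
    if (err != 0) {
        close(tmp_fd);
        unlink(tmp_path);   /* do not leave a half-reserved file behind */
        errno = err;
        return -1;
    }
    return tmp_fd;
}
```

Failing early with ENOSPC, before the rewrite starts, means the original file is never touched when there is not enough room for its copy.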

unbrice commented 7 years ago
:lgtm:

Reviewed 1 of 1 files at r1. Review status: all files reviewed at latest revision, 1 unresolved discussion.



unbrice commented 7 years ago
:lgtm:

Reviewed 1 of 1 files at r2. Review status: all files reviewed at latest revision, 1 unresolved discussion.


executive.c, line 54 at r1 (raw file):

Previously, kakra (Kai Krakow) wrote…
If NOREUSE is ignored, we can just zap it. The man page didn't point that out. But you are using the POSIX-style API, not the Linux-native API, so it may have an effect on other platforms.

I pointed out why I moved WILLNEED to a different place. In particular, it makes more sense in that location once I introduce reflinking later.

But digging through your code was a bit messy at times: the structure didn't always make sense to me and the indentation was pretty, well, exotic... ;-) When I did the patch series, I rebased the commits and edited them later to split out preparation patches and to separate easy patches from experimental ones. It was a mess at times. Not all of my considerations may hold true later.

I was already thinking about giving the whole thing a different structure (see the RFE about splitting defrag and judging) and turning it into more of a producer/consumer algorithm that can work in parallel, e.g. on different partitions, applying different defrag implementations along the way.

I absolutely love your judging considerations, but I feel very uncomfortable about how defragging is handled: it is very fragile because of the weak locking implementation. I had a lot of headaches with that during my tests, especially when a real-time full-text indexer kicks in (like baloo in KDE): even reading the file breaks the lock after a few seconds.

Thanks for taking the time to tell me about your greater plan.

