Closed asyncth closed 2 years ago
compsize reports the count of extents, be they contiguous on the disk or not.
The data returned by BTRFS_IOC_TREE_SEARCH_V2 includes information about how the extents are placed inside btrfs' linear address space, so it would be possible to count and display that. The linear address space doesn't reflect the physical placement even on a single disk, but as block group sizes are in the gigabytes, the inaccuracy is OK for most purposes.
Hmm, the data doesn't include enough information for inline extents. They tend to be the only extent in a file, though.
If you want frag count to be added, please tell me:
Inline extents are always separate fragments. They are stored in metadata blocks so they can never be contiguous with any data block. It is possible for a file to have an inline extent and regular extents, but there must be a hole in between of at least one byte (if there isn't, then it gets written as a normal extent).
Why would a file in one piece count as 2 fragments?
Duh, I meant whether a file in two pieces counts as 1 or 2. I.e., a count of discontiguous blocks vs. a measure of suboptimal fragmentation only.
(that's mostly asking what colour to paint a bike shed, but I can't decide 😉)
OK, two pieces is a more sensible question ;)
[edit to match the behavior of filefrag discovered below]
Two extent references A and B are contiguous if the logical end of A is the same as the logical start of B, and the physical end of A is the same as the physical start of B. If there's a hole between A and B, then use the hole's logical end instead of A's logical end (i.e. skip over the hole without breaking contiguity).
So these are discontiguous:
That would most closely match filefrag behavior, except it would account for the compressed and uncompressed length of a compressed extent when calculating contiguity.
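The contiguity rule above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not compsize's or filefrag's code: `ExtentRef` and `count_fragments` are made-up names, all offsets are in bytes, refs are assumed not to overlap logically, and the compressed vs. uncompressed length question is ignored. Since any logical gap between two refs is a hole, and holes are skipped, only physical adjacency needs to be checked.

```python
from typing import NamedTuple


class ExtentRef(NamedTuple):
    """Hypothetical record for one extent reference of a file."""
    logical: int   # byte offset within the file
    physical: int  # byte offset in btrfs' linear address space
    length: int    # number of bytes referenced


def count_fragments(refs):
    """Count runs of physically contiguous refs, walking in logical order.

    A gap in logical offsets is a hole and does not break contiguity,
    so a new fragment starts only when the physical start of a ref
    differs from the physical end of the previous one.
    """
    fragments = 0
    prev_physical_end = None
    for ref in sorted(refs, key=lambda r: r.logical):
        if ref.physical != prev_physical_end:
            fragments += 1
        prev_physical_end = ref.physical + ref.length
    return fragments


# Two adjacent refs followed by one placed elsewhere: 2 fragments.
adjacent_then_far = [
    ExtentRef(0, 1000, 4096),
    ExtentRef(4096, 5096, 4096),
    ExtentRef(8192, 20000, 4096),
]
# A hole in the file, but physically adjacent data: 1 fragment.
across_hole = [
    ExtentRef(0, 1000, 4096),
    ExtentRef(409600, 5096, 4096),
]
print(count_fragments(adjacent_then_far), count_fragments(across_hole))
```

Inline extents are not modelled here; per the discussion above they would always count as separate fragments.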
There are some corner cases, like what if extent B is overwriting part of extent A in the middle? That could look like:
In that case I'd count it as 3 fragments (and so does filefrag) since a naive sequential read would seek 3 times as it reads the first part of A, all of extent B, then the second part of A. Note that compsize would count that as 2 extents and 3 refs.
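To make the three numbers concrete, here is a small Python sketch of that overwrite case. The layout is hypothetical (a 12 KiB extent A whose middle 4 KiB was overwritten by extent B, with invented addresses), and `summarize` is an illustrative helper, not anything compsize provides.

```python
# Each ref: (logical offset, extent id, extent physical start,
#            offset into the extent, length). All values are made up.
refs = [
    (0,    "A", 1_000_000, 0,    4096),  # first part of A
    (4096, "B", 2_000_000, 0,    4096),  # B overwrites the middle of A
    (8192, "A", 1_000_000, 8192, 4096),  # second part of A
]


def summarize(refs):
    """Return the extent, ref and fragment counts for a list of refs."""
    extents = {extent_id for _, extent_id, _, _, _ in refs}
    fragments = 0
    prev_end = None
    for _, _, extent_start, offset, length in sorted(refs):
        start = extent_start + offset  # where this ref's data begins
        if start != prev_end:          # a sequential reader has to seek
            fragments += 1
        prev_end = start + length
    return {"extents": len(extents), "refs": len(refs), "fragments": fragments}


print(summarize(refs))  # {'extents': 2, 'refs': 3, 'fragments': 3}
```

The second part of A is not contiguous with B, and it is not contiguous with the first part of A either, because the overwritten 4 KiB in the middle of A is skipped.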
I'd count a hole as contiguous — anyone reading the file would request extent B immediately after A, as far as disk head movements go.
true, but filefrag doesn't count them that way.
welp, I'm wrong about that...
# > test ; dd bs=4k seek=10 count=8 if=/boot/vmlinuz conv=notrunc of=test; dd bs=4k seek=100 count=8 if=/boot/vmlinuz conv=notrunc of=test; sync -f .; filefrag -v test
8+0 records in
8+0 records out
32768 bytes (33 kB, 32 KiB) copied, 0.000127871 s, 256 MB/s
8+0 records in
8+0 records out
32768 bytes (33 kB, 32 KiB) copied, 7.809e-05 s, 420 MB/s
Filesystem type is: 9123683e
File size of test is 442368 (108 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 10.. 17: 1924070403..1924070410: 8: 10:
1: 100.. 107: 1924070411..1924070418: 8: last,eof
test: 1 extent found
The summary line counts the physical blocks on both sides of the hole as one extent, but the -v output lists two separate extent records.
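For what it's worth, the same merging can be reproduced from the -v rows. The Python sketch below only mirrors the output observed above; it is not taken from filefrag's source, and `parse_rows` and `merged_extent_count` are made-up names. It assumes the physical ranges are inclusive block numbers, as printed.

```python
import re

# Matches data rows such as "0: 10.. 17: 1924070403..1924070410: 8: 10:"
ROW = re.compile(r"^\s*\d+:\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):")


def parse_rows(text):
    """Extract (logical_start, physical_start, physical_end) per extent row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            logical_start, _, physical_start, physical_end, _ = map(int, m.groups())
            rows.append((logical_start, physical_start, physical_end))
    return rows


def merged_extent_count(rows):
    """Count rows after merging those whose physical blocks are adjacent."""
    count = 0
    prev_physical_end = None
    for _, physical_start, physical_end in sorted(rows):
        # Ranges are inclusive, so adjacency means start == previous end + 1.
        if prev_physical_end is None or physical_start != prev_physical_end + 1:
            count += 1
        prev_physical_end = physical_end
    return count


sample = """\
ext: logical_offset: physical_offset: length: expected: flags:
0: 10.. 17: 1924070403..1924070410: 8: 10:
1: 100.. 107: 1924070411..1924070418: 8: last,eof
"""
rows = parse_rows(sample)
print(len(rows), merged_extent_count(rows))  # 2 1
```

Two records, one merged extent, which agrees with the "1 extent found" line.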
If you want frag count to be added
Not really, just wanted to confirm whether compsize can be used to tell a compressed file's fragmentation. Definitely wouldn't mind it though.
Ran the latest commit on a freshly defragmented file; the number of fragments seems to match the number of extents. Does this mean that they're not contiguous?
In this particular case, it only means that I'm an idiot :/ Please pull for a less buggy version.
It's still an initial stab that doesn't understand partial extents.
Pulled, it works, but shows that a file has about 7x more fragments on SSD than a file with the exact same contents on HDD, even after defragging both of them (the file was copied from HDD to SSD). Probably not a bug, but some kind of behavior that Btrfs has?
I guess this can be closed now?
There is an entry in the FAQ list of Fedora's Btrfs compression initiative which says that when using filefrag on a compressed file, some of the reported extents may in reality be contiguous on the disk, which means that filefrag is not a reliable way to tell a file's fragmentation. I've noticed that compsize also reports far fewer extents for uncompressed files than for compressed files, which means that compsize is likely also affected by the issue above, but I just wanted to confirm anyway.