It appears to be doing the right thing:
And indeed it's correctly reporting the size of compressed files on ZFS.
@dbramucci - thank you very much for this very detailed issue!! And thanks @Freaky for weighing in.
My understanding was also that the blocks * 512 should solve this. So @dbramucci, do you think this is particular to BTRFS? Could there be another reason for this? Or?
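For reference, a minimal sketch of that approach on Unix via the standard MetadataExt trait (the helper name here is just illustrative):

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Illustrative helper: st_blocks is always counted in 512-byte units,
// independent of the filesystem's block size, so blocks * 512 should give
// the allocated (physical) size.
fn allocated_size(path: &Path) -> io::Result<u64> {
    let meta = fs::metadata(path)?;
    Ok(meta.blocks() * 512)
}
```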
@imsnif It might be particular to BTRFS or extent-based filesystems in general. I'll have to try out NTFS later to test another compression-supporting, block-based filesystem.
Every accurate disk space utility I've seen for BTRFS so far requires sudo to run, which indicates something special goes on with BTRFS. Looking at the manpage for btrfs-filesystem under du (e.g. sudo btrfs filesystem du ~/Downloads) shows that FIEMAP is used to compute the file sizes.
This makes me think (and here, I'm out of my depth) that this has to do with BTRFS being an extent based filesystem and not a block based filesystem.
That is, BTRFS doesn't keep a list of all fixed sized blocks used for a file but rather, uses a list of variable length intervals (called extents).
This seems particularly relevant given that FIEMAP appears to stand for FIle Extent MAP.
Likewise, because BTRFS doesn't use fixed-size blocks, the .blocks API exposed in Rust must be some form of leaky abstraction.
Unfortunately, I don't understand more about what BTRFS does when forced to describe itself in terms of blocks instead of extents.
https://btrfs.wiki.kernel.org/index.php/Compression#Why_does_not_du_report_the_compressed_size.3F
Why does not du report the compressed size?
Traditionally the UNIX/Linux filesystems did not support compression and there was no item in stat data structure allocated for a similar purpose. There's the file size, that denotes nominal file size independent of the actually allocated size on-disk. For that purpose, the stat.st_blocks item contains a value that corresponds to the number of blocks allocated, i.e. in case of sparse files. However, when a compression is involved, the actually allocated size may be smaller than nominal, although the file is not sparse.
There are utilities that determine sparseness of a file by comparing the nominal and block-allocated size, this behaviour might cause bugs if st_blocks contained the amount after compression.
Another issue with backward compatibility is that up to now st_blocks always contains the uncompressed number of blocks. It's unclear what would happen if there are files with mixed types of the value. The proposed solution is to add another special call for that (via ioctl), but this may be not the ideal solution.
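To make the heuristic the wiki describes concrete, here is a rough sketch (assuming Unix and the standard MetadataExt trait): a file looks sparse when its block-allocated size is smaller than its nominal size, which is exactly the signal compression would also produce if st_blocks reflected the compressed block count.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Rough sketch of the sparseness heuristic described above: compare the
// nominal size (st_size) with the block-allocated size (st_blocks * 512).
// If st_blocks reported post-compression blocks, compressed files would be
// indistinguishable from sparse ones under this check.
fn looks_sparse(path: &Path) -> io::Result<bool> {
    let meta = fs::metadata(path)?;
    Ok(meta.blocks() * 512 < meta.size())
}
```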
There is a fiemap crate; in principle I could tie it into filesize, but it would be behind an off-by-default feature flag, because it's both complex and looks eyewateringly slow.
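For a sense of what that might look like, here is a sketch using the fiemap crate; the extent field names (fe_physical, fe_length, fe_flags) are assumed to mirror the kernel's struct fiemap_extent, so treat this as an illustration under that assumption rather than a confirmed API:

```rust
use std::io;
use std::path::Path;

// Sketch only: assumes the fiemap crate yields an iterator of extents whose
// fields mirror the kernel's struct fiemap_extent.
fn dump_extents(path: &Path) -> io::Result<()> {
    let mut total = 0u64;
    for extent in fiemap::fiemap(path)? {
        let extent = extent?;
        // On BTRFS, compressed extents carry the ENCODED flag.
        println!(
            "physical {:#x}, length {} bytes, flags {:?}",
            extent.fe_physical, extent.fe_length, extent.fe_flags
        );
        total += extent.fe_length;
    }
    println!("total extent length: {} bytes", total);
    Ok(())
}
```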
Added as issue #1.
@dbramucci - it seems to me the discussion brought us to the issue @Freaky opened in filesize. I think we can close this and address it upstream... or am I forgetting/missing something?
Seems good to me. The only remaining question is whether a UI for virtual vs physical file sizes would be supported, but I think that can be its own issue and should be guided by some user stories.
For sure. It sounds like an interesting feature, would be happy to look further into it when we know more and/or feel the need.
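For context, querying physical vs nominal size through the filesize crate could look roughly like this (assuming its file_real_size helper, which reads the block-allocated size on Unix and the compressed file size on Windows):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Sketch of comparing nominal vs. physical size via the filesize crate.
fn report(path: &Path) -> io::Result<()> {
    let nominal = fs::metadata(path)?.len();
    let physical = filesize::file_real_size(path)?;
    println!(
        "{}: {} bytes nominal, {} bytes on disk",
        path.display(),
        nominal,
        physical
    );
    Ok(())
}
```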
When running diskonaut on a BTRFS filesystem with compression enabled, it shows the uncompressed space used by folders and files, not the actual disk-space used.
One folder of mine uses 69.5G of storage, but if I delete it I would not regain 69.5G worth of disk space because that folder is compressed; instead I would only regain 50G, which represents the actual space used on the disk.
The command sudo compsize /path/to/folder was able to identify the post-compression space used.

Rationale for feature:
If I am using this tool, I am likely trying to free space so that I may allocate a new file.
Suppose I want to download a 4GiB iso image. If I have a 4.5GiB zip archive and a 5GiB text file, diskonaut would make it appear that deleting the text file would let me download the iso with a GiB to spare. Unfortunately, with compression enabled, deleting the incompressible zip archive would still free up 4.5GiB, while deleting the highly compressible text file may only free 900MiB. At that point I would download the iso, run out of space, and then have to reopen diskonaut to free ? more GiB (and hope that compression doesn't cause more trouble).

Design Questions
How should the compressed vs uncompressed space be represented in the UI?
The uncompressed usage may still be useful if I plan on copying my files to a location without compression.
Should the on-disk size be computed lazily, so that something like 69.5G??? would be displayed above until the slower, more accurate run occurs?
I used 69.5G??? earlier, but maybe <69.5G> or some other formatting makes more sense.

Relevant Filesystems
This Wikipedia table of filesystem capabilities shows that the following support compression.