nachoparker / btrfs-du

Easily print BTRFS subvolume/snapshot disk usage
GNU General Public License v3.0

Explain why Total exclusive data is less than Total size of data? #20

Open MurzNN opened 3 years ago

MurzNN commented 3 years ago

Here is a quote from the readme:

# btrfs-du /home

Subvolume                                                         Total  Exclusive  ID
─────────────────────────────────────────────────────────────────────────────────────────
.snapshot2017-08-09                                            75.77GiB  105.95MiB  343
.snapshot2017-08-10                                            75.78GiB  103.46MiB  346
.snapshot2017-08-11                                            75.78GiB  514.47MiB  347
.snapshot2017-08-23                                            76.07GiB  568.11MiB  348
.snapshot2017-09-05                                            76.66GiB  648.50MiB  349
.snapshot2017-10-11                                            63.51GiB    2.71GiB  391
.snapshot2017-11-13                                            62.78GiB    1.09GiB  392
.snapshot2017-11-29                                            63.40GiB  974.05MiB  410
.snapshot2017-12-11                                            64.21GiB  682.08MiB  455
─────────────────────────────────────────────────────────────────────────────────────────
Total exclusive data                                                            7.32GiB

Can you please explain why "Total exclusive data" is so small, less than the "Total" of any snapshot?

As I understand it, "Exclusive" shows the amount of unique data that is stored only in the current snapshot, yes?

So at least one of the snapshots must hold the unique copy of the files that the other snapshots can "reuse" as long as they are unchanged.

As a result, the "Total exclusive data" sum can't be less than the minimum "Total" value across all snapshots. But in the example (and on my system) it is less.

Can you please explain why this happens, and how to calculate the real stored size of a specific subvolume? Thanks!

MurzNN commented 3 years ago

An example of the problem from my real server; here is the output of btrfs-du:

# btrfs-du /mnt/btrfs
Subvolume                                                         Total  Exclusive  ID        
─────────────────────────────────────────────────────────────────────────────────────────
brick-files                                                     9.73GiB  448.00KiB  257       
brick-mysql                                                    10.75GiB   48.00KiB  258       
brick-mysql/.snapshots                                         16.00KiB   16.00KiB  302       
brick-mysql/.snapshots/1/snapshot                              10.61GiB   16.00KiB  303       
brick-mysql/.snapshots/2/snapshot                              10.61GiB   16.00KiB  304       
brick-files/.snapshots                                         16.00KiB   16.00KiB  305       
brick-files/.snapshots/1/snapshot                               9.72GiB  608.00KiB  312       
brick-mysql/.snapshots/5/snapshot                              10.75GiB   48.00KiB  336       
brick-files/.snapshots/3/snapshot                               9.73GiB  176.00KiB  337       
─────────────────────────────────────────────────────────────────────────────────────────
Total exclusive data                                                          528.00KiB

So, according to this report, the total used space on my /mnt/btrfs drive (with two subvolumes) should be about 20 GiB, calculated via the formula averageTotalSize(all brick-mysql subvolumes) + averageTotalSize(all brick-files subvolumes) + Total exclusive data: 10.68 GiB + 9.73 GiB + 0.0005 GiB ≈ 20.41 GiB

But the df command shows me:

# df -h /mnt/btrfs
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda2       85G   32G   52G  39% /mnt/btrfs

So, the main question is: what is eating the additional 12 GiB (32 - 20) of used space on this btrfs volume?
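
For reference, btrfs's own allocation report is another way to look at this; it breaks the used space down into data, metadata and system chunks (not per subvolume). A sketch, assuming the same mount point:

btrfs filesystem df /mnt/btrfs      # chunk-level summary: allocated vs. used per chunk type
btrfs filesystem usage /mnt/btrfs   # more detailed breakdown, including unallocated device space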

mritzmann commented 3 years ago

Can you please explain why "Total exclusive data" is so small, less than the "Total" of any snapshot?

As I understood it, Total exclusive data is the sum of the Exclusive column.

105.95 + 103.46 + 514.47 + 568.11 + 648.50 + 2775.04 + 1116.16 + 974.05 + 682.08 ≈ 7488 MiB ≈ 7.3 GiB, which matches the reported 7.32 GiB (the displayed values are rounded)
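
If you want to cross-check that sum yourself, here is a minimal sketch (not something btrfs-du prints) that adds up the exclusive bytes straight from the qgroup data; it assumes quotas are enabled on the filesystem and that the level-0 qgroup IDs have the usual 0/NNN form:

# quotas must be enabled first: btrfs quota enable /home
sudo btrfs qgroup show --raw /home |
  awk '$1 ~ /^0\// { sum += $3 }   # with --raw, the 3rd column is exclusive bytes
       END { printf "total exclusive: %.2f GiB\n", sum / 1024 / 1024 / 1024 }'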

what is eating the additional 12 GiB (32 - 20) of used space on this btrfs volume?

A file that is included in more than one snapshot is not counted as Exclusive, and it is not shown anywhere else by the btrfs quota function. That's why your question cannot be answered with BTRFS's own tools. As an example to reproduce this problem:

df -Th /                                    # note the current usage
fallocate -l 10G 10g.img                    # create a 10 GiB file
btrfs subvolume snapshot -r / /snapshot/1   # first read-only snapshot referencing the file
btrfs subvolume snapshot -r / /snapshot/2   # second read-only snapshot referencing the same file
rm 10g.img                                  # delete the file from the live subvolume
df -Th /                                    # usage is still ~10 GiB higher than at the start

Explanation of my example: after the rm, the 10 GiB file is still referenced by both snapshots, so df still reports roughly 10 GiB more used space than at the start. But because the file is shared between the two snapshots, it is exclusive to neither of them, so it never appears in any Exclusive value.

That is also the reason why I find the total of btrfs-du a bit misleading. The Exclusive value per snapshot shows how much storage space would be freed up if you deleted that one snapshot; in that sense it makes sense. But you can't deduce the actual space consumed by all btrfs subvolumes from the total. And as I said: this is not possible with BTRFS's own tools, so btrfs-du can't change anything about it.

But maybe something like this will help you? https://github.com/CyberShadow/btdu (instead of using btrfs qgroup, the tool has to examine the data itself, so it can take a while until the result becomes meaningful)
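
A rough sketch of how it can be run (the device and mount point are placeholders; as far as I know btdu should be pointed at the toplevel subvolume so it can see every snapshot):

mount -o subvolid=5 /dev/vda2 /mnt/btrfs-top   # mount the toplevel subvolume (subvolid 5)
btdu /mnt/btrfs-top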

MurzNN commented 3 years ago

@mritzmann Thanks for such a detailed explanation! https://github.com/CyberShadow/btdu and https://github.com/rkapl/btsdu are good solutions for better understanding the used space in BTRFS subvolumes.

ruliane commented 1 year ago

Can you please explain why "Total exclusive data" is so small, less than the "Total" of any snapshot?

As I understood it, Total exclusive data is the sum of the Exclusive column.

105.95 + 103.46 + 514.47 + 568.11 + 648.50 + 2775.04 + 1116.16 + 974.05 + 682.08 ≈ 7488 MiB ≈ 7.3 GiB, which matches the reported 7.32 GiB (the displayed values are rounded)

However, in @MurzNN's 2nd example the sum of the Exclusive column differs from Total exclusive data: 448.00 + 48.00 + 16.00 + 16.00 + 16.00 + 16.00 + 608.00 + 48.00 + 176.00 = 1392 KiB

Looks like I patched my version some time ago, but I didn't push this change here. I should do it. EDIT: Already requested here: https://github.com/nachoparker/btrfs-du/pull/19. I think this issue can be closed now.