glandium / vmfs-tools

http://glandium.org/projects/vmfs-tools/
GNU General Public License v2.0
76 stars 29 forks

df result mismatch #5

Closed Thomas-Tsai closed 12 years ago

Thomas-Tsai commented 12 years ago

Hi,

We are the Clonezilla/Partclone developers, and we are trying to use the vmfs library to clone full VMFS partitions. Basically, Partclone checks the bitmap and finds all used blocks, which are then backed up. Lately we found some (VMFS5) clone/restore failures and tried to debug further. I found a strange mismatch between the df results from debugvmfs and from ESXi:

ESXi:

```
~ # df
Filesystem   Bytes        Used        Available    Use%  Mounted on
VMFS-5       37580963840  3346006016  34234957824  9%    /vmfs/volumes/datastore1
~ # df -m
Filesystem   1M-blocks  Used  Available  Use%  Mounted on
VMFS-5       35840      3191  32649     9%     /vmfs/volumes/datastore1
```

vmfs-tools/debugvmfs:

```
./debugvmfs/debugvmfs /dev/sdb3 df
Block size       : 1048576 bytes
Total blocks     : 35840
Total size       : 35840 MiB
Allocated blocks : 2990
Allocated space  : 2990 MiB
Free blocks      : 32850
Free size        : 32850 MiB
```

df shows the right total block count (35840), but the allocated blocks/space differ (3191 > 2990)!

Partclone is based on vmfs-tools and only backs up 2990 blocks, so the clone/restore fails. We tried following vmfs-fsck.c and debugvmfs.c to dump every blk_id, position and size from the bitmaps (fbb, fdc, pbc and sbc), and still get the same result.

We are confused by this situation; could you check why they differ?

I also did some research in the VMware KB and found the new bitmap file named .pb2.sf, it is...

glandium commented 12 years ago

vmfs_fuse_statfs and cmd_df use the same functions and structs to get the data they report, so I really don't see why that would be different.

As for .pb2.sf, my hypothesis was that it was used for doubly indirect block lists, but I've never managed to get a vmfs5 filesystem to actually use it: doubly indirect block lists were always in .pb.sf.

Thomas-Tsai commented 12 years ago

Exactly, vmfs_fuse_statfs and cmd_df are the same! My issue is the different df results between the VMware ESXi5 Server shell and Linux vmfs-tools.

Here are my steps to get the different df results: I install VMware ESXi5 Server with the shell and ssh enabled, and copy some files to the ESXi5 Server datastore1 (vmfs5) by scp. After the files are copied, I log in to the ESXi5 Server and run df -m.

Then I reboot the ESXi5 Server and boot from a Linux live CD (I use clonezilla-live), get the vmfs-tools source from the git repository, build it, and run debugvmfs /dev/sda3 df.

The used data is smaller than on the ESXi server; I guess some used blocks are missing?!

glandium commented 12 years ago

Ah, I see. Does that happen on vmfs3 as well, or is it limited to vmfs5? Can you try on a zeroed-out partition, take an image with the imager tool from vmfs-tools (imager /dev/device > image.file), and put the file somewhere I can pick it up?

glandium commented 12 years ago

Note that you can ping me on irc (oftc or mozilla, nick: glandium)

glandium commented 12 years ago

Sorry it took so long.

I finally took a look at your image, and all I can say is that ESX itself seems to be wrong. There is nothing in the data structures of the file system that would indicate more than 2990 blocks allocated.

Here's a little VMFS 101 (for vmfs 3, but vmfs 5 shouldn't be very different):

So from the above, I don't see why ESX would believe there is 210MB more allocated. I was ready to say that maybe it's also adding the subblocks, but there aren't any in use. I'm tempted to say ESX is wrong, here.

Now, that doesn't really help explain why your backup restore fails, but I doubt it fails with the image you gave me for this issue. I'll thus close.

Thomas-Tsai commented 12 years ago

Got it. I'm trying to rewrite a new backup tool, mostly adapted from vmfs-fsck.c, following it to dump each bitmap. I think it could work fine; it's still under test.

Thank You!~