diskfs / go-diskfs

Can't read ext4 partition... #231

Closed: Buanderie closed this issue 4 months ago

Buanderie commented 4 months ago

While using disk.GetFilesystem(), this is the error I get:

DISK: /dev/nvme0n1 PART: 8
DEBU[0000] initDisk(): start
DEBU[0000] initDisk(): block device
DEBU[0000] initDisk(): logical block size 512, physical block size 512
DEBU[0000] trying fat32
DEBU[0000] fat32 failed: error reading MS-DOS Boot Sector: could not read FAT32 BIOS Parameter Block from boot sector: could not read embedded DOS 3.31 BPB: error reading embedded DOS 2.0 BPB: invalid sector size 0 provided in DOS 2.0 BPB. Must be 512
DEBU[0000] trying iso9660 with physical block size 512
DEBU[0000] iso9660 failed: blocksize for ISO9660 must be one of 2048, 4096, 8192
DEBU[0000] trying ext4
DEBU[0000] ext4 failed: could not interpret Group Descriptor Table data: error creating group descriptor from bytes: checksum mismatch, passed 0, actual ad2c
2024/06/26 23:40:56 unknown filesystem on partition 8
panic: unknown filesystem on partition 8

The partition is freshly reformatted using mkfs.ext4, with no additional options.

The go-diskfs version I use is the latest commit 706241449bae6ba526dffef7d35b2e5a5613522b

The function I use is this one:

package main

import (
    "fmt"
    "log"

    diskfs "github.com/diskfs/go-diskfs"
)

// readPartition opens the disk and lists the root directory of the
// filesystem on partition partIdx (partitions are 1-indexed in go-diskfs).
func readPartition(diskPath string, partIdx int) error {
    disk, err := diskfs.Open(diskPath)
    if err != nil {
        return err
    }
    defer disk.Close()

    fs, err := disk.GetFilesystem(partIdx)
    if err != nil {
        log.Panic(err)
    }
    files, err := fs.ReadDir("/") // this should list everything
    if err != nil {
        log.Panic(err)
    }
    fmt.Println(files)
    return nil
}

func main() {
    if err := readPartition("/dev/nvme0n1", 8); err != nil {
        log.Fatal(err)
    }
}

The partition I'm trying to read is on an NVMe SSD, partition 8 on a GPT table. The function call is as follows: readPartition("/dev/nvme0n1", 8)

Also, tune2fs -l /dev/nvme0n1p8 gives:

tune2fs 1.46.5 (30-Dec-2021)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          8fcec880-f105-4923-b429-e6c6eb16ffc2
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1835008
Block count:              7331072
Reserved block count:     366553
Overhead clusters:        159223
Free blocks:              7171843
Free inodes:              1834997
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Wed Jun 26 23:33:24 2024
Last mount time:          n/a
Last write time:          Wed Jun 26 23:33:24 2024
Mount count:              0
Maximum mount count:      -1
Last checked:             Wed Jun 26 23:33:24 2024
Check interval:           0 (<none>)
Lifetime writes:          4174 kB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      acbf9c6b-2ba9-458b-8ecf-b4596b2c8974
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0x161dbd42

I know ext4 support is a fairly recent addition, so maybe I'm running into something new here...

deitch commented 4 months ago

Very new, just merged last week. Probably some bugs in it.

Interesting that the checksum on the GDT failed, or specifically one of the group descriptors. If you want to step through it using a debugger, put a breakpoint at groupdescriptors.go:99, and see what group descriptor it is that failed (variable i represents the count). Then look at the actual group descriptors reported by debugfs -R stats /dev/nvme0n1p8, which should give all of the group descriptors.
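
For cross-checking outside the library, here is a minimal, hypothetical Go sketch (not part of go-diskfs) that reads group descriptor 0 straight off the partition, so its stored checksum can be compared with what debugfs reports. The device path, 4096-byte block size, and 64-byte descriptor size come from the tune2fs output above; the 0x1E offset of bg_checksum is from the ext4 disk-layout documentation.

package main

import (
    "encoding/binary"
    "fmt"
    "os"
)

func main() {
    // Geometry from this issue: 4096-byte blocks, 64-byte group descriptors.
    // With a block size larger than 1024 the superblock sits inside block 0,
    // so the group descriptor table begins at block 1 (byte offset 4096).
    const blockSize = 4096
    const gdSize = 64

    f, err := os.Open("/dev/nvme0n1p8")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    gd0 := make([]byte, gdSize)
    if _, err := f.ReadAt(gd0, blockSize); err != nil {
        panic(err)
    }
    // bg_checksum is the little-endian uint16 at offset 0x1E of the descriptor.
    fmt.Printf("group 0 stored checksum: %#x\n", binary.LittleEndian.Uint16(gd0[0x1e:]))
    fmt.Printf("raw descriptor: % x\n", gd0)
}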

Buanderie commented 4 months ago

Using gdb:

Thread 1 "init" hit Breakpoint 1, github.com/diskfs/go-diskfs/filesystem/ext4.groupDescriptorsFromBytes (b=..., gdSize=64, hashSeed=4042148468, checksumType=2 '\002', ~r0=<optimized out>, ~r1=...) at /home/nis/go/pkg/mod/github.com/diskfs/go-diskfs@v1.4.1-0.20240616082037-706241449bae/filesystem/ext4/groupdescriptors.go:99
99             return nil, fmt.Errorf("error creating group descriptor from bytes: %w", err)
(gdb) info locals
end = <optimized out>
gd = <optimized out>
start = <optimized out>
err = <optimized out>
i = 0
&gds = 0xc0000126a8
count = 223
gdSlice = {array = 0xc0000ec000, len = 0, cap = 10}

This is group 0, right?

debugfs -R stats /dev/nvme0n1p8
debugfs 1.46.5 (30-Dec-2021)
Filesystem volume name:   <none>
Last mounted on:          /tmp/disk-mount-3045171502
Filesystem UUID:          8fcec880-f105-4923-b429-e6c6eb16ffc2
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1835008
Block count:              7331072
Reserved block count:     366553
Overhead clusters:        159223
Free blocks:              6904461
Free inodes:              1834995
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Wed Jun 26 23:33:24 2024
Last mount time:          Thu Jun 27 00:42:42 2024
Last write time:          Thu Jun 27 00:42:43 2024
Mount count:              10
Maximum mount count:      -1
Last checked:             Wed Jun 26 23:33:24 2024
Check interval:           0 (<none>)
Lifetime writes:          1497 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      acbf9c6b-2ba9-458b-8ecf-b4596b2c8974
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0xfdda35de
Directories:              2
 Group  0: block bitmap at 1029, inode bitmap at 1045, inode table at 1061
           23509 free blocks, 8179 free inodes, 2 used directories, 8179 unused inodes
           [Checksum 0x42b1]
 Group  1: block bitmap at 1030, inode bitmap at 1046, inode table at 1573
           31737 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
           [Inode not init, Checksum 0x4277]
 Group  2: block bitmap at 1031, inode bitmap at 1047, inode table at 2085
           32768 free blocks, 8192 free inodes, 0 used directories, 8192 unused inodes
           [Inode not init, Block not init, Checksum 0x72f0]

deitch commented 4 months ago

This is group 0, right? i = 0

Looks like it. I wonder why it is having a hard time with it. I suspect it is not the checksum itself, but that it is misreading something about where the GDT actually is, and is reading the wrong bytes.

Now that we know which one, run it again and put in a break at ext4.go:643. We need to understand what size it is reading in, what content it is reading in, and from what position. It should be reading gdtSize bytes from right after the superblock. It does so in the tests, so something is slightly different with your filesystem. It probably has to do with a slightly different layout, or perhaps that it is not the 0th partition, or something similar, but let's track it down.

While we are at it, if you want to dd out the first 5 blocks and put them somewhere useful, that would help. Your stats output shows a blocksize of 4096, so dd if=/dev/nvme0n1p8 of=somefile bs=4096 count=5. It is all of 20k, so it shouldn't be too hard to find somewhere to share it. You might even be able to attach it here.
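
If a Go equivalent of that dd command is handier, here is a minimal sketch under the same assumptions (device path and 4096-byte block size from this thread; the output file name somefile is just the placeholder from the dd example):

package main

import (
    "io"
    "os"
)

func main() {
    // Equivalent of: dd if=/dev/nvme0n1p8 of=somefile bs=4096 count=5
    in, err := os.Open("/dev/nvme0n1p8")
    if err != nil {
        panic(err)
    }
    defer in.Close()

    out, err := os.Create("somefile")
    if err != nil {
        panic(err)
    }
    defer out.Close()

    // Copy the first 5 blocks: 5 * 4096 = 20480 bytes.
    if _, err := io.CopyN(out, in, 5*4096); err != nil {
        panic(err)
    }
}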

Buanderie commented 4 months ago

Okay... I got:

(gdb) info locals
bs = {array = 0xc000106800 "", len = 1024, cap = 1024}
err = {tab = 0x0, data = 0x0}
gdt = 0xc0001113b0
gdtBytes = {array = 0xc0002b2000 "", len = 14272, cap = 14272}
sb = 0xc0002a4588
superblockBytes = {array = 0xc000106c00 "", len = 1024, cap = 1024}
gdtSize = 14272
n = 14272
(gdb) p gdtSize
$1 = 14272
(gdb) p gdtBytes
$2 = {array = 0xc0002b2000 "", len = 14272, cap = 14272}
(gdb) p start
$3 = 32572964864
(gdb) p BootSectorSize
No symbol "BootSectorSize" in current context.
(gdb) p SuperblockSize
No symbol "SuperblockSize" in current context.

"BootSectorSize" and "SuperblockSize" are constants, each set to (2 * 512).

Then I dump gdtBytes (the address is different from the previous log, since it comes from another gdb run):

(gdb) dump binary memory ./gdtBytes.bin 0xc0002a6000 (0xc0002a6000 + 14272)

gdtBytes.zip

And here is the dd dump of the /dev/nvme0n1p8 partition as well: nvme0n1p8.zip

I don't know a single thing about the ext4 format, so I hardly know what I'm doing :D I hope this will help others as well. Thanks!

deitch commented 4 months ago

That helped. The calculation for finding the GDT is off. It is correct if the blocksize is 1024, but wrong if larger. It ends up reading the 0s in block 0 after the superblock, rather than skipping to the beginning of block 1. This should be fixable.
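
To illustrate the layout rule involved, here is a minimal sketch of the offset arithmetic (this is not the go-diskfs code, and gdtOffset is just a hypothetical helper name): the superblock always sits at byte offset 1024, and the group descriptor table starts at the beginning of the block that follows it.

package main

import "fmt"

// gdtOffset returns the byte offset at which the group descriptor table
// starts for a given block size: the block immediately after the superblock.
func gdtOffset(blockSize int64) int64 {
    if blockSize == 1024 {
        return 2 * blockSize // superblock fills block 1, so the GDT is in block 2
    }
    return blockSize // superblock sits inside block 0, so the GDT is in block 1
}

func main() {
    fmt.Println(gdtOffset(1024)) // 2048
    fmt.Println(gdtOffset(4096)) // 4096, not 2048
}

A fixed offset of 1024 + 1024 = 2048 bytes only coincides with this rule for 1024-byte blocks; with 4096-byte blocks it lands inside block 0, which is zeroed past the superblock, which would explain the "passed 0" checksum in the original error.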

deitch commented 4 months ago

See #232