diskfs / go-diskfs

corrupt GPT with examples/efi_create.go #180

Closed zizhengwu closed 1 year ago

zizhengwu commented 1 year ago

Hi, thanks for the great library! It means I don't have to struggle with bash.

I tried following examples/efi_create.go to create a disk image with an EFI system partition. See my minimal code here: https://github.com/zizhengwu/use-diskfs/blob/master/main.go

The created disk image looks fine:

sudo losetup -fP --show /tmp/disk.img
sudo mkdir /mnt/temp_efi
sudo mount /dev/loop0 /mnt/temp_efi
tree /mnt/temp_efi

/dev/loop0
/mnt/temp_efi
└── EFI
    └── BOOT
        └── BOOTX64.EFI

3 directories, 1 file

However, when I run sudo gdisk -l /tmp/disk.img, it reports a corrupt GPT:

❯ sudo gdisk -l /tmp/disk.img
GPT fdisk (gdisk) version 1.0.9

Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!

Warning: Invalid CRC on main header data; loaded backup partition table.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!

Warning! One or more CRCs don't match. You should repair the disk!
Main header: ERROR
Backup header: OK
Main partition table: ERROR
Backup partition table: OK

Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: damaged

Found valid MBR and corrupt GPT. Which do you want to use? (Using the
GPT MAY permit recovery of GPT data.)
 1 - MBR
 2 - GPT
 3 - Create blank GPT

I'm wondering whether this is expected, whether I misunderstood examples/efi_create.go, or whether it's a bug in examples/efi_create.go or the diskfs library.

deitch commented 1 year ago

I am able to reproduce it.

This is interesting. When I walk through it, I get identical bytes. But when I xxd the output, here is the primary GPT header (offset 0x200, i.e. LBA 1):

00000200: 5252 6141 0000 0000 0000 0000 0000 0000  RRaA............
00000210: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000220: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000230: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000240: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000250: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000260: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000270: 0000 0000 0000 0000 0000 0000 0000 0000  ................

And secondary:

067ffe00: 4546 4920 5041 5254 0000 0100 5c00 0000  EFI PART....\...
067ffe10: 666a 1705 0000 0000 ff3f 0300 0000 0000  fj.......?......
067ffe20: 0100 0000 0000 0000 2200 0000 0000 0000  ........".......
067ffe30: df3f 0300 0000 0000 8629 7e36 e4ad d445  .?.......)~6...E
067ffe40: b718 d0d9 e1ed 02a2 0200 0000 0000 0000  ................
067ffe50: 8000 0000 8000 0000 c1eb 3c8f 0000 0000  ..........<.....

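Primary got all zeroed out. That is strange. And interestingly, that RRaA is not GPT data at all: it is the lead signature of a FAT32 FSInfo sector, which normally sits in sector 1 of a FAT32 volume. Hmm...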

deitch commented 1 year ago

This is breaking it:

spec := diskpkg.FilesystemSpec{Partition: 0, FSType: filesystem.TypeFat32}
fs, err := disk.CreateFilesystem(spec)

deitch commented 1 year ago

Oh, ouch. All of that for a silly mistake. PR coming shortly.

zizhengwu commented 1 year ago

@deitch Thank you so much for the quick fix!

I was able to verify sudo gdisk -l /tmp/disk.img is now happy.

However, it seems after the fix, I'm unable to mount the disk image anymore.

Steps to reproduce:

  1. Clone the latest https://github.com/zizhengwu/use-diskfs. I updated the repo to use the latest go-diskfs.
  2. Run main.go
  3. sudo gdisk -l /tmp/disk.img is now happy. Thanks for the fix!
  4. Run
    sudo losetup -fP --show /tmp/disk.img
    sudo mkdir /mnt/temp_efi
    sudo mount /dev/loop0 /mnt/temp_efi

Actual result:

/dev/loop0
mount: /mnt/temp_efi: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.

Expected behavior: I should be able to mount the disk image and inspect the content as seen in the original comment.

I'm wondering whether I misunderstood something (I'm not experienced in this domain, so I very much appreciate the simplicity of the diskfs library) or whether this is a bug in examples/efi_create.go or the diskfs library. Thanks!

Appendix: cleanup of step 4

sudo umount /mnt/temp_efi
sudo rmdir /mnt/temp_efi
sudo losetup -d /dev/loop0

deitch commented 1 year ago

# losetup -fP --show /tmp/disk.img
/dev/loop4
# mount /dev/loop4 /mnt
# ls -l /mnt
total 1
drwxr-xr-x 3 root root 512 May  1  2023 EFI

Works fine. I just tried it on macOS and Linux. Are you sure you have the right loop device? I don't see any response from your losetup command.

zizhengwu commented 1 year ago

Hi @deitch, thank you so much for helping troubleshoot and the great library.

Sadly, I'm still reproducing the error with the steps described in the above comment.

This time, I also recorded a video (very short; 31 seconds) to demonstrate the issue: https://www.youtube.com/watch?v=WP1QIkRGlJw. I tried on 2 PCs (Debian) and on macOS, with the same result.

When you have time, it would be great if you can help me understand:

  1. Did I do anything wrong in my recorded video when creating the disk image and the follow-up testing?
  2. Regarding your test ("Works fine. I just tried it on macOS and Linux"): is it possible that you were using the old https://github.com/zizhengwu/use-diskfs without the latest diskfs patch you made, or that the /tmp/disk.img loop device got mixed up with an old one?
  3. Do you mind testing again with the latest HEAD https://github.com/zizhengwu/use-diskfs/blob/master/main.go running both sudo gdisk -l /tmp/disk.img and the mounting steps? In my observation,
    • before your fix, gdisk errors out and mounting works fine
    • after the fix, gdisk works fine and mounting errors out
    • gdisk and mounting are never both happy either before or after the fix

Thank you again for your patience with me!

deitch commented 1 year ago

Oh, interesting, it actually cannot read it. I wonder why.

deitch commented 1 year ago

Even more interestingly, if I dd the partition out of the disk.img and then read it or mount it, no problem.

deitch commented 1 year ago

Ah! losetup doesn't understand the GPT table!

$ gdisk -l disk.img
GPT fdisk (gdisk) version 1.0.8

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with corrupt MBR; using GPT and will write new
protective MBR on save.
Disk disk.img: 212992 sectors, 104.0 MiB
Sector size (logical): 512 bytes
Disk identifier (GUID): 591ED918-6DD0-409B-A855-C6B2D85C8879
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 212959
Partitions will be aligned on 2048-sector boundaries
Total free space is 12220 sectors (6.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          202753   98.0 MiB    EF00  EFI System

The partition starts at sector 2048 and the sector size is 512 bytes, so it begins at byte offset 2048 × 512 = 1048576:

$ sudo losetup -fP --show --offset=1048576  ./disk.img
/dev/loop12
$ sudo mount /dev/loop12 /mnt/temp_efi/
$ ls -l /mnt/temp_efi/
total 1
drwxr-xr-x 3 root root 512 May  7 14:58 EFI

zizhengwu commented 1 year ago

Thank you @deitch again so much for helping out! This resolves my confusion.