Trying on real HD

suntong commented 3 years ago

Hi again Ivan,

Have you tried putting your raw image onto a real HD?

https://github.com/iximiuz/docker-to-linux/blob/94ccfe3ef1257d31c8046fcbe08e0067972d16e8/create_image.sh#L25

This line is the reason I'm asking: extlinux was the boot loader I had been using for years, until it gave me trouble when I duplicated a system from one machine to another.

As I have my specific ways of doing things, I usually do a brand new installation on one machine, fully customize it to my comfort, then rsync it to another machine. The problem is that extlinux boots fine on the first machine but just refuses to boot on the second machine, no matter what I try. I even tried excluding all its relevant files from the rsync and then doing a brand new extlinux installation on the second machine. But the new extlinux still refuses to boot.

I really hope that you've tried it and found it not to be a problem. Thx.

iximiuz commented 3 years ago

Hi @suntong! I'm afraid I don't have suitable physical hardware around to test this. Let's keep this issue open with the hope that someone could check it out for us.

suntong commented 3 years ago

Sure, I appreciate your willingness to help!

ljleb commented 3 years ago

Hey people, I just tried booting the alpine image from a spare 4GB USB 2.0 stick. I burned the image using this line:

sudo dd bs=1G if=alpine/linux.img of="${USB_DEV}" conv=fdatasync
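Since later results in this thread turned out to be hard to reproduce, it may help to rule out a bad write before booting. The following is only a sketch: `verify_image_copy` is a hypothetical helper (not part of the repository), and it assumes a `cmp` that supports `-n` (GNU and BSD `cmp` do). It compares just the first image-sized bytes, because the target device is usually larger than the image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that the first <image size> bytes of the target
# are identical to the image. Returns 0 when they match, non-zero otherwise.
verify_image_copy() {
    local image="$1"    # source image, e.g. alpine/linux.img
    local target="$2"   # block device or file the image was written to
    local size
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$image") )) || return 2
    # Compare only as many bytes as the image contains.
    cmp -n "$size" "$image" "$target"
}
```

For the command above this would be called as `sudo` plus `verify_image_copy alpine/linux.img "${USB_DEV}"`, or the function body run directly as root.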

Here are the logs:

[image: VID_20210613_144749852_exported_12871, photo of the boot log ending in a kernel panic]

Yep. Poor init got stabbed.

Actually, the last screen that doesn't contain crash information goes by so fast that I can't read it with the naked eye. I took a slow-mo video of the logs and did my best to extract useful information. Here's the last segment before the kernel panic:

[    1.590315] Mounting root: ok.
ok.
mkdir: can't create directory '/sysroot//sys': Read-only filesystem
mount: mounting /sys on /sysroot//sys failed: No such file or directory
mkdir: can't create directory '/sysroot//dev': Read-only filesystem
mount: mounting /dev on /sysroot//dev failed: No such file or directory
mkdir: can't create directory '/sysroot//proc': Read-only filesystem
mount: mounting /proc on /sysroot//proc failed: No such file or directory
mkdir: can't create directory '/sysroot//dev': Read-only filesystem
mount: mounting /dev/pts on /sysroot//dev/pts failed: No such file or directory
mkdir: can't create directory '/sysroot//dev': Read-only filesystem

I believe the root cause of the kernel panic might be that the filesystem is read-only. What do you think?

See #19 for my recent efforts on this topic.
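One way to confirm the read-only suspicion from a shell (for example an initramfs rescue shell, or the same image booted in qemu) is to look at the options column of the mounts table. This is a minimal sketch, not something taken from the repository: `mount_is_readonly` is a hypothetical helper name, and it assumes the usual `/proc/mounts` layout of device, mount point, type, options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a mount point is mounted read-only.
# Exit status: 0 = read-only, 1 = not read-only, 2 = mount point not listed.
mount_is_readonly() {
    local mountpoint="$1"
    local mounts_file="${2:-/proc/mounts}"   # overridable, mainly for testing
    awk -v mp="$mountpoint" '
        $2 == mp { opts = $4; found = 1 }    # keep the last matching entry
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }' "$mounts_file"
}
```

In the situation from the log above, the call would be `mount_is_readonly /sysroot && echo "root is read-only"`.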

ljleb commented 3 years ago

I finally succeeded in booting all 3 distros! :tada:

However, after a while I discovered that whether or not I ran an image with qemu before running it on bare metal sometimes changed the outcome of the test.

This might be related to initramfs's first execution.

Anyhow, I know that some combination of the following actions leads to a successful boot every time (on my computer at least): 1) making the root file system writable, 2) replacing /dev/sda1 in syslinux.cfg with the partition UUID associated with ${LOOPDEVICE} and 3) running the image with qemu once before dd-ing it onto a physical drive.

In order to achieve 1) and 2), I put the following code at this location (just before unmounting /os/mnt):

LOOPDEVICE_UUID="UUID=$(blkid | awk -F\" "/$(basename "${LOOPDEVICE}")/ {print \$2}")"

# update syslinux.cfg to use the device UUID
sed -i "s/\/dev\/sda1/${LOOPDEVICE_UUID}/" /os/mnt/boot/syslinux.cfg

# make root file system writable
printf '%s\n' "${LOOPDEVICE_UUID} / ext4 defaults 0 0" >> /os/mnt/etc/fstab

Note that making the root file system writable in this way doesn't seem to work for alpine. I documented how I got it working in #19.
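For anyone who wants to reuse the snippet above, here is a slightly more defensive variant as a sketch. It is an assumption-laden rewrite, not the code that was actually tested in this thread: `patch_root_reference` is a hypothetical function name, the UUID is passed in explicitly (it could be obtained with `blkid -s UUID -o value "${LOOPDEVICE}"` instead of parsing the full `blkid` listing), and the fstab line is only appended once so the step can be re-run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the root device reference inside a mounted image.
patch_root_reference() {
    local mnt="$1"      # mounted image root, e.g. /os/mnt
    local uuid="$2"     # filesystem UUID of the root partition
    local root_spec="UUID=${uuid}"

    # Point syslinux.cfg at the filesystem UUID instead of /dev/sda1.
    # "|" is used as the sed delimiter so the path needs no escaping.
    sed -i "s|/dev/sda1|${root_spec}|g" "${mnt}/boot/syslinux.cfg"

    # Add the root entry to fstab only if it is not already there.
    if ! grep -q "^${root_spec}[[:space:]]" "${mnt}/etc/fstab" 2>/dev/null; then
        printf '%s\n' "${root_spec} / ext4 defaults 0 0" >> "${mnt}/etc/fstab"
    fi
}
```

A call matching the original snippet would look like `patch_root_reference /os/mnt "$(blkid -s UUID -o value "${LOOPDEVICE}")"`. The alpine caveat above still applies.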

I did try some of these combinations today. However, my current test results are not reliable because I learned about 3) afterwards. I might do a spreadsheet if I ever find the time... (dd takes forever to copy 1G over USB 2.0)

suntong commented 3 years ago

Thanks a lot @lebel-louisjacob for your detailed report, and congratulations on successfully booting all 3 distros. What an achievement!

Let me close this now, and we can continue the discussion here.

My questions are:

1. If booting from qemu, then you should have only one single disk, which should be /dev/sda1. Do you get any errors when leaving /dev/sda1 without changing it?
2. If booting from HD, then the UUID of the HD would most probably be different from LOOPDEVICE_UUID, right? I.e., there should be another step to change the UUID after dd-ing the image onto a physical drive, right?

ljleb commented 3 years ago

Hey @suntong, thanks for your fast response! I'll do my best to answer you.

For your first question, if by "leaving /dev/sda1 without changing it", you mean exiting qemu without touching the root file system (i.e. without issuing commands like touch, mkdir or anything else), then I don't think so. At least, when issuing poweroff, I haven't recognized any message written directly to the terminal as an error. If any error occurred, do you know where I could find the logs?

For your second question, I started to use device UUIDs instead of device paths in the image configuration because /dev/sda1 somehow couldn't be found when I tried booting the image on bare metal. Once I made that change, booting on bare metal started working. (Or rather, it looked like that was the case. However, that might also be because I booted the image with qemu once, just before actually booting on bare metal)

I thought using the loop UUID instead of /dev/sda1 would work because I noticed that the UUID of /dev/sda1 in the output of blkid from within the qemu VM matched the UUID of the loop device. The loop device was no longer available at the point of issuing blkid, but the UUID of /dev/sda1 was still the same as the loop device's, so I thought that the loop device's UUID was actually identifying the ext4 partition header, and not any physical device. Because of this, I believed that the "partition's UUID" was being copied over to my USB stick when dd-ing the image.

If I'm not wrong in believing that the loop device's UUID is in reality a partition UUID, then according to this, the loop device's UUID is actually stored in the ext4 partition header while executing create_image.sh. This would mean that dd-ing e.g. the debian image over my USB stick should also copy the UUID of the partitions.

We would not require any more steps in the build process if my assumptions are valid.
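The assumption can be checked without any hardware by reading the UUID straight out of the image bytes. In ext2/3/4 the superblock starts 1024 bytes into the filesystem and the 16-byte UUID field sits at offset 0x68 (104) inside it, so it is part of whatever dd copies. The sketch below is illustrative only: `ext_fs_uuid` is a hypothetical helper, and the partition offset has to be supplied by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ext2/3/4 filesystem UUID stored in an image.
# $1 = image file or device, $2 = byte offset of the partition (default 0).
ext_fs_uuid() {
    local image="$1"
    local part_offset="${2:-0}"
    local hex
    # Superblock at +1024, UUID field at +104 within it, 16 raw bytes.
    hex=$(dd if="$image" bs=1 skip=$((part_offset + 1024 + 104)) count=16 2>/dev/null \
          | od -An -tx1 | tr -d ' \n')
    # Fail if the image was too short to contain the field.
    [ "${#hex}" -eq 32 ] || return 1
    # Format as 8-4-4-4-12, the way blkid prints it.
    printf '%s\n' "$hex" \
        | sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/'
}
```

Assuming the partition starts at sector 2048 of the image, `ext_fs_uuid debian/linux.img $((512 * 2048))` should print the same value that blkid reports for the loop device and, after dd, for the first partition of the USB stick. The helper does not check the ext magic number, so on a non-ext image it prints meaningless bytes.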

suntong commented 3 years ago

> I believed that the "partition's UUID" was being copied over my USB stick when dding the image

Ah yes, that makes sense. Your previous detailed explanation helped me fully understand the situation. Thx!

iximiuz / docker-to-linux

Trying on real HD #17