hypriot / image-builder-rpi

SD card image for Raspberry Pi with Docker: HypriotOS
http://blog.hypriot.com/post/how-to-get-docker-working-on-your-favourite-arm-board-with-hypriotos/
MIT License

Support USB boot via root UUIDs #282

Closed mmastrac closed 5 years ago

mmastrac commented 5 years ago

Issue #172 asked for USB images, but this still requires some manual editing. Using the UUID=... syntax, the boot image could be made agnostic to SDcard or USB boot.

This should work but I haven't tested it yet:

cmdline.txt

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=UUID=2a81f25a-2ca2-4520-a1a6-c9dd75527c3c rootfstype=ext4 cgroup_enable=cpuset cgroup_enable=memory swapaccount=1 elevator=deadline fsck.repair=yes rootwait

/etc/fstab

proc /proc proc defaults 0 0
UUID=7075-EEF7 /boot vfat defaults 0 0
UUID=2a81f25a-2ca2-4520-a1a6-c9dd75527c3c / ext4 defaults,noatime 0 1

mmastrac commented 5 years ago

Note that I confirmed that the UUIDs are the same as the above between two of my installs, so it should be safe to hard-code it. I'm using a simple boot SD card with the updated bootcode.bin and that has a different label from either of these two.

An alternative to using UUIDs might be using LABEL=, which maps to the volume labels (possibly not as stable if users rename their partitions).

marclennox commented 5 years ago

I've been playing around with your suggestion, and here's what I found. For me the UUID values that I see after building the Hypriot RPI image are

proc /proc proc defaults 0 0
UUID=CD1A-87F4 /boot vfat defaults 0 0
UUID=478512cb-2705-4f50-a1f5-7111b0b26a5d / ext4 defaults,noatime 0 1

This works fine in /etc/fstab but I cannot get the UUID mount working in the boot cmdline.txt

I had a look at the raspbian image and indeed they use the PARTUUID in the boot cmdline.txt

After booting mine with /dev/sda2 I changed cmdline.txt to use PARTUUID= and this does indeed work, however unlike the UUID values, it seems that the PARTUUID changes every time I do a new build of the image.

So I think for this to work and be truly agnostic of sdcard and usbstick, the build process will need to update the cmdline.txt to be the correct PARTUUID. I'm not sure if there's a way to make PARTUUID constant for every build, or if there's a way to extract the PARTUUID from the built image each time.

Any suggestions would be appreciated, I'd really like to get this working.

mmastrac commented 5 years ago

@marclennox I suppose there are a few options:

1) Force a UUID when making the vfat/ext4 filesystems: https://github.com/hypriot/image-builder-raw/blob/master/builder/rpi/build.sh#L53
2) Use a loopback block device to mount the image and read the UUID out
3) Use blkid through guestfish, https://linux.die.net/man/1/guestfish (possible, but I'm not totally sure that works)
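
For option 1, a minimal Python sketch of what forcing the identifiers could look like. It only builds the command lines and does not run them. The function name and the device paths are hypothetical, and it assumes the builder calls mkfs.vfat and mkfs.ext4 directly. Note that this pins the filesystem UUIDs (the values used with UUID=), not the PARTUUID.

```python
def mkfs_commands(boot_device, root_device, boot_volume_id, root_uuid):
    """Build mkfs command lines that pin the filesystem identifiers.

    Hypothetical helper for illustration; the device paths are placeholders.
    """
    # mkfs.vfat -i expects the 32-bit volume ID as hex digits without the
    # dash that blkid prints (7075-EEF7 becomes 7075EEF7).
    vfat_id = boot_volume_id.replace("-", "")
    return [
        ["mkfs.vfat", "-i", vfat_id, boot_device],
        # mkfs.ext4 -U sets the filesystem UUID explicitly.
        ["mkfs.ext4", "-U", root_uuid, root_device],
    ]


if __name__ == "__main__":
    commands = mkfs_commands(
        "/dev/loop0p1",  # placeholder boot partition
        "/dev/loop0p2",  # placeholder root partition
        "7075-EEF7",
        "2a81f25a-2ca2-4520-a1a6-c9dd75527c3c",
    )
    for command in commands:
        print(" ".join(command))
```

The UUID values in the example are the ones from the first comment; any fixed values would do as long as cmdline.txt and /etc/fstab use the same ones.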

marclennox commented 5 years ago

Thanks @mmastrac for your suggestions, it pointed me in the right direction to keep digging. I ended up getting it working similar to how Raspbian base images are built (which support PARTUUID based mounting for USB/SD agnostic images).

So basically I had to rearrange a few things in the build.sh script and then add the following line to extract and export the IMAGE_ID making it available in the chroot-script.sh.

export IMAGE_ID=$(dd if="${BUILD_RESULT_PATH}/${HYPRIOT_IMAGE_NAME}" skip=440 bs=1 count=4 2>/dev/null | xxd -e | cut -f 2 -d' ')
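
For anyone who wants the same extraction without dd and xxd, here is a rough Python equivalent. It reads the 4-byte disk signature at offset 440 of the image's MBR and prints it as 8 hex digits. The function and constant names are made up for this sketch, and it assumes the image uses an MBR partition table, as the dd command above does.

```python
import struct

MBR_DISK_SIGNATURE_OFFSET = 440  # same offset the dd command skips to
MBR_DISK_SIGNATURE_LENGTH = 4    # same as count=4


def read_disk_signature(image_path):
    """Return the MBR disk signature of an image as 8 lowercase hex digits."""
    with open(image_path, "rb") as image:
        image.seek(MBR_DISK_SIGNATURE_OFFSET)
        raw = image.read(MBR_DISK_SIGNATURE_LENGTH)
    if len(raw) != MBR_DISK_SIGNATURE_LENGTH:
        raise ValueError("image is too short to contain an MBR disk signature")
    # The signature is stored little-endian, which is why the shell
    # version uses xxd -e to print it as one little-endian word.
    (signature,) = struct.unpack("<I", raw)
    return format(signature, "08x")


if __name__ == "__main__":
    import sys

    print(read_disk_signature(sys.argv[1]))
```

Run it with the path to the built image as the only argument; the output should match what the shell pipeline puts in IMAGE_ID.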

Then in the chroot-script.sh I changed the lines that create /boot/cmdline.txt and /etc/fstab as follows:

# boot/cmdline.txt
cat << _EOF_ > /boot/cmdline.txt
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=${IMAGE_ID}-02 rootfstype=ext4 cgroup_enable=cpuset cgroup_enable=memory swapaccount=1 elevator=deadline fsck.repair=yes rootwait quiet init=/usr/lib/raspi-config/init_resize.sh
_EOF_
# create /etc/fstab
cat << _EOF_ > /etc/fstab
proc /proc proc defaults 0 0
PARTUUID=${IMAGE_ID}-01 /boot vfat defaults 0 0
PARTUUID=${IMAGE_ID}-02 / ext4 defaults,noatime 0 1
_EOF_
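
To spell out the naming used in the heredocs above: on an MBR disk the PARTUUID is the disk signature, a dash, and the 1-based partition number as two hex digits, so -01 is /boot and -02 is the root partition. A small Python sketch of the same rendering, with hypothetical function names:

```python
def partuuid(image_id, partition_number):
    """Build an MBR-style PARTUUID: signature, dash, two-digit hex partition number."""
    return "{}-{:02x}".format(image_id.lower(), partition_number)


def root_argument(image_id):
    """Return the root= token for cmdline.txt (root is partition 2)."""
    return "root=PARTUUID=" + partuuid(image_id, 2)


def render_fstab(image_id):
    """Render the same three fstab entries as the heredoc above."""
    lines = [
        "proc /proc proc defaults 0 0",
        "PARTUUID={} /boot vfat defaults 0 0".format(partuuid(image_id, 1)),
        "PARTUUID={} / ext4 defaults,noatime 0 1".format(partuuid(image_id, 2)),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(root_argument("f589f97d"))
    print(render_fstab("f589f97d"), end="")
```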

I flashed both an SD card and USB stick with the resulting image and both boot and run perfectly in a Pi3+

marclennox commented 5 years ago

Seems I may have spoken too soon, there is one glitch with this approach. It seems the cloud-init root partition resize is failing with the following message

Apr 08 02:05:19 rpi-gateway cloud-init[300]: 2019-04-08 02:05:19,764 - cc_resizefs.py[WARNING]: Device '/dev/PARTUUID=f589f97d-02' did not exist. cannot resize: dev=/dev/root mnt_point=/ path=/

I'm a little confused about this, since the resize script specified in the /boot/cmdline.txt file shows a screen indicating success when it first runs.

/usr/lib/raspi-config/init_resize.sh

However the root partition is indeed not getting resized, and the cloud-init error would indicate to me a bug in the resize python script that doesn't properly handle PARTUUID identified partitions.

mmastrac commented 5 years ago

Looks like that might be a bug here: https://github.com/cloudsigma/cloud-init/blob/master/cloudinit/config/cc_resizefs.py#L72

I don't know the difference between PARTUUID/UUID for the kernel cmdline, but I'll see if I can think through something.

mmastrac commented 5 years ago

You might be able to set the root partition using the device path directly, like /dev/disk/by-partuuid/..., as described here:

https://wiki.archlinux.org/index.php/persistent_block_device_naming#by-partuuid

marclennox commented 5 years ago

Using the UUID in /boot/cmdline.txt definitely does not work on the Pi3+. I've read enough forum posts and tried enough variations to be confident in that.

PARTUUID definitely does work, and is precisely how the Raspbian images are able to support both USB and SD.

I'm really confused though why there seems to be 2 different scripts responsible for root partition resizing. As I mentioned, the raspi-config script indicates success when you first boot the new image... but it's definitely not resizing the partition.

marclennox commented 5 years ago

Unfortunately doesn't work with /dev/disk/by-partuuid/9e5083b7-02... just hangs indefinitely on the boot screen waiting for the partition.

mmastrac commented 5 years ago

Looks like this bug here - might be possible to pull a more modern cloud-init? https://bugs.launchpad.net/cloud-init/+bug/1684869

mmastrac commented 5 years ago

Yep, confirmed that cloud-init 17.2 has fixes for this. We need to dig out where the upstream cloud-init comes from and either patch or replace it.

marclennox commented 5 years ago

Cloud init is installed in the chroot-script

# install cloud-init
apt-get install -y \
  cloud-init \
  ssh-import-id

On the resulting Pi, I get

$ cloud-init --version
cloud-init 0.7.9

mmastrac commented 5 years ago

In theory all we need to do is patch util.py and add this right below the if found.startswith("UUID=") section:

    if found.startswith("PARTUUID="):
        disks_path = ("/dev/disk/by-partuuid/" +
                      found[len("PARTUUID="):].lower())
        if os.path.exists(disks_path):
            return disks_path
        results = find_devs_with(found)
        if results:
            return results[0]
        # we know this doesn't exist, but for consistency return the path as
        # it /would/ exist
        return disks_path

This could probably be a patch in the image-builder-rpi project until the base image gets updated.

marclennox commented 5 years ago

Cool, will give that a try tomorrow and let you know, thanks @mmastrac !

marclennox commented 5 years ago

So I tried updating the cloud-init service to the latest, from the testing repository. I isolated with pinning so that only cloud-init would be installed (including dependencies) from the testing repo.

Unfortunately now it seems that it's not picking up the /boot/user-data file, or it's expecting something different. I even trimmed it down to a bare-bones cloud-config to just create the default user/password, which doesn't seem to be working.

I can't login after boot since the default user is not getting created, so it's hard to debug what's going on.

marclennox commented 5 years ago

There's a message in the boot logs from cloud-init stating

Used fallback datasource

I'm guessing that something has fundamentally changed from 0.7.9 to 18.1 in terms of where to look for the local user-data file...

mmastrac commented 5 years ago

Is it possible to point at 17.2 specifically? The user-data file suggests that it's pretty version sensitive. 17.2 seems to be more bugfix than radical change.

marclennox commented 5 years ago

Yeah I'll just need to find an apt source for it... or build it from scratch I suppose.

marclennox commented 5 years ago

I was able to get this working by applying the following patch to /usr/lib/python3/dist-packages/cloudinit/config/cc_resizefs.py in 0.7.9

--- cc_resizefs.py  2019-04-10 11:22:50.000000000 -0400
+++ cc_resizefs.patched.py  2019-04-10 11:23:43.000000000 -0400
@@ -86,6 +86,8 @@
         return "/dev/disk/by-label/" + found[len("LABEL="):]
     if found.startswith("UUID="):
         return "/dev/disk/by-uuid/" + found[len("UUID="):]
+    if found.startswith("PARTUUID="):
+        return "/dev/disk/by-partuuid/" + found[len("PARTUUID="):]

     return "/dev/" + found
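
To show the effect of the patch in isolation, here is a standalone Python sketch of the lookup. It is modeled on the lines visible in the diff, not copied from cloud-init, so the function name and the parsing of root= are assumptions. Without the PARTUUID= branch, the value falls through to the final "/dev/" + found line, which is how the bogus '/dev/PARTUUID=f589f97d-02' path in the log above comes about.

```python
def rootdev_from_cmdline(cmdline):
    """Map the kernel's root= argument to a device path, or None if absent.

    Standalone sketch for illustration; not the cloud-init source.
    """
    found = None
    for token in cmdline.split():
        if token.startswith("root="):
            found = token[len("root="):]
            break
    if found is None:
        return None

    if found.startswith("/dev/"):
        return found
    if found.startswith("LABEL="):
        return "/dev/disk/by-label/" + found[len("LABEL="):]
    if found.startswith("UUID="):
        return "/dev/disk/by-uuid/" + found[len("UUID="):]
    # The branch added by the patch: resolve PARTUUID= through udev's
    # by-partuuid symlinks instead of falling through to "/dev/" + found.
    if found.startswith("PARTUUID="):
        return "/dev/disk/by-partuuid/" + found[len("PARTUUID="):]

    return "/dev/" + found


if __name__ == "__main__":
    print(rootdev_from_cmdline("console=tty1 root=PARTUUID=f589f97d-02 rootwait"))
```

Unlike the longer version suggested earlier in the thread, this does not check that the path exists or fall back to a blkid-style search; it only shows the string mapping.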