RPi-Distro / raspi-config

Configuration tool for the Raspberry Pi

Unable to enable overlayfs on a CM4 with eMMC and NVMe storage #241

Closed framps closed 4 months ago

framps commented 5 months ago

I recently tried to enable the overlayfs with raspi-config on my CM4 which has both eMMC and NVMe storage. It works if I have the system booted from eMMC but fails when I boot the system from NVMe.

pi@raspberrypi-bookworm64-desktop-cm4:~ $ lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mmcblk0      179:0    0  29.1G  0 disk 
├─mmcblk0p1  179:1    0   512M  0 part 
└─mmcblk0p2  179:2    0  28.6G  0 part 
mmcblk0boot0 179:32   0     4M  1 disk 
mmcblk0boot1 179:64   0     4M  1 disk 
nvme0n1      259:0    0 119.2G  0 disk 
├─nvme0n1p1  259:1    0   512M  0 part /boot/firmware
└─nvme0n1p2  259:2    0 118.7G  0 part /
mount | egrep "nvme|over"
/dev/nvme0n1p2 on / type ext4 (rw,noatime)
/dev/nvme0n1p1 on /boot/firmware type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)

whereas the output on the eMMC-booted system is correct:

mount | egrep "mmc|over"
/dev/mmcblk0p2 on /media/root-ro type ext4 (ro,relatime)
overlayroot on / type overlay (rw,relatime,lowerdir=/media/root-ro,upperdir=/media/root-rw/overlay,workdir=/media/root-rw/overlay-workdir/_)
/dev/mmcblk0p1 on /boot/firmware type vfat (ro,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)
framps commented 5 months ago

Just some additional information:

uname -a;cat /etc/os-release 
Linux raspberrypi-bookworm64-desktop-cm4 6.1.0-rpi7-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24) aarch64 GNU/Linux
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

raspi-config version: 20240313

XECDesign commented 4 months ago

How is the nvme drive attached? Are you using an nvme pcie adapter or usb?

framps commented 4 months ago

The NVMe SSD is connected without any adapter directly to the PCIe slot.

See here for some more details about the CM4 I use.

XECDesign commented 4 months ago

Just got the things I needed to try to reproduce the issue. So far, it's working as expected for me.

I flashed the latest eeprom, flashed the nvme drive using rpiboot, booted, updated everything, ran raspi-config to enable overlay fs, rebooted and it's working as expected.

I'll try with the steps in your article in case there's any weirdness going on.

When it's not working, what is the output of cat /proc/cmdline? If you run sudo journalctl -b0 do you see any issues?

XECDesign commented 4 months ago

Yeah, I can't make it go wrong. One potential pitfall might be flashing an image onto the emmc and the nvme in one go. At that point they would both have the same disk id, so there would be potential for the system to get confused about what partition is meant to be what.

I'd check the output of 'sudo fdisk -l' and check that fstab on each device matches the PARTUUID you'd expect in both fstab files (which is derived from the disk id).

If that looks okay, then I don't have any clue as to what might be wrong and might need more information on how to reliably reproduce the issue.
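The PARTUUID check suggested above could be scripted roughly as follows. This is only a sketch: `missing_partuuids` is a hypothetical helper name, and it assumes fstab entries of the form `PARTUUID=<id>` at the start of a line, as on a stock Raspberry Pi OS image.

```shell
# Print every PARTUUID referenced in an fstab-style file ($1) that does not
# appear in a list of known PARTUUIDs ($2, one ID per line).
# Hypothetical helper, not part of raspi-config.
missing_partuuids() {
    fstab_file="$1"
    known_file="$2"
    # Skip comment lines, then extract the ID from lines starting with PARTUUID=
    grep -v '^[[:space:]]*#' "$fstab_file" |
        sed -n 's/^[[:space:]]*PARTUUID=\([^[:space:]]*\).*/\1/p' |
        while read -r id; do
            # Report IDs that no attached partition carries
            grep -qxF "$id" "$known_file" || echo "$id"
        done
}

# Possible usage on the running system:
#   sudo blkid -s PARTUUID -o value > /tmp/known_partuuids
#   missing_partuuids /etc/fstab /tmp/known_partuuids
```

No output means every PARTUUID in that fstab belongs to an attached partition. The fstab on the other device would have to be checked separately after mounting it.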

framps commented 4 months ago

Thank you very much for looking into the issue and trying to reproduce it, even though the reproduction was unsuccessful.

I collected various information I think may help you to isolate the root cause of the issue and included the information you asked for. See the attached file below.

The disk-id is different on eMMC and NVMe:

pi@raspberrypi-bookworm64-lite-cm4:~ $ blkid
/dev/nvme0n1p1: LABEL_FATBOOT="bootfs" LABEL="bootfs" UUID="EB8A-F3C7" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="a560a466-01"
/dev/nvme0n1p2: UUID="04508ed6-922a-4a0f-9b8c-757d677451c6" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="a560a466-02"
/dev/mmcblk0p1: LABEL_FATBOOT="bootfs" LABEL="bootfs" UUID="B76E-4EDA" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="7d09ecf4-01"
/dev/mmcblk0p2: UUID="752b30df-8e99-45d2-b8fa-8924a53c9d2b" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="7d09ecf4-02"

I checked the journal and found some warnings:

Apr 18 17:57:03 raspberrypi-bookworm64-desktop-cm4 systemd-sysctl[285]: Couldn't write '1' to 'kernel/unprivileged_userns_clone', ignoring: No such file or directory
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 (udev-worker)[304]: event1: Process '/usr/sbin/th-cmd --socket /var/run/thd.socket --passfd --udev' failed with exit code 1.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 (udev-worker)[297]: event0: Process '/usr/sbin/th-cmd --socket /var/run/thd.socket --passfd --udev' failed with exit code 1.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: vc_sm_cma: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: snd_bcm2835: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: bcm2835_mmal_vchiq: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: bcm2835_mmal_vchiq: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: bcm2835_codec: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: bcm2835_v4l2: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:04 raspberrypi-bookworm64-desktop-cm4 kernel: bcm2835_isp: module is from the staging directory, the quality is unknown, you have been warned.
Apr 18 17:57:05 raspberrypi-bookworm64-desktop-cm4 kernel: rpivid_hevc: module is from the staging directory, the quality is unknown, you have been warned.

There are some errors related to /dev/video but I don't think they cause the issue.

I noticed the rpi-eeprom-update output looks odd: Latest date is older than current date. But the current NVMe system bootloader date is identical to the bootloader date of the eMMC system.

pi@raspberrypi-bookworm64-desktop-cm4:~ $ rpi-eeprom-update 
BOOTLOADER: up to date
   CURRENT: Mon Jan 22 10:41:21 AM UTC 2024 (1705920081)
    LATEST: Wed Jan 11 05:40:52 PM UTC 2023 (1673458852)
   RELEASE: default (/lib/firmware/raspberrypi/bootloader-2711/default)
            Use raspi-config to change the release.

  VL805_FW: Using bootloader EEPROM
     VL805: version unknown. Try sudo rpi-eeprom-update
   CURRENT: 
    LATEST: 

Please see the attached file for the information you asked for in addition to other information I think may be helpful for you:

[issue241_overlayfs_info1.txt](https://github.com/RPi-Distro/raspi-config/files/15027501/issue241_overlayfs_info1.txt)
[issue241_overlayfs_info1.txt](https://github.com/RPi-Distro/raspi-config/files/15027505/issue241_overlayfs_info1.txt)

Please let me know if you need any additional information.

XECDesign commented 4 months ago

I'm not seeing anything unexpected that would prevent it from working.

Another thing to check - is overlayroot in the initramfs file?

lsinitramfs /boot/firmware/initramfs8 | grep overlayroot

If you see scripts/init-bottom/overlayroot then that's fine.

The next step would be to enable initramfs debug by adding debug to the end of /boot/firmware/cmdline.txt.

It looks like the read-only boot partition option worked, so you'll need to remount it as read-write with sudo mount -o remount,rw /boot/firmware first.

After rebooting, there should be some log files - /run/initramfs/overlayroot.log and /run/initramfs/initramfs.debug. Hopefully they should tell us exactly what's going wrong.
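The debug steps above could be sketched like this. `append_cmdline_flag` is a hypothetical helper. It assumes cmdline.txt is a single line, GNU sed is available (as on Raspberry Pi OS), and the flag contains no `/` characters.

```shell
# Append a flag to a one-line kernel command line file unless it is
# already present. Hypothetical helper; $1 = file, $2 = flag.
append_cmdline_flag() {
    file="$1"
    flag="$2"
    # Split the line into words and look for an exact match
    if tr ' ' '\n' < "$file" | grep -qxF -- "$flag"; then
        return 0
    fi
    # Append to line 1 only, replacing any trailing whitespace
    sed -i "1s/[[:space:]]*\$/ $flag/" "$file"
}

# Possible usage (needs root; the boot partition must be writable first):
#   sudo mount -o remount,rw /boot/firmware
#   append_cmdline_flag /boot/firmware/cmdline.txt debug
#   sudo reboot
#   # after the reboot:
#   cat /run/initramfs/overlayroot.log /run/initramfs/initramfs.debug
```

The helper is idempotent, so running it twice does not add `debug` a second time.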

framps commented 4 months ago

"If you see scripts/init-bottom/overlayroot then that's fine."

On the NVMe system I get

pi@raspberrypi-bookworm64-desktop-cm4:~ $ lsinitramfs /boot/firmware/initramfs8 | grep overlayroot
pi@raspberrypi-bookworm64-desktop-cm4:~ $ 

On the eMMC system I get

pi@raspberrypi-bookworm64-lite-cm4:~ $ lsinitramfs /boot/firmware/initramfs8 | grep overlayroot
scripts/init-bottom/overlayroot

So you're on the right track. Do you know why it's missing on the NVMe system? Not sure whether you noticed the eMMC system is a lite os whereas the NVMe system is a desktop os.

XECDesign commented 4 months ago

Does it start working if you run the following?

sudo mount -o remount,rw /boot/firmware
sudo apt install --reinstall overlayroot -y
sudo reboot
framps commented 4 months ago
pi@raspberrypi-bookworm64-desktop-cm4:~ $ sudo mount -o remount,rw /boot/firmware
mount: (hint) your fstab has been modified, but systemd still uses
       the old version; use 'systemctl daemon-reload' to reload.
pi@raspberrypi-bookworm64-desktop-cm4:~ $ sudo apt install --reinstall overlayroot -y
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 17 not upgraded.
Need to get 0 B/15.4 kB of archives.
After this operation, 0 B of additional disk space will be used.
(Reading database ... 130689 files and directories currently installed.)
Preparing to unpack .../overlayroot_0.18.debian13_all.deb ...
Unpacking overlayroot (0.18.debian13) over (0.18.debian13) ...
Setting up overlayroot (0.18.debian13) ...
Processing triggers for man-db (2.11.2-2) ...
Processing triggers for initramfs-tools (0.142) ...
Scanning processes...                                                                                            
Scanning processor microcode...                                                                                  
Scanning linux images...                                                                                         

Running kernel seems to be up-to-date.

The processor microcode seems to be up-to-date.

No services need to be restarted.

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.

... rebooted system ...

pi@raspberrypi-bookworm64-desktop-cm4:~ $ lsinitramfs /boot/firmware/initramfs8 | grep overlayroot

Looks like the overlayroot is not added to the initramfs.

framps commented 4 months ago

I just detected

pi@raspberrypi-bookworm64-desktop-cm4:~ $ sudo apt-get upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  libcamera-ipa libcamera-tools libpipewire-0.3-0 libpipewire-0.3-modules libspa-0.2-bluetooth
  libspa-0.2-modules linux-headers-rpi-2712 linux-headers-rpi-v8 linux-image-rpi-2712 linux-image-rpi-v8
  pipewire pipewire-bin pipewire-libcamera pipewire-pulse python3-libcamera rpi-eeprom rpicam-apps
0 upgraded, 0 newly installed, 0 to remove and 17 not upgraded.

I have no clue why these packages are kept back.

XECDesign commented 4 months ago

Try full-upgrade instead of just upgrade. (Make sure you run that mount command first)

And what does apt policy raspi-firmware say?

Edit: And cat /etc/default/raspi-firmware?

framps commented 4 months ago
pi@raspberrypi-bookworm64-desktop-cm4:~ $ mount | grep nvme
/dev/nvme0n1p2 on /media/root-ro type ext4 (ro,relatime)
/dev/nvme0n1p1 on /boot/firmware type vfat (ro,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)

Now the overlayroot is included in the initramfs and I can enable the overlayfs on the NVMe system :+1:

pi@raspberrypi-bookworm64-desktop-cm4:~ $ sudo apt policy raspi-firmware
raspi-firmware:
  Installed: 1:1.20240306+ds-1+rpt1
  Candidate: 1:1.20240306+ds-1+rpt1
  Version table:
 *** 1:1.20240306+ds-1+rpt1 500
        500 http://archive.raspberrypi.com/debian bookworm/main arm64 Packages
        500 http://archive.raspberrypi.com/debian bookworm/main armhf Packages
        100 /var/lib/dpkg/status
     1.20220830+ds-1 500
        500 http://deb.debian.org/debian bookworm/non-free-firmware arm64 Packages
        500 http://deb.debian.org/debian bookworm/non-free-firmware armhf Packages

and

pi@raspberrypi-bookworm64-desktop-cm4:~ $ cat /etc/default/raspi-firmware 
# If set to 'auto' (default), raspi-firmware will automatically copy
# and rename supported kernels into /boot/firmware
#
#KERNEL=auto

# If set to 'auto' (default), raspi-firmware will automatically copy
# and rename supported initramfs images into /boot/firmware
#
#INITRAMFS=auto

# If set to yes (NOT default), supported kernels will not trigger
# initramfs updates on install and upgrade. If this is the desired
# behaviour, you may also wish to set update_initramfs=no in
# /etc/initramfs-tools/update-initramfs.conf
#
#SKIP_INITRAMFS_GEN=no

For me the issue is solved. Thank you very much for your help :+1:

If you want to know why the overlayroot failed to be added to the initramfs I can revert the NVMe system and can execute any commands for you to locate the root cause. I installed the desktop image available for download from the official website and thus don't understand why this happened.

Otherwise I'll close this issue.

XECDesign commented 4 months ago

If you can reproduce the problem starting from a clean install with the exact list of commands you ran to get into that state, that would be helpful.

It looks like NVMe vs eMMC wasn't the issue, but the steps taken post-install were somehow different. The main issue seems to be that raspi-firmware's initramfs hook wasn't running, but I have no idea how to get to that state.

framps commented 4 months ago

Ok. I will install a fresh desktop os on NVMe and a fresh lite os on eMMC and check. If I can enable overlayroot on NVMe and eMMC I'll close this issue.

If not I'll come back :smile:

framps commented 4 months ago

... managed to mount eMMC and NVMe with mass-storage-gadget again ...

framps commented 4 months ago

I installed lite on eMMC and desktop on NVMe and get

pi@raspberrypi:~ $ lsinitramfs /boot/firmware/initramfs8 | grep overlayroot
pi@raspberrypi:~ $ 

on the NVMe system.

Looks like there is some issue with the desktop image. The image I use is 2024-03-15-raspios-bookworm-arm64.img

framps commented 4 months ago

Today I set up the CM4 again from scratch and documented the steps I executed in detail. I hope you're now able to reproduce the issue on your system.

Images used:

-rw-r--r--  1 framp framp  5951717376 Apr 18 22:15  2024-03-15-raspios-bookworm-arm64.img
-rw-r--r--  1 framp framp  2768240640 Mar 21 22:48  2024-03-15-raspios-bookworm-arm64-lite.img

Lite -> eMMC
Desktop -> NVMe

Used mass-storage-gadget64 on a Linux system to access eMMC and NVMe, and dd to copy the two images to them.

Code level: c5d89c2 - (3 days ago) bootfiles: Use the bootfiles loader for mass-storage gadget images - Tim Gover

sudo dd if=2024-03-15-raspios-bookworm-arm64.img of=/dev/sdd bs=2MiB
sudo dd if=2024-03-15-raspios-bookworm-arm64-lite.img of=/dev/sdc bs=2MiB

Set boot order with usbboot

[all]
BOOT_UART=0
WAKE_ON_GPIO=1
POWER_OFF_ON_HALT=0

# Boot Order Codes, from https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#BOOT_ORDER
# Try SD first (1), followed by, USB PCIe, NVMe PCIe, USB SoC XHCI then network
BOOT_ORDER=0xf25641

# Set to 0 to prevent bootloader updates from USB/Network boot
# For remote units EEPROM hardware write protection should be used.
ENABLE_SELF_UPDATE=1

Started CM4 which now starts with the eMMC system:

pi@raspberrypi:~ $ lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mmcblk0      179:0    0  29.1G  0 disk 
|-mmcblk0p1  179:1    0   512M  0 part /boot/firmware
`-mmcblk0p2  179:2    0  28.6G  0 part /
mmcblk0boot0 179:32   0     4M  1 disk 
mmcblk0boot1 179:64   0     4M  1 disk 
nvme0n1      259:0    0 119.2G  0 disk 
|-nvme0n1p1  259:1    0   512M  0 part 
`-nvme0n1p2  259:2    0     5G  0 part 
pi@raspberrypi:~ $ blkid
/dev/nvme0n1p1: LABEL_FATBOOT="bootfs" LABEL="bootfs" UUID="50C8-AEAE" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="617a2abd-01"
/dev/nvme0n1p2: LABEL="rootfs" UUID="fc7a1f9e-4967-4f41-a1f5-1b5927e6c5f9" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="617a2abd-02"
/dev/mmcblk0p1: LABEL_FATBOOT="bootfs" LABEL="bootfs" UUID="44FC-6CF2" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="38ddbc66-01"
/dev/mmcblk0p2: LABEL="rootfs" UUID="93c89e92-8f2e-4522-ad32-68faed883d2f" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="38ddbc66-02"
pi@raspberrypi:~ $ mount | grep mmc
/dev/mmcblk0p2 on / type ext4 (rw,noatime)
/dev/mmcblk0p1 on /boot/firmware type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)

Changed boot order to boot from NVMe with usbboot

[all]
BOOT_UART=0
WAKE_ON_GPIO=1
POWER_OFF_ON_HALT=0

# Boot Order Codes, from https://www.raspberrypi.com/documentation/computers/raspberry-pi.html#BOOT_ORDER
# Try SD first (1), followed by, USB PCIe, NVMe PCIe, USB SoC XHCI then network
BOOT_ORDER=0xf25416

# Set to 0 to prevent bootloader updates from USB/Network boot
# For remote units EEPROM hardware write protection should be used.
ENABLE_SELF_UPDATE=1
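For readers comparing the two BOOT_ORDER values in this comment: the bootloader reads the hex digits from right to left. A minimal decoder sketch, using the mode names from the Raspberry Pi bootloader documentation as I understand them (`decode_boot_order` is a hypothetical name):

```shell
# Print the boot modes encoded in a BOOT_ORDER value in the order the
# bootloader tries them (least significant hex digit first).
decode_boot_order() {
    hex="${1#0x}"
    while [ -n "$hex" ]; do
        rest="${hex%?}"          # everything except the last digit
        digit="${hex#"$rest"}"   # the last digit
        case "$digit" in
            1) echo "SD-CARD (eMMC on a CM4)" ;;
            2) echo "NETWORK" ;;
            3) echo "RPIBOOT" ;;
            4) echo "USB-MSD" ;;
            5) echo "BCM-USB-MSD" ;;
            6) echo "NVME" ;;
            7) echo "HTTP" ;;
            e|E) echo "STOP" ;;
            f|F) echo "RESTART" ;;
            *) echo "unknown ($digit)" ;;
        esac
        hex="$rest"
    done
}

# decode_boot_order 0xf25641   # first entry: SD-CARD (eMMC on a CM4)
# decode_boot_order 0xf25416   # first entry: NVME
```

Read this way, 0xf25641 tries eMMC first and 0xf25416 tries NVMe first, which matches the systems that booted in each case. The "Try SD first" comment line was presumably carried over unchanged into the second config.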

Started CM4 which now starts with the NVMe system:


pi@raspberrypi:~ $ lsblk
NAME         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mmcblk0      179:0    0  29.1G  0 disk 
├─mmcblk0p1  179:1    0   512M  0 part 
└─mmcblk0p2  179:2    0  28.6G  0 part 
mmcblk0boot0 179:32   0     4M  1 disk 
mmcblk0boot1 179:64   0     4M  1 disk 
nvme0n1      259:0    0 119.2G  0 disk 
├─nvme0n1p1  259:1    0   512M  0 part /boot/firmware
└─nvme0n1p2  259:2    0 118.7G  0 part /
pi@raspberrypi:~ $ blkid
/dev/nvme0n1p1: LABEL_FATBOOT="bootfs" LABEL="bootfs" UUID="50C8-AEAE" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="7badae65-01"
/dev/nvme0n1p2: LABEL="rootfs" UUID="fc7a1f9e-4967-4f41-a1f5-1b5927e6c5f9" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="7badae65-02"
/dev/mmcblk0p1: LABEL_FATBOOT="bootfs" LABEL="bootfs" UUID="44FC-6CF2" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="38ddbc66-01"
/dev/mmcblk0p2: LABEL="rootfs" UUID="93c89e92-8f2e-4522-ad32-68faed883d2f" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="38ddbc66-02"
pi@raspberrypi:~ $ mount | grep nvme
/dev/nvme0n1p2 on / type ext4 (rw,noatime)
/dev/nvme0n1p1 on /boot/firmware type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)
pi@raspberrypi:~ $ lsinitramfs /boot/firmware/initramfs8 | grep overlayroot
pi@raspberrypi:~ $ 
XECDesign commented 4 months ago

Thanks for that. I'll take a closer look again when I get a chance.

XECDesign commented 4 months ago

In your instructions, I don't see where you've enabled the overlay. I wouldn't expect the overlayroot file to be in initramfs before the overlayroot package is installed (which raspi-config does when you enable that option).
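A quick way to tell the two situations apart (package not installed yet versus hook missing from the initramfs) might look like this. `has_overlayroot_hook` is a hypothetical helper that reads an `lsinitramfs` listing from stdin. The hook path is the one quoted earlier in this thread.

```shell
# Succeed if the listing on stdin contains the overlayroot
# init-bottom script. Hypothetical helper.
has_overlayroot_hook() {
    grep -qx 'scripts/init-bottom/overlayroot'
}

# Possible usage:
#   dpkg-query -W -f='${Status}\n' overlayroot   # is the package installed?
#   lsinitramfs /boot/firmware/initramfs8 | has_overlayroot_hook \
#       && echo "hook present" || echo "hook missing"
```

If the package is not installed, a missing hook is expected. If the package is installed and the hook is still missing, the initramfs was probably not regenerated.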

framps commented 4 months ago

That's true. I expected the overlayrootfs to be included in the initramfs all the time. I didn't know the initramfs is rebuilt when the overlayrootfs is enabled.

So I have now enabled the overlayrootfs with raspi-config, saw messages that the initramfs was rebuilt, and the overlayrootfs was enabled successfully :+1:

Unfortunately I forgot to enable the overlayfs in my latest test. But I'm sure I had enabled the overlayrootfs when I created this issue.

This time I enabled only the overlayfs in raspi-config, without the boot partition protection I had also enabled in the past. Unfortunately the initramfs is not rebuilt when I turn off overlayfs. Not sure whether this makes the difference. I will now start from scratch, set up both systems again and try once more to enable the overlayrootfs.

XECDesign commented 4 months ago

"Unfortunately the initramfs is not rebuilt when I turn off overlayfs. Not sure whether this makes the difference."

Nope, that's what I'd expect. The overlayroot package is not removed when overlayfs is disabled. It's toggled through cmdline.txt, so there's no need for the initramfs to be rebuilt more than once.
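Since the toggle lives on the kernel command line, the current setting can be read back from there. The sketch below assumes the switch is an `overlayroot=<mode>` token (for example `overlayroot=tmpfs`). That token name is my assumption and is not confirmed in this thread. `overlayroot_mode` is a hypothetical helper.

```shell
# Print the value of the first overlayroot=<mode> token in a kernel
# command line string ($1), or "unset" if there is none.
overlayroot_mode() {
    # Intentionally unquoted: split the command line into words
    for arg in $1; do
        case "$arg" in
            overlayroot=*) echo "${arg#overlayroot=}"; return 0 ;;
        esac
    done
    echo "unset"
}

# Possible usage:
#   overlayroot_mode "$(cat /proc/cmdline)"
#   findmnt -n -o FSTYPE /    # reports "overlay" when the overlay is active
```

Comparing the two results separates "overlay not requested" from "overlay requested but not applied", which was the symptom originally reported.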

framps commented 4 months ago

NP. But if overlayroot had been removed from the initramfs when the overlayfs was disabled, I could have checked right away whether enabling the overlayfs and the read-only boot partition at the same time causes the issue. So now I have to start from scratch again. ... Just doing that :wink:

framps commented 4 months ago
pi@raspberrypi:~ $ sudo mount | grep nvme
/dev/nvme0n1p2 on /media/root-ro type ext4 (ro,relatime)
/dev/nvme0n1p1 on /boot/firmware type vfat (ro,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)

Everything is fine now. I frankly don't understand why it works now - but I'm glad it works now :smile: .

I remember the OSes I had installed on my CM4 were not current and I updated to the latest available code before I tried to reproduce the issue again.

Sorry that you had to waste your precious time with the issue :cry:

XECDesign commented 4 months ago

No worries. Glad it's all working in the end.