johang / sd-card-images

Scripts to build bootable SD card images with Debian for various single-board computers
https://sd-card-images.johang.se
GNU General Public License v3.0

Reboot fails on Ubuntu noble and oracular #155

Open LupeChristoph opened 3 days ago

LupeChristoph commented 3 days ago

I'm using a NanoPi R4S which, alas, has no serial console, so I can't really debug this.

Status of the Ubuntu images:

- jammy: reboots OK
- noble: hangs with the power LED on, no network
- oracular: same as noble
- plucky: same

Incidentally, Debian bookworm is OK. trixie brings up both Ethernet interfaces but is not reachable (nmap says "All 1000 scanned ports on 172.17.0.129 are in ignored states." on both Ethernet interfaces). I gave up after that, but I will open a ticket for this.

OpenWRT from their download is OK with 23.03.3 and .4.

LupeChristoph commented 2 days ago

The problem with trixie was due to a bad microSD card.

Reboot failures with Debian, using a good microSD card:

- bookworm: works OK
- trixie: fails, SYS LED stays off
- sid: same as trixie
- rc-buggy: same as trixie

The strange thing is that trixie, sid and rc-buggy are not accessible with SSH after the first boot of an image. After the second or third boot, SSH is available. As with reboot, bookworm is OK.
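
To put a number on "SSH is available after the second or third boot", one option is to poll the SSH port after each boot and record how long it takes to open. The following is only a minimal sketch: the function name `wait_for_port` and its default timings are assumptions made for illustration, not something taken from these images.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300.0, interval=5.0):
    """Poll host:port until a TCP connection succeeds.

    Returns the seconds elapsed until the port accepted a connection,
    or None if it never opened within `timeout` seconds.
    """
    start = time.monotonic()
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # Refused, unreachable, or timed out: keep polling.
            pass
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Calling `wait_for_port("172.17.0.129")` right after power-on would then give either a time-to-SSH for that boot or `None`. It only shows that the port accepts TCP connections, not that a login would succeed.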

LupeChristoph commented 2 days ago

I've acquired a USB-to-serial adapter and dismantled the R4S to connect it. I can now see the console output, and I found the problem.

[  OK  ] Reached target reboot.target - System Reboot.
[  117.632275] reboot: Restarting system

U-Boot TPL 2024.10johang-dirty (Nov 01 2024 - 02:10:40)
lpddr4_set_rate: change freq to 400MHz 0, 1
Channel 0: LPDDR4, 400MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB
Channel 1: LPDDR4, 400MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB
256B stride
lpddr4_set_rate: change freq to 800MHz 1, 0
Trying to boot from BOOTROM
Returning to boot ROM...

U-Boot SPL 2024.10johang-dirty (Nov 01 2024 - 02:10:40 +0000)
Trying to boot from MMC2
mmc_load_image_raw_sector: mmc block read error
Trying to boot from MMC2
mmc_load_image_raw_sector: mmc block read error
SPL: failed to boot from all boot devices
### ERROR ### Please RESET the board ###

Nothing after that. I doubt the Linux kernel can do anything about that :-( But I wonder why jammy has no problem with that. I'll run more tests to see if this happens randomly.

I ran ten reboots after the failed reboot and they all ran without a hitch. Next up - power off, then power on, then reboot. I'll probably do that tomorrow.

Update: I cut a new microSD card with oracular and ran a few start/reboot/poweroff cycles. Only after the first start did the reboot hang with the messages I pasted above. All other reboots went without a hitch.
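
For longer runs of start/reboot/poweroff cycles, a capture of the serial console can be checked automatically instead of by eye. This is a rough sketch that assumes the whole session was logged to a text file. It splits the capture at each "U-Boot TPL" banner and flags the attempts that contain the SPL error messages pasted above. The function name `classify_boots` is hypothetical.

```python
import re

# Markers copied from the console output quoted in this thread.
BOOT_BANNER = re.compile(r"^U-Boot TPL ", re.MULTILINE)
FAILURE_MARKERS = (
    "mmc block read error",
    "SPL: failed to boot from all boot devices",
)


def classify_boots(log_text):
    """Split a console capture at each U-Boot TPL banner and flag failed boots.

    Returns a list of booleans, one per boot attempt; True means the SPL
    failure signature was seen in that attempt. Text before the first
    banner is ignored.
    """
    starts = [m.start() for m in BOOT_BANNER.finditer(log_text)]
    results = []
    for i, start in enumerate(starts):
        # Each attempt runs up to the next banner, or to the end of the log.
        end = starts[i + 1] if i + 1 < len(starts) else len(log_text)
        chunk = log_text[start:end]
        results.append(any(marker in chunk for marker in FAILURE_MARKERS))
    return results
```

With a capture saved as, say, `console.log` (a hypothetical file name), `classify_boots(open("console.log", errors="replace").read())` returns one entry per boot attempt, which makes it easy to see whether only the first reboot after flashing fails.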

So what is happening in the first boot that triggers the hang? Or is it something the first boot configures that is activated in the second boot? I have no idea how to investigate further, there are just too many possibilities. I don't think the hardware has a problem.

So what is different between jammy and all subsequent releases?!?

I'll open a ticket on Launchpad for this as well, but it may be a problem with the way you build the images.

LupeChristoph commented 1 day ago

Launchpad ticket #2090788: On a NanoPi R4S, oracular fails the first reboot

LupeChristoph commented 1 day ago

I just checked if a normal power down would also fix the reboot problem, and indeed it does. After a power down, the reboot works.

LupeChristoph commented 1 day ago

I just checked my old copy of oracular and that is permanently stuck with the reboot problem. Strange. I'll keep this card for further investigations and bring a new copy of oracular up. Fortunately I wasn't that far into setting up...

johang commented 1 day ago

Can you please post a full log of a reboot when it succeeds?

LupeChristoph commented 1 day ago

Here it is: wart-reboot.log

LupeChristoph commented 7 hours ago

Launchpad ticket was closed as invalid. I did not expect much from it, but I tried anyway.

The NanoPi is not officially supported on Ubuntu, as you are using a custom build. I would advise to report the bug to your image distributor. Marking as invalid. Since the error is mmc_load_image_raw_sector: mmc block read error from uboot it might be related to uboot anyways.