inindev / odroid-m1

debian arm64 linux for the odroid m1
GNU General Public License v3.0

Debian refuses to boot from SSD after longer power off #5

Closed mkkot closed 1 year ago

mkkot commented 1 year ago

I went on vacation. When I turned my M1 on again after coming back, it didn't boot. With HDMI connected, all I can see is a "_" cursor that sometimes blinks and sometimes doesn't. The board itself and the SPI flash are most likely not broken, as it still boots from an SD card. The RTC battery was inserted right after I bought the board (which was only a month or two ago).

Found this thread: https://forum.odroid.com/viewtopic.php?f=216&t=46130

They suggest adding pcie_aspm=off to the kernel command line parameters. I struggled a bit but finally found this: https://github.com/inindev/odroid-m1/blob/main/debian/make_debian_img.sh#L138

So I booted a system from the SD card, mounted the NVMe drive at chroot-debian, and ran:

mount -o bind /dev chroot-debian/dev
mount -o bind /proc chroot-debian/proc
chroot chroot-debian /bin/bash

Then I changed the kernel parameters in /boot/boot.txt and ran the command from the comment at the top of the file (I think it was /boot/mkscr.sh?).

The result is the same: a "_" that blinks or doesn't, depending on how it decides to boot this time. The blue LED is blinking, the network interface is blinking, and I can hardly see any activity on the NVMe's green LED.

I have seen this behavior before, but back then simply restarting the board fixed it. Now it looks quite dead. Any ideas?

inindev commented 1 year ago

Can you put a serial console on it to see what the error is? If you have petitboot on the spi flash and it boots from MMC when you hold the mask button, I would guess it is some issue with petitboot.

mkkot commented 1 year ago

Hey @inindev, thanks for your input!

Unfortunately, I don't have the hardware for a serial console. I guess this is what I need? https://wiki.odroid.com/accessory/development/usb_uart_kit

Anyway, I came up with the idea of chrooting again and running journalctl to see whether it booted at all. And guess what, it did! I can see the network being configured and some silly message about Plymouth not being started. After that the time gets synced from an NTP server, so most likely this is not critical. The machine responds to ping (ssh broke for an unknown reason just before I went on vacation, so I can't check that one). Still, that is super weird, as I should see the login prompt over HDMI. Will keep investigating.

BTW, using this: https://github.com/inindev/odroid-m1/tree/main/uboot#booting-from-spi-nor-flash

So no petitboot.
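
The journalctl check described above can probably also be done without entering the chroot, by pointing journalctl at the journal directory on the mounted disk. This is only a sketch: it assumes the installed system keeps a persistent journal under /var/log/journal, and the function name show_offline_journal is made up for illustration.

```shell
#!/bin/sh
# Sketch: inspect the journal of a root filesystem that is mounted but not booted.
# show_offline_journal is a hypothetical helper; ROOT is wherever the disk is mounted.
show_offline_journal() {
    root=$1
    journal_dir="$root/var/log/journal"

    # Without a persistent journal there is nothing to read from the disk.
    if [ ! -d "$journal_dir" ]; then
        echo "no persistent journal under $journal_dir" >&2
        return 1
    fi

    # Show which boots were recorded, then the most recent entries of warning priority or worse.
    journalctl -D "$journal_dir" --list-boots --no-pager
    journalctl -D "$journal_dir" -p warning -n 100 --no-pager
}
```

With the NVMe root mounted as in the first post, this would be called as `show_offline_journal chroot-debian`. If the boot list shows an entry from after the power-off, the system did boot, and the problem is somewhere later than the bootloader.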

inindev commented 1 year ago

My favorite serial adapter is this one: https://www.amazon.com/gp/product/B09W2B61HW (it has a real FTDI chip, so it is very reliable).

For the HDMI console issue, add console=tty1 to the command line in boot.txt, then run ./mkscr.sh.

before:

setenv bootargs console=ttyS2,1500000 root=PARTUUID=${uuid} rw rootwait ipv6.disable=1 earlycon=uart8250,mmio32,0xfe660000

after:

setenv bootargs console=tty1 console=ttyS2,1500000 root=PARTUUID=${uuid} rw rootwait ipv6.disable=1 earlycon=uart8250,mmio32,0xfe660000
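
The before/after edit above could also be scripted so that it is safe to run more than once. A minimal sketch, assuming boot.txt contains a single line starting with "setenv bootargs " as shown, and that GNU sed is available (as on Debian); add_bootarg is a hypothetical helper name, not something from this repository.

```shell
#!/bin/sh
# add_bootarg FILE ARG: prepend ARG to the "setenv bootargs" line unless it is already there.
# Hypothetical helper; ARG is used inside a regex, so keep it to plain key=value text.
add_bootarg() {
    file=$1
    arg=$2

    # Do nothing if ARG already appears as a whole word on the bootargs line.
    if grep -Eq "^setenv bootargs( .*)? ${arg}( |\$)" "$file"; then
        return 0
    fi

    # Insert ARG directly after "setenv bootargs"; ${uuid} in the file is left untouched.
    sed -i "s|^setenv bootargs |setenv bootargs ${arg} |" "$file"
}
```

For example, `add_bootarg /boot/boot.txt console=tty1`, followed by ./mkscr.sh to regenerate the boot script. The same helper would work for the pcie_aspm=off parameter mentioned in the first post.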

mkkot commented 1 year ago

I managed to find the issue by reading the journalctl output. It was really dumb. I have a USB drive that I put into /etc/fstab, and I forgot the nofail option. The drive went on vacation with me, and I didn't reconnect it afterwards (thinking of it as a REMOVABLE drive). systemd complained about the missing drive and started an emergency shell (this is where it tries to run Plymouth, which I saw in the logs earlier).

But I never saw any of this and couldn't use the emergency shell, since you changed the tty. That was super confusing, and I thought the system wasn't booting.

Please consider using tty1 for the release version; that would have saved me a few hours today ;)

That said, I'm really happy with my Debian system, and I keep building sshfs file sharing / borg backup / PXE boot / print server and whatnot on it. And I learned something today ;)
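
For anyone hitting the same fstab problem, a quick way to spot entries that can block the boot when a drive is missing might look like the sketch below. It lists mount points whose options contain neither nofail nor noauto. The function name list_blocking_mounts is made up for illustration, and skipping the root filesystem and swap is a choice made here, not a rule.

```shell
#!/bin/sh
# list_blocking_mounts FSTAB: print mount points that lack both "nofail" and "noauto".
# Hypothetical helper; it only reads the file and changes nothing.
list_blocking_mounts() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        $2 == "/" || $3 == "swap" { next }   # root has to mount anyway; swap is ignored here
        {
            n = split($4, opts, ",")
            ok = 0
            for (i = 1; i <= n; i++)
                if (opts[i] == "nofail" || opts[i] == "noauto") ok = 1
            if (!ok) print $2
        }
    ' "$1"
}
```

Running `list_blocking_mounts /etc/fstab` (or `chroot-debian/etc/fstab` from the SD card system) prints one mount point per line. For a removable drive, adding nofail to the fourth field of its entry, for example `defaults,nofail`, lets the boot continue when the drive is absent.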

inindev commented 1 year ago

u-boot was trying to boot a USB device, issue was resolved