anduril / jetpack-nixos

NixOS module for NVIDIA Jetson devices
MIT License

Broken BIOS at Orin AGX after flashing #229

Closed: heshdotcc closed this issue 4 months ago

heshdotcc commented 4 months ago

First, thank you so much for your dedication to bringing Nix/OS to the Jetson ecosystem!

Following the README instructions, I flashed an Orin AGX (successfully, I think). Here is the tail of the output:

[ 453.5772 ] Writing partition A_MEM_BCT with mem_coldboot_sigheader.bct.encrypt [ 243712 bytes ]
[ 453.5776 ] [................................................] 100%
[ 456.5801 ] tegradevflash_v2 --write B_MEM_BCT mem_coldboot_sigheader.bct.encrypt
[ 456.5807 ] Bootloader version 01.00.0000
[ 456.6825 ] Writing partition B_MEM_BCT with mem_coldboot_sigheader.bct.encrypt [ 243712 bytes ]
[ 456.6829 ] [................................................] 100%
[ 459.6847 ] Flashing completed

[ 459.6848 ] Coldbooting the device
[ 459.6856 ] tegrarcm_v2 --chip 0x23 0 --ismb2
[ 459.6861 ] MB2 version 01.00.0000
[ 459.7876 ] Coldbooting the device
[ 459.7884 ] tegrarcm_v2 --chip 0x23 0 --reboot coldboot
[ 459.7889 ] MB2 version 01.00.0000
*** The target t186ref has been flashed successfully. ***
Reset the board to boot from internal eMMC.

Now I can't access the BIOS (by pressing Esc), and the boot no longer shows an NVIDIA logo or any other logo. Is this expected?

Curiously, if I boot from the built minimal ISO, NixOS comes up with working video, peripherals, and Ethernet (I didn't test WiFi).

So, I've successfully installed NixOS, as it's traditionally done, while following README specifics for this SOM:

However, a boot loop after flashing makes the new NixOS installation on the SSD unbootable: it defaults to the live ISO.

I tried to change the boot order through efibootmgr to other options, but the BIOS seems to ignore it completely.

Note: I bought this Jetson from a reseller, so I wonder whether they had flashed it before. It came with Ubuntu.

Any help is greatly appreciated, as I look forward to documenting the process. Thank you!

heshdotcc commented 4 months ago

I just tried to re-flash this Orin AGX, this time using the initrd package, without success:

./result/bin/initrd-flash-orin-agx-devkit

################################
# L4T BSP Information:
# R35 , REVISION: 4.1
###############################################################################
# Target Board Information:
# Name: jetson-agx-orin-devkit, Board Family: t186ref, SoC: Tegra 234,
# OpMode: production, Boot Authentication: NS,
# Disk encryption: disabled ,
###############################################################################
...
[   0.4815 ] Sending bct_br
[   0.7888 ] ERROR: might be timeout in USB write.
Error: Return value 3
Command tegrarcm_v2 --new_session --chip 0x23 0 --uid --download bct_br br_bct_BR.bct --download mb1 mb1_t234_prod_aligned_sigheader.bin.encrypt --download psc_bl1 psc_bl1_t234_prod_aligned_sigheader.bin.encrypt --download bct_mb1 mb1_bct_MB1_sigheader.bct.encrypt
Reading board information failed.

heshdotcc commented 4 months ago

I tried to re-run the initial flash package flash-orin-agx-devkit (the one that worked) and got the same error as above.

Then I switched from the USB-A-to-USB-C cable to a USB-C-to-USB-C cable: that did the trick.
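When comparing a working and a failing run, it can help to reduce each log to its error lines and the final success banner. The following is a minimal sketch, not part of jetpack-nixos or the NVIDIA tooling: the function name `summarize_flash_log` is hypothetical, and the matching rules (lines starting with `ERROR:` or `Error:`, lines ending in `failed.`, and the "has been flashed successfully" banner) are assumptions derived only from the output pasted in this thread.

```python
import re

# Strip the "[ 453.5772 ] " style timestamp prefix seen in the flash output.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\s*\]\s*")

# Assumption: this banner text is taken from the successful run quoted above.
SUCCESS_MARKER = "has been flashed successfully"


def summarize_flash_log(text):
    """Return the error lines and whether the success banner appeared."""
    errors = []
    success = False
    for raw in text.splitlines():
        line = TIMESTAMP.sub("", raw.strip())
        if SUCCESS_MARKER in line:
            success = True
        # Heuristic only: these patterns come from the logs in this issue.
        if line.startswith(("ERROR:", "Error:")) or line.endswith("failed."):
            errors.append(line)
    return {"success": success, "errors": errors}


# Excerpts copied from the two runs in this thread.
FAILED_RUN = """\
[   0.4815 ] Sending bct_br
[   0.7888 ] ERROR: might be timeout in USB write.
Error: Return value 3
Reading board information failed.
"""

GOOD_RUN = """\
[ 459.6847 ] Flashing completed
*** The target t186ref has been flashed successfully. ***
Reset the board to boot from internal eMMC.
"""

failed = summarize_flash_log(FAILED_RUN)
good = summarize_flash_log(GOOD_RUN)
print(failed)
print(good)
```

A run that reports no success banner and a USB write timeout right after "Sending bct_br" points at the host-to-device link rather than at the images being flashed.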

Now the initrd-flash-orin-agx-devkit package seems to have succeeded, and surprisingly fast:

Here's a gist with the complete output of this build. Some of the errors in it may be of interest.

After hooking into the serial console, I see the same Ubuntu 20.04.6 that it came with.

Here's the output of efibootmgr from within Ubuntu. For some strange reason, it decided to boot from the eMMC instead of the SSD:

BootCurrent: 0002
Timeout: 5 seconds
BootOrder: 0001,0002,0000,0003,0004,0005,0006,0007,0008
Boot0000* Enter Setup
Boot0001* UEFI Samsung SSD 970 EVO Plus 250GB
Boot0002* UEFI eMMC Device
Boot0003* UEFI PXEv4 (MAC:REDACTED)
Boot0004* UEFI PXEv6 (MAC:REDACTED)
Boot0005* UEFI HTTPv4 (MAC:REDACTED)
Boot0006* UEFI HTTPv6 (MAC:REDACTED)
Boot0007* BootManagerMenuApp
Boot0008* UEFI Shell

Or maybe that's the default behavior after flashing.
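The mismatch in the output above (BootOrder starts with 0001, but BootCurrent is 0002) can be checked mechanically. Below is a minimal sketch that parses the plain text printed by `efibootmgr` and reports when the firmware booted something other than the first BootOrder entry. The helper names `parse_efibootmgr` and `boot_fallback` are hypothetical, and the parser only assumes the line layout shown in this thread.

```python
import re

# Matches entry lines such as "Boot0001* UEFI Samsung SSD 970 EVO Plus 250GB".
ENTRY = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def parse_efibootmgr(text):
    """Parse plain `efibootmgr` output into current, order, and entries."""
    info = {"current": None, "order": [], "entries": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("BootCurrent:"):
            info["current"] = line.split(":", 1)[1].strip()
        elif line.startswith("BootOrder:"):
            numbers = line.split(":", 1)[1].split(",")
            info["order"] = [n.strip() for n in numbers if n.strip()]
        else:
            match = ENTRY.match(line)
            if match:
                info["entries"][match.group(1)] = {
                    "active": match.group(2) == "*",
                    "label": match.group(3),
                }
    return info


def boot_fallback(info):
    """Return (preferred_label, booted_label) if the first BootOrder entry
    was not the one actually booted, otherwise None."""
    if not info["order"] or info["current"] is None:
        return None
    preferred = info["order"][0]
    if preferred == info["current"]:
        return None

    def label(number):
        return info["entries"].get(number, {}).get("label", "unknown")

    return label(preferred), label(info["current"])


# Output copied from this thread (trimmed to the relevant entries).
SAMPLE = """\
BootCurrent: 0002
Timeout: 5 seconds
BootOrder: 0001,0002,0000,0003,0004,0005,0006,0007,0008
Boot0000* Enter Setup
Boot0001* UEFI Samsung SSD 970 EVO Plus 250GB
Boot0002* UEFI eMMC Device
"""

info = parse_efibootmgr(SAMPLE)
fallback = boot_fallback(info)
if fallback:
    print(f"Preferred entry {fallback[0]!r} was skipped; booted {fallback[1]!r}")
```

A non-empty result here only shows that the firmware moved on to the next entry; it does not say why the SSD entry failed to boot.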

I'm still unable to access the BIOS menu or see anything besides the kernel boot log from this Ubuntu install...

So I'll try to reformat with ext4 instead of f2fs and then report here.

heshdotcc commented 4 months ago

I've reinstalled NixOS on the SSD using ext4 instead of f2fs and used efibootmgr to set the new boot option.

After that, when the device powers on, I only see a kernel boot log followed by a black screen with no video.

I attached a serial console, but it connected to the Ubuntu install on the eMMC storage.

heshdotcc commented 4 months ago

I've decided to close this issue and start a more concise one here: https://github.com/anduril/jetpack-nixos/issues/230

I hope these comments help anyone going through the same thing. If not, please feel free to delete this issue. Thanks!