felixonmars / archriscv-packages

Modified Arch Linux packages for archriscv
https://archriscv.felixc.at
GNU General Public License v3.0

Can not load fallback initramfs on nezha #2680

Closed inochisa closed 2 months ago

inochisa commented 1 year ago

I am trying to set up the latest linux package on my Nezha board. Because I set up the rootfs on x86, I use the fallback initramfs to boot the system, but u-boot seems to fail to allocate memory for it.

Is there something I missed?

The u-boot I use is from here, with mainline OpenSBI.

Boot log:

Hit any key to stop autoboot:  0 
=> setenv kernel_comp_addr_r 0x50000000
=> setenv kernel_comp_size   0x04000000
=> setenv fdt_addr_r    0x43000000
=> setenv kernel_addr_r 0x41000000
=> setenv ramdisk_addr_r 0x44000000
=> 
=> run distro_bootcmd
PLL reg = 0xf8216300, freq = 1200000000
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found /extlinux/extlinux.conf
Retrieving file: /extlinux/extlinux.conf
2:      Arch-fallback
Retrieving file: /initramfs-linux-fallback.img
Retrieving file: /vmlinuz-linux
append: console=ttyS0,115200 console=sbi ignore_loglevel rw root=UUID=604bf002-e34b-4fb1-9020-4263bbc36ac9 rootwait
Retrieving file: /dtbs/allwinner/sun20i-d1-nezha.dtb
   Uncompressing Kernel Image
Moving Image from 0x41000000 to 0x40200000, end=41ea9000
## Flattened Device Tree blob at 43000000
   Booting using the fdt blob at 0x43000000
Working FDT set to 43000000
ERROR: Failed to allocate 0x4a5ab77 bytes below 0x42e00000.
ramdisk - allocation error

extlinux.conf

default Arch-fallback

label Arch-fallback
    linux /vmlinuz-linux
    initrd /initramfs-linux-fallback.img
    devicetree /dtbs/allwinner/sun20i-d1-nezha.dtb
    append console=ttyS0,115200 console=sbi ignore_loglevel rw root=UUID=604bf002-e34b-4fb1-9020-4263bbc36ac9 rootwait

CoelacanthusHex commented 1 year ago

I think the address space allocated to the ramdisk is not large enough to hold it. @felixonmars booted into the kernel using the configs below, but failed to use the initrd; we are still trying to figure out why.

kernel_comp_addr_r=0x44000000
kernel_comp_size=0xa000000
kernel_addr_r=0x41000000

Edited: Set just these configs and leave the others unset so that they use their default values.
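
As a rough check of the "not enough space" reading, the numbers from the first boot log can be compared directly. This is only a sketch: it assumes the space between the end of the relocated kernel and the 0x42e00000 limit reported by u-boot is the only place the ramdisk could go, which the log does not confirm.

```python
# Values copied from the first boot log in this thread.
KERNEL_END = 0x41EA9000    # "Moving Image ... end=41ea9000"
ALLOC_LIMIT = 0x42E00000   # "Failed to allocate ... below 0x42e00000"
RAMDISK_SIZE = 0x4A5AB77   # "Failed to allocate 0x4a5ab77 bytes"

MIB = 1024 * 1024


def free_between(kernel_end: int, limit: int) -> int:
    """Bytes between the end of the relocated kernel and the allocation limit."""
    return max(0, limit - kernel_end)


def fits(size: int, kernel_end: int, limit: int) -> bool:
    """True if `size` bytes fit in that window (assumption: no other usable space)."""
    return size <= free_between(kernel_end, limit)


gap = free_between(KERNEL_END, ALLOC_LIMIT)
print(f"ramdisk size : {RAMDISK_SIZE / MIB:.1f} MiB")
print(f"free window  : {gap / MIB:.1f} MiB")
print("fits" if fits(RAMDISK_SIZE, KERNEL_END, ALLOC_LIMIT) else "does not fit")
```

Under that assumption the fallback initramfs is several times larger than the window left above the kernel, which would be consistent with the allocation error.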

inochisa commented 1 year ago

> I think the address space allocated to the ramdisk is not large enough to hold it. @felixonmars booted into the kernel using the configs below, but failed to use the initrd; we are still trying to figure out why.

I am not sure why this occurs, since I already enlarged the ramfs size. Maybe an address is hardcoded?

> kernel_comp_addr_r=0x44000000
> kernel_comp_size=0xa000000
> kernel_addr_r=0x41000000
>
> Edited: Set just these configs and leave the others unset so that they use their default values.

OK, I will try this.

inochisa commented 1 year ago

Results:

It does not boot; maybe there is no built-in SD driver?

kernel_comp_addr_r=0x44000000
kernel_comp_size=0xa000000
kernel_addr_r=0x41000000

Using only these settings gives me an FDT-not-found error.

=> run distro_bootcmd
PLL reg = 0xf8216300, freq = 1200000000
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found /extlinux/extlinux.conf
Retrieving file: /extlinux/extlinux.conf
2:      Arch-fallback
Retrieving file: /vmlinuz-linux
append: console=ttyS0,115200 console=sbi ignore_loglevel rw root=UUID=604bf002-e34b-4fb1-9020-4263bbc36ac9 rootwait
Retrieving file: /dtbs/allwinner/sun20i-d1-nezha.dtb
   Uncompressing Kernel Image
Moving Image from 0x41000000 to 0x40200000, end=41ea9000
ERROR: Did not find a cmdline Flattened Device Tree
Could not find a valid device tree

If I also set fdt_addr_r, the boot gets stuck at "Starting kernel ...". It seems the rootfs is not found.

=> setenv kernel_comp_addr_r 0x44000000
=> setenv kernel_comp_size   0x0a000000
=> setenv kernel_addr_r      0x41000000
=> setenv fdt_addr_r         0x43000000
=> run distro_bootcmd
PLL reg = 0xf8216300, freq = 1200000000
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found /extlinux/extlinux.conf
Retrieving file: /extlinux/extlinux.conf
2:      Arch-fallback
Retrieving file: /vmlinuz-linux
append: console=ttyS0,115200 console=sbi ignore_loglevel rw root=UUID=604bf002-e34b-4fb1-9020-4263bbc36ac9 rootwait
Retrieving file: /dtbs/allwinner/sun20i-d1-nezha.dtb
   Uncompressing Kernel Image
Moving Image from 0x41000000 to 0x40200000, end=41ea9000
## Flattened Device Tree blob at 43000000
   Booting using the fdt blob at 0x43000000
Working FDT set to 43000000
   Loading Device Tree to 0000000042df8000, end 0000000042dff63c ... OK
Working FDT set to 42df8000

Starting kernel ...

extlinux.conf

default Arch-fallback

label Arch-fallback
    linux /vmlinuz-linux
    devicetree /dtbs/allwinner/sun20i-d1-nezha.dtb
    append console=ttyS0,115200 console=sbi ignore_loglevel rw root=UUID=604bf002-e34b-4fb1-9020-4263bbc36ac9 rootwait

inochisa commented 1 year ago

@CoelacanthusHex

Now it boots into the kernel, but not into the rootfs. Here is the boot.log

Maybe CONFIG_MMC_SUNXI needs to be set to Y (currently M) to get built-in MMC support?
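
A small sketch of how the built-in versus module question could be checked from a kernel config file. The sample config text below is hypothetical and only mirrors the "now M" state described above; `option_state` is an illustrative helper, not part of any existing tool.

```python
def option_state(config_text: str, option: str) -> str:
    """Classify a Kconfig option as 'built-in', 'module', or 'disabled'.

    Any other value (for example a string or number) is returned unchanged.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "disabled"
        if line.startswith(option + "="):
            value = line.split("=", 1)[1]
            return {"y": "built-in", "m": "module"}.get(value, value)
    return "disabled"  # an option absent from the file is treated as not set


# Hypothetical excerpt, for illustration only.
sample = """\
CONFIG_MMC=y
CONFIG_MMC_SUNXI=m
# CONFIG_MMC_SPI is not set
"""

print(option_state(sample, "CONFIG_MMC_SUNXI"))  # module
```

If the driver is a module, the root device only becomes reachable once that module (and whatever it depends on) is loaded from the initramfs, so either building it in or making sure the initramfs contains it should have the same effect. On a running system the config text can be read from /proc/config.gz when the kernel was built with CONFIG_IKCONFIG_PROC.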

inochisa commented 1 year ago

@CoelacanthusHex

Any progress?

I have confirmed that the 6.5-rc4 kernel can boot. That kernel is built with plain defconfig.

felixonmars commented 1 year ago

I have tried to debug a bit. The stuck-at-starting-kernel issue is probably indeed a too-large initramfs:

image

But after I tried to update u-boot as suggested, I could never make it past u-boot. I don't really have an idea now and the process is really frustrating. Our kernel has many more options (drivers) enabled in addition to defconfig, so it's not really comparable.

If nezha turns out to need a separate kernel after all, I'm afraid I cannot really support it with the time & resources I have...

hyx0329 commented 1 year ago

I've applied the changes below to u-boot (d1-wip), but I still cannot boot successfully via extlinux.

env set bootm_size        0x0a000000
env set kernel_addr_r     0x42000000
env set fdt_addr_r        0x43000000
env set script_addr_r     0x43100000
env set pxefile_addr_r    0x43200000
env set fdtoverlay_addr_r 0x43300000
env set ramdisk_addr_r    0x43400000

However, EFI boot with GRUB is improving. Originally (without the changes) the boot process got stuck at loader/efi/linux.c:357:linux: Providing initrd via EFI_LOAD_FILE2_PROTOCOL (a GRUB message). After applying those changes, the kernel reaches clk: Disabling unused clocks and then gets stuck.

I added clk_ignore_unused to the cmdline as a workaround, and here is the log I got when booting with the standard initramfs (regenerated with a custom kernel): boot-log-since-grub.txt

When sunxi_mmc is included in the initramfs: boot-log-since-grub-with-sunxi-mmc.txt

Maybe there are bugs to fix in both u-boot and linux.

PS. My hardware is an MQ Pro. The decompressed kernel image is about 31MB.
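
To see how much room the address layout above leaves for each item, the gap to the next address can be computed. This is plain arithmetic on the values from this comment; whether the decompressed image actually ends up at kernel_addr_r depends on the boot path (extlinux or EFI/GRUB), so treat the comparison with the "about 31MB" figure as an assumption rather than a diagnosis.

```python
MIB = 1024 * 1024

# Load addresses from the env settings in this comment.
layout = {
    "kernel_addr_r": 0x42000000,
    "fdt_addr_r": 0x43000000,
    "script_addr_r": 0x43100000,
    "pxefile_addr_r": 0x43200000,
    "fdtoverlay_addr_r": 0x43300000,
    "ramdisk_addr_r": 0x43400000,
}


def room_before_next(addresses: dict) -> dict:
    """Map each name to the number of bytes before the next-higher address."""
    ordered = sorted(addresses.items(), key=lambda item: item[1])
    return {
        name: next_addr - addr
        for (name, addr), (_, next_addr) in zip(ordered, ordered[1:])
    }


room = room_before_next(layout)
for name, size in room.items():
    print(f"{name:18} {size / MIB:5.1f} MiB before the next address")

# Assumption: a kernel image of roughly 31 MB placed at kernel_addr_r.
approx_kernel_bytes = 31 * 1000 * 1000
print("kernel fits before fdt_addr_r:", approx_kernel_bytes <= room["kernel_addr_r"])
```

The highest address (ramdisk_addr_r here) has no entry because nothing in the list bounds it from above.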

inochisa commented 1 year ago

@felixonmars @hyx0329

Applying clk_ignore_unused did make the kernel run further. This is what I now get when using a modified u-boot with initrd_high support:

[   66.638599] Freeing initrd memory: 76732K
[   66.820343] Segment Routing with IPv6
[   66.825140] RPL Segment Routing with IPv6
[   66.829924] In-situ OAM (IOAM) with IPv6
[   66.834774] NET: Registered PF_PACKET protocol family
[   66.888959] registered taskstats version 1
[   66.895284] Loading compiled-in X.509 certificates
[   66.976335] Loaded X.509 cert 'Build time autogenerated kernel key: b8250af3366125fd6643ffbe64881a9a49e770b9'
[   66.991115] zswap: loaded using pool zstd/zsmalloc
[   67.034934] Key type .fscrypt registered
[   67.039517] Key type fscrypt-provisioning registered
[   67.128443] clk: Not disabling unused clocks
[   67.133578] Warning: unable to open an initial console.
[   67.148008] Freeing unused kernel image (initmem) memory: 5860K
[   67.158582] Checked W+X mappings: passed, no W+X pages found
[   67.164925] rodata_test: all tests were successful
[   67.170403] Run /init as init process
[   67.174541]   with arguments:
[   67.177890]     /init
[   67.180473]   with environment:
[   67.184008]     HOME=/
[   67.186684]     TERM=linux
[   78.453822] sun8i-mixer 5100000.mixer: deferred probe timeout, ignoring dependency
[   78.463768] sun8i-mixer 5200000.mixer: deferred probe timeout, ignoring dependency
[   78.478836] gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation.
[   78.518584] sun20i-d1-pinctrl 2000000.pinctrl: initialized sunXi PIO driver
[   78.531328] sun20i-d1-pinctrl 2000000.pinctrl: request() failed for pin 160
[   78.541169] sun20i-d1-pinctrl 2000000.pinctrl: request() failed for pin 192
[   78.551050] sun20i-d1-pinctrl 2000000.pinctrl: request() failed for pin 32
[   78.559170] sun20i-d1-pinctrl 2000000.pinctrl: pin-160 (4020000.mmc) status -517
[   78.567474] sun20i-d1-pinctrl 2000000.pinctrl: pin-192 (4021000.mmc) status -517
[   78.576015] sun20i-d1-pinctrl 2000000.pinctrl: pin-32 (2502800.i2c) status -517
[   78.584200] sun20i-d1-pinctrl 2000000.pinctrl: could not request pin 160 (PF0) from group PF0  on device 2000000.pinctrl
[   78.596287] sun20i-d1-pinctrl 2000000.pinctrl: could not request pin 192 (PG0) from group PG0  on device 2000000.pinctrl
[   78.608371] sun20i-d1-pinctrl 2000000.pinctrl: could not request pin 32 (PB0) from group PB0  on device 2000000.pinctrl
[   78.620357] sunxi-mmc 4020000.mmc: Error applying setting, reverse things back
[   78.628424] sunxi-mmc 4021000.mmc: Error applying setting, reverse things back
[   78.636492] mv64xxx_i2c 2502800.i2c: Error applying setting, reverse things back
[   78.645609] sun20i-d1-pinctrl 2000000.pinctrl: request() failed for pin 117
[   78.653392] sun20i-d1-pinctrl 2000000.pinctrl: pin-117 (2000000.pinctrl:117) status -517
[   78.662405] sun4i-usb-phy 4100400.phy: Couldn't request ID GPIO
[   78.669893] platform 2009800.keys: deferred probe pending
[   78.676121] platform 4020000.mmc: deferred probe pending
[   78.682059] platform 4021000.mmc: deferred probe pending
[   78.687985] platform 2502800.i2c: deferred probe pending
[   78.694022] platform 4100400.phy: deferred probe pending
[   93.228393] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[   93.236838] CPU: 0 PID: 1 Comm: switch_root Not tainted 6.4.10-arch1-1.1 #1 a1c4c55014835a4479f131936bbd4507c268917f
[   93.248428] Hardware name: Allwinner D1 Nezha (DT)
[   93.253699] Call Trace:
[   93.256397] [<ffffffff80005fca>] dump_backtrace+0x1c/0x24
[   93.262381] [<ffffffff809e64e0>] show_stack+0x2c/0x38
[   93.267988] [<ffffffff809f1fa6>] dump_stack_lvl+0x3c/0x54
[   93.273959] [<ffffffff809f1fd2>] dump_stack+0x14/0x1c
[   93.279541] [<ffffffff809e6734>] panic+0x102/0x2be
[   93.284851] [<ffffffff8001f84c>] do_exit+0x886/0x88c
[   93.290346] [<ffffffff8001f9d6>] do_group_exit+0x28/0x74
[   93.296218] [<ffffffff8001fa3a>] __wake_up_parent+0x0/0x24
[   93.302281] [<ffffffff809f29e6>] do_trap_ecall_u+0xde/0xf2
[   93.308344] [<ffffffff80003e6c>] ret_from_exception+0x0/0x64
[   93.314624] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100 ]---

There is still an issue related to the initramfs. The fallback initramfs is about 200MB and some of it is freed. I will test more when I have time.

hyx0329 commented 1 year ago

Now it's booting from fallback initramfs to rootfs with these modules in mkinitcpio.conf:

MODULES=(
  sunxi-mmc sdhci sdhci-pltfm mmc_block mtd fixed
)

I know some of them may not be required, but the list is now almost stable, so I wanted to announce it first. I think MMC- and SDHCI-related drivers should be loaded very early.

Drivers that likely need to be loaded early:

Mods to u-boot:

env set bootm_size         0x0a000000
env set kernel_addr_r      0x42000000
env set kernel_comp_addr_r 0x48000000
env set kernel_comp_size   0x0b000000
env set fdt_addr_r         0x4fa00000
env set script_addr_r      0x4fc00000
env set pxefile_addr_r     0x4fd00000
env set fdtoverlay_addr_r  0x4fe00000
env set ramdisk_addr_r     0x4ff00000

Kernel cmdline:

earlyprintk=uart,mmio32,0x02500000 earlycon console=ttyS0,115200 loglevel=8 root=UUID=07263c70-706f-4bff-b607-bb2b157fbbf4 LANG=en_US.UTF-8 rootwait clk_ignore_unused rootdelay=3

I haven't got a usable console yet. Besides, I observe no HDMI output on the MQ Pro.

Attachments:

Edit: forgot to mention that I'm still using EFI boot/GRUB.
Edit: removed (most of) the unnecessary modules.
Edit: added the missed necessary module (fixed) back.

felixonmars commented 1 year ago

Thanks for the investigation. I have successfully booted my D1 with the modules you mentioned added to the MODULES array.

It seems only the fixed regulator module was missing from the generated fallback initrd img. I have opened https://gitlab.archlinux.org/archlinux/mkinitcpio/mkinitcpio/-/merge_requests/270 to include it.

It remains a mystery to me why they aren't loaded automatically, given that the DT nodes shipped with u-boot clearly mention the relevant compatible names. This makes the changes to the MODULES array permanently necessary, even when subsequently regenerating the initrd on the board itself.
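
For anyone wanting to repeat the "which module is missing" check, here is a minimal sketch that compares a file listing of an initramfs against the MODULES list from this thread. The listing below is hypothetical (paths and compression suffix are assumptions); in practice it could come from the output of lsinitcpio on the image.

```python
import re


def module_names(listing: str) -> set:
    """Extract module names from paths ending in .ko or .ko.<suffix>."""
    names = set()
    for path in listing.splitlines():
        match = re.search(r"([^/]+)\.ko(\.\w+)?$", path.strip())
        if match:
            # The kernel treats '-' and '_' in module names as equivalent.
            names.add(match.group(1).replace("-", "_"))
    return names


def missing_modules(listing: str, wanted: list) -> list:
    """Return the wanted modules that do not appear in the listing."""
    present = module_names(listing)
    return [name for name in wanted if name.replace("-", "_") not in present]


# Hypothetical listing, for illustration only.
sample_listing = """\
usr/lib/modules/6.4.10-arch1-1.1/kernel/drivers/mmc/host/sunxi-mmc.ko.zst
usr/lib/modules/6.4.10-arch1-1.1/kernel/drivers/mmc/host/sdhci.ko.zst
usr/lib/modules/6.4.10-arch1-1.1/kernel/drivers/mmc/host/sdhci-pltfm.ko.zst
usr/lib/modules/6.4.10-arch1-1.1/kernel/drivers/mmc/core/mmc_block.ko.zst
usr/lib/modules/6.4.10-arch1-1.1/kernel/drivers/mtd/mtd.ko.zst
"""

wanted = ["sunxi-mmc", "sdhci", "sdhci-pltfm", "mmc_block", "mtd", "fixed"]
print(missing_modules(sample_listing, wanted))  # ['fixed']
```

This only checks that the module files are present in the image; it says nothing about whether they get loaded automatically at boot, which is the open question above.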