openbmc / u-boot

OpenBMC "Das U-Boot" Source Tree

kernel panic on host power on #11

Open bradbishop opened 8 years ago

bradbishop commented 8 years ago

I opened this against u-boot because the kernel has not changed. After moving to 2016.05 there is a kernel panic while booting the host:

Unable to handle kernel NULL pointer dereference at virtual address 00000084
pgd = de160000
[00000084] *pgd=5a1fc831, *pte=00000000, *ppte=00000000p

It's not much, but that is all I get on the console.

Meanwhile here is where we are in the boot (from the host console):

[    8.738141] [drm] Initialized drm 1.1.0 20060810
[    8.738234] [drm] radeon kernel modesetting enabled.
[    8.738392] ast 0001:04:00.0: enabling device (0140 -> 0142)
[    8.738598] [drm] platform has no IO space, trying MMIO
[    8.738688] [drm] AST 2400 detected
[    8.738770] [drm] VGA not enabled on entry, requesting chip POST
[    8.738883] [drm] Analog VGA only
[    8.738959] [drm] dram 1632000000 7 16 00c00000
[    8.739090] [TTM] Zone  kernel: Available graphics memory: 33401760 kiB
[    8.739212] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[    8.739318] [TTM] Initializing pool allocator

I was using https://github.com/openbmc/openbmc/commit/856271db10ba2a1659ed5987953d2ab4b8d7c60d

shenki commented 8 years ago

I suspect this is due to my removal of the VGA setup from u-boot:

https://github.com/openbmc/u-boot/commit/6d9e2879414b67e193b643cd8f684f793abdf8e2#diff-84ea64abf02622ef709560b072e47382L83

I deleted this as u-boot wasn't using VGA. Oops.

I will test this theory by doing the equivalent setup in the kernel.

Separately, we should fix the host so it does not crash when the BMC has done something it does not expect.

Finally, as we discussed on Thursday, we should work out a way for the BMC to communicate to the host kernel that the VGA device is not present and should not be touched. @jk-ozlabs suggested removing the device tree node for the VGA device before the kernel sees it.

mdmillerii commented 8 years ago

The VGA device is discovered over the PCIe bus, not via the host device tree. (There are security bits that can disable the VGA device from the host.)

I believe the issue here is the BMC crashes when the host boots, around the time the VGA driver is initializing.

The driver uses some scratch registers to find out how much BMC RAM to use for the VGA device. I suspect this memory was not reserved on the BMC, so the host stomped on the BMC kernel's memory.
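
To make the suspected failure concrete, here is a minimal sketch of the overlap check. It assumes the VGA framebuffer is carved out of the top of BMC DRAM. The base address, DRAM size, and VGA size are illustrative values, not ones read from this board, and `vga_carveout` and `overlaps` are hypothetical helper names.

```python
MiB = 1024 * 1024


def vga_carveout(dram_base, dram_size, vga_size):
    """Return (start, end) of the VGA region, assumed to sit at the top of DRAM."""
    end = dram_base + dram_size
    return end - vga_size, end


def overlaps(region_a, region_b):
    """True if two half-open [start, end) address ranges share any byte."""
    return region_a[0] < region_b[1] and region_b[0] < region_a[1]


# Illustrative values only (assumptions, not taken from the hardware).
DRAM_BASE = 0x40000000
DRAM_SIZE = 512 * MiB
VGA_SIZE = 16 * MiB

vga = vga_carveout(DRAM_BASE, DRAM_SIZE, VGA_SIZE)

# Case 1: the BMC kernel believes it owns all of DRAM.
bmc_all = (DRAM_BASE, DRAM_BASE + DRAM_SIZE)
# Case 2: the BMC kernel stops short of the carve-out.
bmc_reserved = (DRAM_BASE, vga[0])

print(hex(vga[0]), hex(vga[1]))
print(overlaps(bmc_all, vga), overlaps(bmc_reserved, vga))
```

In the first case, anything the host writes into the VGA region can land on memory the BMC kernel is using, which would match a BMC crash around the time the host's VGA driver initializes.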

shenki commented 8 years ago

I mistakenly thought the null pointer dereference was on the host.

You're correct @mdmillerii, we don't reserve any memory for the VGA device in OpenBMC.
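
One possible way to keep the BMC kernel out of that region is to cap the memory it uses, for example with the Linux `mem=` boot argument. The sketch below only computes the value. The sizes are illustrative assumptions, `mem_bootarg` is a hypothetical helper rather than anything from the OpenBMC tree, and it assumes the VGA region sits at the top of DRAM.

```python
MiB = 1024 * 1024


def mem_bootarg(dram_size, vga_size):
    """Build a `mem=` argument that leaves `vga_size` bytes at the top of DRAM unused."""
    if vga_size >= dram_size:
        raise ValueError("VGA region must be smaller than DRAM")
    usable = dram_size - vga_size
    if usable % MiB:
        raise ValueError("usable size must be a whole number of MiB")
    return f"mem={usable // MiB}M"


# Illustrative sizes only (assumptions, not read from the hardware).
print(mem_bootarg(512 * MiB, 16 * MiB))
```

Whether a boot argument, a device tree reservation, or something else is the right mechanism here is still open.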