loongson / Firmware

Firmware Of LoongArch Machines

LS2C5LE: kernel panic at boot prevents the system from starting #84

Open KFERMercer opened 9 months ago

KFERMercer commented 9 months ago

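Problem description: Booting from an ISO triggers a kernel panic, so the kernel fails to load. (Because no system could be installed, I cannot confirm whether booting a locally installed system triggers it as well.)

Motherboard model: LS2C5LE (Dabieshan (大别山) T2218A, dual-socket 3C5000L, 7A1000)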


BIOS version: Loongson-UDK2018-V2.0.0-prebeta9

DIMM slots populated: A1, B1, A3, and B3, each with one UniIC (紫光国芯) 16 GB 3200 module

ISOs tested:

  1. archlinux-livecd-2023.11.29-loong64.iso
  2. debian12-live-gnome-loong64-20230629.iso
  3. proxmox-ve_8.1-3-loong64.iso

Kernel log:

[    0.000000] Kernel panic - not syncing: Failed to allocate 481111834624 bytes for node 0 memory map
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.4.0-0-loong64 #1  Debian 6.4.0-1
[    0.000000] Hardware name: LOONGSON LOONGSON T100 T2208A/Loongson-LS2C5LE, BIOS Loongson-UDK2018-V2.0.0-prebeta9 11/01/2022
[    0.000000] Stack : 9000000001067bb8 0000000000000000 90000000002241f8 9000000001064000
[    0.000000]         9000000001067ad0 9000000001067ad8 0000000000000000 0000000000000000
[    0.000000]         9000000001067ad8 0000000000000000 0000000000000000 0000000000000000
[    0.000000]         0000000000000001 9000000001067ad8 0000000000000000 0000000000000000
[    0.000000]         0000000000017fe8 0000000000000000 90000000010fb2d8 0000000000000000
[    0.000000]         0000000000000000 0000000000000000 0000000000000000 0000000000000080
[    0.000000]         0000000000000000 0000000000000000 9000000000f737a0 9000000001084000
[    0.000000]         0000007004800000 00000000007ff974 90000000011b29c0 00000001c0120000
[    0.000000]         0000000000000000 0000000000000000 9000000000224210 0000000000000000
[    0.000000]         00000000000000b0 0000000000000004 0000000000000000 0000000000070800
[    0.000000]         ...
[    0.000000] Call Trace:
[    0.000000] [<9000000000224210>] show_stack+0x64/0x188
[    0.000000] [<9000000000c5e720>] dump_stack_lvl+0x60/0x88
[    0.000000] [<9000000000c52100>] panic+0x13c/0x30c
[    0.000000] [<9000000000c9628c>] free_area_init+0x974/0xd5c
[    0.000000] [<9000000000c857a0>] paging_init+0x40/0x60
[    0.000000] [<9000000000c8089c>] start_kernel+0xa4/0x658
[    0.000000] [<9000000000c610b4>] kernel_entry+0xb4/0xb8
[    0.000000] 
[    0.000000] ---[ end Kernel panic - not syncing: Failed to allocate 481111834624 bytes for node 0 memory map ]---
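
To put the size in the panic message in perspective, here is a rough back-of-the-envelope sketch. It assumes `sizeof(struct page)` is 64 bytes and that the kernel uses 16 KiB pages; neither value is stated in the log. Under those assumptions the requested memory map is 0x7004800000 bytes and covers 0x1c0120000 page frames, and both of those values appear in the stack dump above. The last line additionally assumes the Loongson-3 convention that physical address bits [47:44] carry the node ID.

```python
# Values copied from the report; everything marked "assumed" is not in the log.
MEMMAP_BYTES = 481111834624        # from "Failed to allocate ... bytes"
STRUCT_PAGE_SIZE = 64              # assumed sizeof(struct page)
PAGE_SIZE = 16 * 1024              # assumed 16 KiB kernel page size
INSTALLED_BYTES = 4 * 16 * 2**30   # four 16 GB DIMMs, as listed in the report

# One struct page per page frame, so this is the number of frames node 0 spans.
pages = MEMMAP_BYTES // STRUCT_PAGE_SIZE
# Physical address range those frames would cover.
span_bytes = pages * PAGE_SIZE

print(f"memmap size : {MEMMAP_BYTES:#x} ({MEMMAP_BYTES / 2**30:.1f} GiB)")
print(f"page frames : {pages:#x}")
print(f"address span: {span_bytes:#x} ({span_bytes / 2**40:.1f} TiB)")
print(f"installed   : {INSTALLED_BYTES / 2**30:.0f} GiB")

# Assumed convention: physical address bits [47:44] encode the node ID.
print(f"highest node ID inside the span: {(span_bytes >> 44) & 0xF}")
```

If these assumptions hold, the memory map requested for node 0 alone is several times larger than the 64 GiB installed, and the span it describes reaches into node 7's address range. That would point at the memory layout the kernel derived for node 0 rather than at a faulty DIMM, but this is an inference from the numbers, not something the log confirms.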

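Based on the log above, I removed the DIMM in slot A1, which belongs to node 0, and rebooted, but the machine then failed the power-on self-test.

POST serial output: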

Shut down slave cores done!

UEFI-EDK2 loongson Initializing...

CPU Version: 00000043
CPU Type: 00000005
CPU SRAM: 0005000180158081
Node 00000000
N Voltage  write :
v ctrl end

N Voltage  read :
0000047a

P Voltage write :
io ctrl end

Node 00000004
N Voltage  write :
v ctrl end

N Voltage  read :
0000047a

P Voltage write :
io ctrl end

(UNCACHED_MEMORY_ADDR | 0x1fe00190)  : 0000c68200000000
CPU CLK SEL : 00000002
MEM CLK SEL : 00000014
HT CLK SEL : 00000031
Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Change the scale of HT0 clock
Change the scale of HT1 clock
Change the scale of HT2/HT3 clock
Change the scale of HT1 clock
Soft CLK SEL adjust begin
CORE & NODE:08110f85
HT :00000000Loongson3Ht1Addr32Trans End
Copy Sec code to SCache-As-Ram.
Copy PEI code to SCache-As-Ram.
Init I2c to slave mode begin...
Init I2c to slave mode end  ...
Entering C environment
&SecCoreData.DataSize=9027FFA0 SecCoreData.DataSize=48
&SecCoreData.TemporaryRamSize=9027FFC0 SecCoreData.TemporaryRamSize=300000
&SecCoreData.TemporaryRamBase=9027FFB8 SecCoreData.TemporaryRamBase=90100000
&SecCoreData.PeiTemporaryRamBase=9027FFC8 SecCoreData.PeiTemporaryRamBase=90100000
&SecCoreData.PeiTemporaryRamSize=9027FFD0 SecCoreData.PeiTemporaryRamSize=100000
&SecCoreData.StackBase=9027FFD8 SecCoreData.StackBase=90200000
&SecCoreData.StackSize=9027FFE0 SecCoreData.StackSize=200000
&SecCoreData.BootFirmwareVolumeBase=9027FFA8 SecCoreData.BootFirmwareVolumeBase=90020000
&SecCoreData.BootFirmwareVolumeSize=9027FFB0 SecCoreData.BootFirmwareVolumeSize=E0000
Find Pei EntryPoint=90020360
Register PPI Notify: DCD0BE23-9586-40F4-B643-06522CED4EDE
Install PPI: 8C8CE578-8A3D-4F1C-9935-896185C32DD3
Install PPI: 5473C07A-3DCB-4DCA-BD6F-1E9689E7349A
The 0th FV start address is 0x00090020000, size is 0x000E0000, handle is 0x90020000
Register PPI Notify: 49EDB1C1-BF21-4761-BB12-EB0031AABB39
Register PPI Notify: EA7CA24B-DED5-4DAD-A389-BF827E8F9B38
Install PPI: B9E0ABFE-5979-4914-977F-6DEE78C278A6
Install PPI: DBE23AA9-A345-4B97-85B6-B226F1617389
Loading PEIM at 0x00090035880 EntryPoint=0x00090035AC0 PcdPeim.efi
Install PPI: 06E81C58-4AD7-44BC-8390-F10265F72480
Install PPI: 01F34D25-4DE2-23AD-3FF3-36353FF323F1
Install PPI: 4D8B155B-C059-4C8F-8926-06FD4331DB8A
Install PPI: A60C6B59-E459-425D-9C69-0BCC9CB27D81
Register PPI Notify: 605EA650-C65C-42E1-BA80-91A52AB618C6
Loading PEIM at 0x00090047EA0 EntryPoint=0x000900480E0 StatusCodePei.efi
Install PPI: 229832D3-7A30-4B36-B827-F40CB7D45436
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x000900581A0 EntryPoint=0x000900583E0 PeiVariable.efi
PROGRESS CODE: V03020002 I0
Install PPI: 2AB86EF5-ECB5-4134-B556-3854CA1FE1B4
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x0009005C320 EntryPoint=0x0009005C560 PeiLs3aPlatformTableInit.efi
PROGRESS CODE: V03020002 I0
Install PPI: 8C6E9477-EE4A-ADA0-C380-E9B864554E50
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x000900604A0 EntryPoint=0x000900606E0 PeiLs7aPlatformTableInit.efi
PROGRESS CODE: V03020002 I0
Install PPI: BFCBEA5A-1B9B-E098-08DD-897FE2763205
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x000900647A0 EntryPoint=0x000900649E0 HdaVerbTablePpi.efi
PROGRESS CODE: V03020002 I0
Install PPI: 43DD4E3B-049B-2287-65C5-31C54CF1655C
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x000900685A0 EntryPoint=0x000900687E0 MemConfigPpi.efi
PROGRESS CODE: V03020002 I0
Install PPI: 56A9F5C8-9A19-854C-01A6-1335DE2538FC
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x00090083F20 EntryPoint=0x00090084160 ChipSourcePpi.efi
PROGRESS CODE: V03020002 I0
Install PPI: 56A9F5C8-9A19-854C-01A6-1335DE2549FC
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x00090090720 EntryPoint=0x00090090960 CustomedDevPpi.efi
PROGRESS CODE: V03020002 I0
Install PPI: 56A9F5C8-9A19-854C-01A6-3535DE2538F2
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x00090096F20 EntryPoint=0x00090097160 LsRcCorePpi.efi
PROGRESS CODE: V03020002 I0
Install PPI: 3A5CE085-DE30-4D9D-9673-A3DEDAF1F24E
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x0009009AD20 EntryPoint=0x0009009AF60 CustomedPpi.efi
PROGRESS CODE: V03020002 I0
Beep Function Register Success.
WatchDog Function Register Success.
Voltage Function Register Success.
SmartFan Function Register Success.
DebugSwitch Init Fun Register Success.
PROGRESS CODE: V03020003 I0
Loading PEIM at 0x0009009EF00 EntryPoint=0x0009009F140 PlatformInitPei.efi
PROGRESS CODE: V03020002 I0

***** LsRcCorePpi                Version v1.0 *****

***** LsRcCorePpi/ChipCfgPpi     Version v1.0 *****

***** LsRcCorePpi/CustomedCfgPpi Version v0.2 *****

HT configuration!
- conf flag - 0
HT configure done!
reset and dereset x between 0 and 3
>>>>>>>>>>>>>>0x44 20
enable x between 0 and 3
reset node 3 ring ht hi
reset node 3 ring ht lo
reset node 0 ring ht hi
reset node 0 ring ht lo
wait ring ht down
>0x44 11000010
>0x44 11000010
>0x44 11000010
>0x44 11000010
dereset ring ht
Waiting node0 HT0_LO bus to be up.
>0x44 11000020
Waiting node0 HT0_HI bus to be up.
>0x44 11000020
Waiting node1 HT0_LO bus to be up.
>0x44 11000020
Waiting node1 HT0_HI bus to be up.
>0x44 11000020
Waiting node2 HT0_LO bus to be up.
>0x44 11000020
Waiting node2 HT0_HI bus to be up.
>0x44 11000020
Waiting node3 HT0_LO bus to be up.
>0x44 11000020
Waiting node3 HT0_HI bus to be up.
>0x44 11000020
reset and dereset x between 4 and 7
>>>>>>>>>>>>>>0x44 20
enable x between 4 and 7
reset node 7 ring ht hi
reset node 7 ring ht lo
reset node 4 ring ht hi
reset node 4 ring ht lo
wait ring ht down
>0x44 11000010
>0x44 11000010
>0x44 11000010
>0x44 11000010
dereset ring ht
Waiting node4 HT0_LO bus to be up.
>0x44 11000020
Waiting node4 HT0_HI bus to be up.
>0x44 11000020
Waiting node5 HT0_LO bus to be up.
>0x44 11000020
Waiting node5 HT0_HI bus to be up.
>0x44 11000020
Waiting node6 HT0_LO bus to be up.
>0x44 11000020
Waiting node6 HT0_HI bus to be up.
>0x44 11000020
Waiting node7 HT0_LO bus to be up.
>0x44 11000020
Waiting node7 HT0_HI bus to be up.
>0x44 11000020
20
Waiting node1 E HyperTransport bus to be up.
>0x44 20
Waiting node4 E HyperTransport bus to be up.
>0x44 20
20
Waiting node2 E HyperTransport bus to be up.
>0x44 20
Waiting node7 E HyperTransport bus to be up.
>0x44 20
20
Waiting node3 E HyperTransport bus to be up.
>0x44 20
Waiting node6 E HyperTransport bus to be up.
>0x44 20
Reset node5 HT1-LO bus
20
Ls7A1000 Hyper Transport Bridge Initialize.
set LS7A1000 MISC and confbus base address done.
clksel= C682
Warning: CPU HT in hard freq mode
7A HT in soft freq cfg mode
Set 7A1000 side HT:
Set width 0x20
Set freq 0x82251060
Set soft config 0x8A810A
Set GEN3 mode 0x81237008
Set retry mode 0x81
Enable scrambling 0x78
Set buffer num 0xFFFFFFF
Set CPU side HT:
Set width 0x20
Set freq 0x2251060
Set soft config 0x1C40144A
Set GEN3 mode 0x81237008
Set retry mode 0x81
Enable scrambling 0x78
Set buffer num 0xFFFFFFF
config ht link done.
Reset NODE0 HT1-lo bus
Dereset NODE0 HT1-LO BUS
after 0x0
20
>0x44 20
Checking NODE0 HT1 LO CRC error done!
PLL check success.
Checking Bridge HT CRC error bit.
>Waiting node0 F HyperTransport bus to be up.
>0x44 20
Waiting node3 F HyperTransport bus to be up.
>0x44 20
Waiting node1 F HyperTransport bus to be up.
>0x44 20
Waiting node2 F HyperTransport bus to be up.
>0x44 20
Waiting node4 F HyperTransport bus to be up.
>0x44 20
Waiting node7 F HyperTransport bus to be up.
>0x44 20
Waiting node5 F HyperTransport bus to be up.
>0x44 20
Waiting node6 F HyperTransport bus to be up.
>0x44 20
LS3A-7A linkup.
FsbConnect Success...

***** MemConfigPpi Version v2.5 *****

erase flash 0x800000001c038000 for store mc bit map
Erase end!
check node 0 mc0 slot0
NO DIMM in this slot.
check node 0 mc0 slot1
NO DIMM in this slot.
final tCKmax 0
No dimm on node 0
start up other CPU
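
When comparing serial captures across different DIMM layouts, it can help to extract which slots the firmware reported as empty. The following is a minimal sketch that relies only on the two line formats visible in the capture above (`check node N mcM slotS` and `NO DIMM in this slot.`); other firmware versions may print something different, and the helper name `empty_slots` is made up for illustration.

```python
import re

# Excerpt copied from the MemConfigPpi section of the serial output.
SAMPLE = """\
check node 0 mc0 slot0
NO DIMM in this slot.
check node 0 mc0 slot1
NO DIMM in this slot.
final tCKmax 0
No dimm on node 0
"""

CHECK_RE = re.compile(r"check node (\d+) mc(\d+) slot(\d+)")


def empty_slots(log: str) -> list[tuple[int, int, int]]:
    """Return (node, memory controller, slot) for every slot reported empty."""
    empty = []
    current = None  # slot named by the most recent "check node ..." line
    for raw in log.splitlines():
        line = raw.strip()
        match = CHECK_RE.fullmatch(line)
        if match:
            current = tuple(int(g) for g in match.groups())
        elif line == "NO DIMM in this slot." and current is not None:
            empty.append(current)
            current = None
    return empty


print(empty_slots(SAMPLE))
```

For the excerpt above this lists both slots on node 0, memory controller 0, which matches the firmware's own summary line "No dimm on node 0".
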
KFERMercer commented 9 months ago

In practice, this fault makes it impossible to install any new-world (新世界) system on the Dabieshan T2218A server.

MarsDoge commented 8 months ago

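@KFERMercer Node 0 does not support running without memory installed, so its DIMM must stay in place. With that restored, please confirm whether legacymode has been disabled in the firmware setup interface so that community-edition systems can be booted.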

KFERMercer commented 8 months ago

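@MarsDoge Thanks for the response.

"Node 0 does not support running without memory installed, so its DIMM must stay in place."

I then tried populating only A2 (which also belongs to node 0), and the kernel still fails to load. The failure is the same as the one in the description, where A1 was populated.

"With that restored, please confirm whether legacymode has been disabled in the firmware setup interface so that community-edition systems can be booted."

This firmware version has no legacymode configuration option.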

MarsDoge commented 8 months ago

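Verification with archlinux-livecd-2023.11.29-loong64.iso is as follows: it currently looks as if the newest kernels enable certain configuration options that produce a call trace when booted by somewhat older firmware. Please attach the complete kernel log. @KFERMercer Alternatively, if it is convenient, could you try a 6.1 kernel? Thanks!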