foss-for-synopsys-dwc-arc-processors / linux

Helpful resources for users & developers of Linux kernel for ARC

[ARCHS] Linux 6.3 kernel does not boot in SMP configuration #135

Closed: pavelvkozlov closed this issue 8 months ago

pavelvkozlov commented 1 year ago

Linux kernels starting with version 6.3 do not boot in SMP configuration: a BUG assertion fires in the per-CPU allocator during early boot.

Linux version 6.3.0 (pvk@SNPS-o0WHuHJU73) (arc-buildroot-linux-gnu-gcc.br_real (Buildroot 2021.11-5735-gc96213909348) 12.2.1 20220829, GNU ld (GNU Binutils) 2.38.50.20220215) #2 SMP PREEMPT Thu Jun  1 19:33:20 +04 2023
Memory @ 80000000 [512M]
OF: fdt: Machine model: snps,zebu_hs-smp
earlycon: uart8250 at MMIO32 0xf0000000 (options '115200n8')
printk: bootconsole [uart8250] enabled
Failed to get possible-cpus from dtb, pretending all 4 cpus exist
archs-intc      : 15 priority levels (default 1)

IDENTITY        : ARCVER [0x50] ARCNUM [0x0] CHIPID [ 0x0]
processor [0]   : Unknown Unknown (ARCv2 ISA)
Timers          : Timer0 Timer1 RTC [UP 64-bit] GFRC [SMP 64-bit]
ISA Extn        : atomic ll64 unalign mpy[opt 9] div_rem
BPU             : partial match, cache:2048, Predict Table:16384 Return stk: 8
MMU [v4]        : 8k PAGE, , swalk 2 lvl, JTLB 512 (128x4), uDTLB 8, uITLB 4
I-Cache         : 16K, 2way/set, 64B Line, VIPT
D-Cache         : 16K, 4way/set, 64B Line, PIPT
Peripherals     : 0xc0000000
Vector Table    : 0x80000000
DEBUG           : ActionPoint 4/full
Extn [SMP]      : ARConnect (v2): 4 cores with IPI IDU GFRC
Zone ranges:
  Normal   [mem 0x0000000080000000-0x000000009fffffff]
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000080000000-0x000000009fffffff]
Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()!

gcc generated __builtin_trap
Path: (null)
CPU: 0 PID: 0 Comm: swapper Not tainted 6.3.0 #2
gcc generated __builtin_trap
ECR: 0x00090005 EFA: 0x80837bd4 ERET: 0x80837bd6
STAT: 0x00080802 [  K     ]   BTA: 0x80be5324
 SP: 0x80cfff88  FP: 0x80dde4c8 BLK: pcpu_embed_first_chunk+0x378/0x5ac
LPS: 0x80be2d8c LPE: 0x80be2d94 LPC: 0x00000000
r00: 0x00000041 r01: 0x80d85924 r02: 0x00000000
r03: 0x80cffee0 r04: 0x00000000 r05: 0x00000000
r06: 0x3a632e75 r07: 0x31383932 r08: 0x7063702f
r09: 0x75625f75 r10: 0x5f646c69 r11: 0x6f6c6c61
r12: 0x00000000 r13: 0x9fd94000 r14: 0x9fd94038
r15: 0x00002000 r16: 0x9fd94020 r17: 0x00002000
r18: 0x0000c300 r19: 0x00000001 r20: 0x00000005
r21: 0x00002000 r22: 0x00014000 r23: 0x00000001
r24: 0x80846fdc r25: 0x80d1a580

Stack Trace:
  pcpu_embed_first_chunk+0x37a/0x5ac

After the commit with the cpumask optimization, https://github.com/torvalds/linux/commit/596ff4a09b8981790e15572e8e7bc904df5835e7, initialization of the per-CPU allocator stops on an assertion. I've reproduced this issue on an HSDK board and on nSIM. With the patch applied, the loop in pcpu_build_alloc_info() that fills the group_cnt array is broken: it fills the array incorrectly, and as a result the assertion fails. I suspect the cause is not the patch itself but compiler optimizations applied to the loop. Further analysis is required.
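
To make the failure mode easier to picture, here is a minimal user-space sketch of a grouping loop of this kind. It is an assumption about the general shape of the logic, not a copy of mm/percpu.c: `NR_CPUS`, `mask_first()`, `build_groups()`, `total_units()` and `verify_groups()` are hypothetical names, the cpumask is modelled as a plain `unsigned int`, and every CPU is treated as local to every other (no distance function). The invariant it checks is that the per-group counts add up to the number of CPUs in the mask; if the loop miscounts, a check like this trips.

```c
#include <assert.h>

#define NR_CPUS 4   /* hypothetical: matches the 4 cores in the log above */

/* Hypothetical stand-in for a cpumask helper: index of the lowest set bit,
 * or NR_CPUS when the mask is empty. */
static unsigned int mask_first(unsigned int mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS;
}

/* Pop the first CPU left in the mask, open a group for it, then move every
 * remaining CPU into the same group (all CPUs are assumed to be local).
 * Returns the number of groups created. */
static int build_groups(unsigned int mask, int group_map[NR_CPUS],
                        int group_cnt[NR_CPUS])
{
    int group = 0;

    for (; mask != 0; group++) {
        unsigned int cpu = mask_first(mask);

        group_map[cpu] = group;
        group_cnt[group]++;
        mask &= ~(1u << cpu);

        for (unsigned int tcpu = 0; tcpu < NR_CPUS; tcpu++) {
            if (!(mask & (1u << tcpu)))
                continue;
            group_map[tcpu] = group;
            group_cnt[group]++;
            mask &= ~(1u << tcpu);
        }
    }
    return group;
}

/* Sum of the per-group CPU counts. */
static int total_units(const int group_cnt[NR_CPUS], int nr_groups)
{
    int units = 0;

    for (int g = 0; g < nr_groups; g++)
        units += group_cnt[g];
    return units;
}

/* Consistency check: the groups must account for exactly the CPUs that were
 * set in the original mask. */
static void verify_groups(unsigned int mask, const int group_cnt[NR_CPUS],
                          int nr_groups)
{
    int cpus = 0;

    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            cpus++;
    assert(total_units(group_cnt, nr_groups) == cpus);
}
```

Calling `build_groups()` with a mask of `0xF` and zero-initialized arrays should yield a single group of four CPUs, and `verify_groups()` passes. If the mask updates inside the loop were reordered or dropped, the counts would no longer match and the check would fail, which is the kind of symptom described above.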

pavelvkozlov commented 8 months ago

Fixed by: https://github.com/torvalds/linux/commit/42f51fb24fd39cc547c086ab3d8a314cc603a91c. Use the latest stable kernel, 6.5.6 or later, which includes this fix. Kernels 6.3 and 6.4 may contain the described issue.