dentproject / dentOS

dentOS SwitchDev based NOS

grep -rn / -e "remapping interface" is crashing the device #187

Open mgheorghe opened 1 year ago

mgheorghe commented 1 year ago

dent 3.0

I was searching the whole filesystem with grep -rn / -e "remapping interface" and the unit crashed (I tried twice and it crashed both times).

[  160.053979] CPU: 2 PID: 1959 Comm: grep Tainted: G           O      5.10.4 #1
[  160.053981] Hardware name: Accton Marvell Armada 7040 board setup (DT)
[  160.053983] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
[  160.053984] pc : regmap_mmio_read32le+0x28/0x48
[  160.053985] lr : regmap_mmio_read+0x48/0x70
[  160.053986] sp : ffff80001331bb90
[  160.053988] x29: ffff80001331bb90 x28: ffff000101a41000
[  160.053992] x27: 0000000000000700 x26: ffff8000119be000
[  160.053996] x25: ffff8000119be558 x24: 0000000000000000
[  160.053999] x23: ffff80001331bcfc x22: ffff80001331be00
[  160.054002] x21: ffff80001331bcfc x20: ffff000100e96200
[  160.054005] x19: 0000000000000700 x18: 0000000000000004
[  160.054008] x17: 0000000000000000 x16: 0000000000000000
[  160.054011] x15: ffff8000124e9988 x14: ffff0001072d0000
[  160.054014] x13: ffff0001072c9a44 x12: 0000000000000000
[  160.054017] x11: 0000000000000000 x10: 0000000000000000
[  160.054020] x9 : ffff80001331bc80 x8 : 000000000000000f
[  160.054023] x7 : ffff80001331bb30 x6 : ffff0001072c9a46
[  160.054026] x5 : 0000000000000012 x4 : 0000000000000002
[  160.054029] x3 : ffff8000109f88e8 x2 : ffff8000109f86b8
[  160.054032] x1 : 0000000000000700 x0 : 0000000000000000
[  160.054036] Kernel panic - not syncing: Asynchronous SError Interrupt
[  160.054038] CPU: 2 PID: 1959 Comm: grep Tainted: G           O      5.10.4 #1
[  160.054039] Hardware name: Accton Marvell Armada 7040 board setup (DT)
[  160.054041] Call trace:
[  160.054042]  dump_backtrace+0x0/0x200
[  160.054043]  show_stack+0x2c/0x80
[  160.054044]  dump_stack+0xd0/0x128
[  160.054045]  panic+0x184/0x3a8
[  160.054047]  nmi_panic+0x9c/0xa0
[  160.054048]  arm64_serror_panic+0x84/0x90
[  160.054049]  do_serror+0x34/0x90
[  160.054050]  el1_error+0x84/0x104
[  160.054052]  regmap_mmio_read32le+0x28/0x48
[  160.054053]  regmap_mmio_read+0x48/0x70
[  160.054054]  _regmap_bus_reg_read+0x38/0x48
[  160.054055]  _regmap_read+0x6c/0x1b8
[  160.054057]  regmap_read+0x50/0x78
[  160.054058]  regmap_read_debugfs+0x10c/0x340
[  160.054059]  regmap_map_read_file+0x48/0x58
[  160.054060]  full_proxy_read+0x68/0x98
[  160.054062]  vfs_read+0xac/0x1b8
[  160.054063]  ksys_read+0x74/0x100
[  160.054064]  __arm64_sys_read+0x24/0x30
[  160.054065]  el0_svc_common.constprop.3+0x74/0x170
[  160.054067]  do_el0_svc+0x34/0xc0
[  160.054068]  el0_svc+0x1c/0x28
[  160.054069]  el0_sync_handler+0x8c/0xb0
[  160.054070]  el0_sync+0x140/0x180
[  160.054087] SMP: stopping secondary CPUs
[  160.054088] Kernel Offset: 0x80000 from 0xffff800010000000
[  160.054089] PHYS_OFFSET: 0x0
[  160.054090] CPU features: 0x0240022,61806000
[  160.054092] Memory Limit: none
* Open Network Linux Loader
*
*        Version: DENTOS-HEAD
*             Id: 2022-12-14.04:51-c9fc890
*
*       Platform: arm64-accton-as4224-52p-r0

paulmenzel commented 1 year ago

If you run that under strace -e file, on which file access does it start to hang?
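If strace is not available on the box, a rough substitute is to read the files one at a time and print each name before touching it, so the last name on the console is the suspect. This is only a sketch: probe_files is a hypothetical helper name, it swaps strace for a plain read loop, and it will most likely crash the unit again when it reaches the bad file, so run it from a serial console.

```shell
# probe_files DIR (hypothetical helper, not part of dentOS)
# Prints the name of every regular file under DIR *before* reading it,
# so the last "reading ..." line seen identifies the file that hung.
probe_files() {
    dir=$1
    find "$dir" -type f 2>/dev/null | while IFS= read -r f; do
        printf 'reading %s\n' "$f"
        # Read the whole file and discard the contents; errors are ignored.
        cat "$f" > /dev/null 2>&1
    done
}
```

Pointing it at a narrow directory, for example probe_files /sys/kernel/debug/regmap, keeps the run short and avoids files elsewhere in debugfs that block on read.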

KanjiMonster commented 11 months ago

After a bit more investigation, the "offending" files are

/sys/kernel/debug/regmap/dummy-system-controller@f06f4000/registers
/sys/kernel/debug/regmap/dummy-system-controller@f2440000/registers

Reading either of these eventually locks up the system. It looks like these register ranges cover offsets that fault when read, which would match the SError raised in regmap_mmio_read32le in the trace above.

Interestingly, the other two regmaps

/sys/kernel/debug/regmap/dummy-system-controller@f06f8000/registers
/sys/kernel/debug/regmap/dummy-system-controller@f2400000/registers

are fine.

So, given that there is no easy way to set up holes in a regmap via DTS, that these register ranges are undocumented, and that the files are only accessible as root, my suggestion for now is "avoid accessing those files".
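One way to follow that advice while still searching from / is to keep the search on the root filesystem, so debugfs (and sysfs, procfs) are never descended into. A minimal sketch, assuming debugfs is a separate mount as it normally is under /sys/kernel/debug; search_tree is a hypothetical helper name, and -xdev will also skip any other separately mounted filesystem, which may or may not be what you want.

```shell
# search_tree ROOT PATTERN (hypothetical helper, not part of dentOS)
# Greps regular files under ROOT but stays on ROOT's filesystem (-xdev),
# so separately mounted pseudo-filesystems such as debugfs are never read.
search_tree() {
    root=$1
    pattern=$2
    # -H forces the file name into the output, -n adds the line number.
    find "$root" -xdev -type f -exec grep -n -H -e "$pattern" {} + 2>/dev/null
}
```

Called as search_tree / "remapping interface", this covers the same on-disk files as the original grep -rn without opening the regmap registers files.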