Open zevweiss opened 3 years ago
Unfortunately, the platform code ignores the returned value of the remove handler.
Yeah, there's no failure path for the unbind operation...commit e5e1c209788138f33ca6558bf9f572f6904f486d acknowledges that the driver core ignores the remove operation's return value, but it doesn't sound like there's any movement toward that changing, and if anything sounds like things are going the opposite direction ("The right thing to do would be to make struct platform_driver::remove() return void."). I'm not sure what the reasoning is there; there doesn't appear to have been any further discussion at the time on the list: https://lore.kernel.org/all/20210207211537.19992-1-uwe@kleine-koenig.org/.
So it seems like the intended model is "remove() operations must succeed", but that leaves us in a bit of an awkward situation.
After reading through some code, it looks like this particular problem stems (more immediately) from the aspeed-smc driver allocating the struct spi_nor
via devm_kzalloc()
(as do all the other spi-nor controller drivers, incidentally). The unbind operation (because it can't be failed) goes through the all the devres cleanup hooks registered for the device and hence deallocates those allocations; the struct spi_nor
embeds the struct mtd_info
that the read/write paths in mtdchar.c access via file->private_data->mtd
, but at that point that pointer is of course stale.
Thinking out loud:
struct file
's pointer to the mtd_info
and check it before following it in the read/write path, but that just seems to turn it into a synchronization problem instead (if a read races with an unbind you could still easily end up with the same problem). The relevant synchronization point there appears to be mtd_table_mutex
(I think?), but acquiring that on every read/write operation just to check for the edge case of the driver having been unbound seems...ugly.struct file
still refers to it? That seems to weaken the semantics of the unbind operation though; I'd think the FD shouldn't just keep working after the driver's been unbound.I have another experimental Aspeed SMC driver using spimem
and it generates the same kind of issues.
[58555.806133] opening mtd FD...
[58555.808770] unbinding aspeed-smc...
[58555.810676] 8<--- cut here ---
[58555.810885] Unable to handle kernel NULL pointer dereference at virtual address 00000230
[58555.811257] pgd = f7697319
[58555.811422] [00000230] *pgd=00000000
[58555.811944] Internal error: Oops: 5 [#1] ARM
[58555.812225] Modules linked in:
[58555.812487] CPU: 0 PID: 1101 Comm: sh Not tainted 5.14.11-00170-gd9563a2fe5e9-dirty #309
[58555.812918] Hardware name: Generic DT based system
[58555.813205] PC is at klist_next+0x10/0xac
[58555.813420] LR is at device_for_each_child+0x3c/0x98
[58555.813665] pc : [<801f38f8>] lr : [<802407b0>] psr: 20000013
[58555.813953] sp : 85f03e60 ip : 80695e84 fp : 00000000
[58555.814195] r10: 00000000 r9 : 85f02000 r8 : 8499df10
[58555.814452] r7 : 00000000 r6 : 00000000 r5 : 85f03e7c r4 : 85407520
[58555.814749] r3 : 00000000 r2 : 80277e4c r1 : 85f03e7c r0 : 00000200
[58555.815094] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[58555.815387] Control: 00093177 Table: 461f0000 DAC: 00000051
[58555.815673] Register r0 information: non-paged memory
[58555.816024] Register r1 information: non-slab/vmalloc memory
[58555.816362] Register r2 information: non-slab/vmalloc memory
[58555.816629] Register r3 information: NULL pointer
[58555.816884] Register r4 information: slab kmalloc-2k start 85407000 pointer offset 1312 size 2048
[58555.817656] Register r5 information: non-slab/vmalloc memory
[58555.817926] Register r6 information: NULL pointer
[58555.818216] Register r7 information: NULL pointer
[58555.818507] Register r8 information: slab kmalloc-192 start 8499df00 pointer offset 16 size 192
[58555.819059] Register r9 information: non-slab/vmalloc memory
[58555.819429] Register r10 information: NULL pointer
[58555.819665] Register r11 information: NULL pointer
[58555.819926] Register r12 information: non-slab/vmalloc memory
[58555.820288] Process sh (pid: 1101, stack limit = 0x60a248fb)
[58555.820632] Stack: (0x85f03e60 to 0x85f04000)
[58555.821108] 3e60: 85407520 00000000 80277e4c 00000000 8499df10 802407b0 85407520 00000200
[58555.821531] 3e80: 00000000 8064ba28 85407520 80daf810 00000000 802769d0 00000000 80daf810
[58555.822018] 3ea0: 80677dd0 8027b7cc 80daf410 80248bf0 80daf410 802473c0 0000000d 80daf410
[58555.822436] 3ec0: 80677dd0 806756cc 8499df10 80245608 0000000d 84ac0660 8499df00 85f03f30
[58555.822886] 3ee0: 8499df10 801681b4 00000000 00000000 00000000 845dc6e0 0000000d 85f03f78
[58555.823292] 3f00: 014bc4e8 800fab20 0000000d 000001b5 014bc4e8 0000000d 00000100 00000000
[58555.823694] 3f20: 00000000 85f03f18 00000000 00000000 845dc6e0 00000000 00000000 00000000
[58555.824137] 3f40: 00000000 00000000 00000000 00000000 00000000 00000000 845dca94 8064ba28
[58555.824584] 3f60: 845dc6e0 014bc4e8 85f03f78 85f03f84 0000000d 800facb8 00000000 00000000
[58555.825032] 3f80: 845dca40 845dc6e0 00000000 8064ba28 00000001 76f76080 76f01b40 00000004
[58555.825481] 3fa0: 80008644 80008460 00000001 76f76080 00000001 014bc4e8 0000000d 00000000
[58555.825930] 3fc0: 00000001 76f76080 76f01b40 00000004 76f020f8 76f01c60 0055a850 00000000
[58555.826326] 3fe0: 0054e720 7eb0a9d8 76e807f0 76e8080c 60000010 00000001 00000000 00000000
[58555.826813] [<801f38f8>] (klist_next) from [<802407b0>] (device_for_each_child+0x3c/0x98)
[58555.827191] [<802407b0>] (device_for_each_child) from [<802769d0>] (spi_unregister_controller+0x1c/0xf4)
[58555.827560] [<802769d0>] (spi_unregister_controller) from [<8027b7cc>] (aspeed_smc_remove+0x10/0x34)
[58555.827960] [<8027b7cc>] (aspeed_smc_remove) from [<80248bf0>] (platform_remove+0x20/0x4c)
[58555.828322] [<80248bf0>] (platform_remove) from [<802473c0>] (device_release_driver_internal+0xb8/0x1a8)
[58555.828734] [<802473c0>] (device_release_driver_internal) from [<80245608>] (unbind_store+0x44/0x68)
[58555.829130] [<80245608>] (unbind_store) from [<801681b4>] (kernfs_fop_write_iter+0xe4/0x194)
[58555.829507] [<801681b4>] (kernfs_fop_write_iter) from [<800fab20>] (vfs_write+0x14c/0x1a8)
[58555.829878] [<800fab20>] (vfs_write) from [<800facb8>] (ksys_write+0x74/0xc4)
[58555.830198] [<800facb8>] (ksys_write) from [<80008460>] (ret_fast_syscall+0x0/0x54)
[58555.830597] Exception stack(0x85f03fa8 to 0x85f03ff0)
[58555.830925] 3fa0: 00000001 76f76080 00000001 014bc4e8 0000000d 00000000
[58555.831322] 3fc0: 00000001 76f76080 76f01b40 00000004 76f020f8 76f01c60 0055a850 00000000
[58555.831729] 3fe0: 0054e720 7eb0a9d8 76e807f0 76e8080c
[58555.832332] Code: e92d41f0 e1a05000 e5900000 e5956004 (e5907030)
[58555.833267] ---[ end trace 1c6243d4aa66fcdc ]---
Interesting...though it looks like in that case the panic happened during the unbind, instead of on an FD access after it?
The following script triggers a kernel panic:
dmesg output:
I'm looking into possible fixes, but figured I'd file it here in case anyone else has any input (and so it doesn't get lost/forgotten).