Open kitakar5525 opened 4 years ago
I tried my older release `chromeos-kernel-linux-surface-5.4.14` and I can reproduce this issue there too 🤔
Maybe some userspace changes triggered this issue (?)
For those affected by this issue, I recommend using the v4.19 kernel for now.
This looks very similar to https://github.com/torvalds/linux/commit/e658c82be5561412c5e83b5e74e9da4830593f3e, but that should be fixed by now...
For now, I gave up figuring out what's happening.
If you still want to use the v5.4 series but are affected by this issue, try the following:
I guess everyone uses swtpm, so just blacklist the `tpm_tis` module for now. You can do so by adding `module_blacklist=tpm_tis` to the kernel command line in your bootloader.
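To confirm that the parameter actually reached the kernel after a reboot, something like the following sketch may help. The helper name `cmdline_blacklists` is made up for illustration; it only assumes that `module_blacklist=` takes a comma-separated list of module names and that the running command line is readable from `/proc/cmdline`. It relies on word splitting, so run it with bash or sh rather than zsh.

```bash
# Hypothetical helper (not part of any tool): succeeds if MODULE is listed
# in a module_blacklist= entry of the given kernel command line string.
cmdline_blacklists() {
    cmdline=$1
    module=$2
    # Unquoted on purpose: split the command line into individual arguments.
    for arg in $cmdline; do
        case $arg in
            module_blacklist=*)
                list=${arg#module_blacklist=}
                # Wrap in commas so only whole module names match.
                case ",$list," in
                    *",$module,"*) return 0 ;;
                esac
                ;;
        esac
    done
    return 1
}

# Check the command line of the running kernel, if it is readable.
if [ -r /proc/cmdline ]; then
    if cmdline_blacklists "$(cat /proc/cmdline)" tpm_tis; then
        echo "tpm_tis is blacklisted"
    else
        echo "tpm_tis is NOT blacklisted"
    fi
fi
```

The comma wrapping is there so that an entry such as `module_blacklist=tpm_tis_core` is not mistaken for `tpm_tis`.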
I can reproduce a similar `tpm_tis` module crash by reloading the module on Arch Linux with `5.7.2-arch1-1-surface`.
So, this is an upstream Linux kernel issue.
```bash
$ sudo modprobe -r tpm_tis
$ sudo modprobe tpm_tis
zsh: killed sudo modprobe tpm_tis
$ dmesg -xw
kern :info : [153894.342734] tpm_tis IFX0562:00: 2.0 TPM (device-id 0x1A, rev-id 16)
kern :alert : [153894.367899] BUG: unable to handle page fault for address: ffffad71405f0000
kern :alert : [153894.367905] #PF: supervisor read access in kernel mode
kern :alert : [153894.367907] #PF: error_code(0x0000) - not-present page
kern :info : [153894.367907] PGD 45d942067 P4D 45d942067 PUD 45d943067 PMD 45cedb067 PTE 0
kern :warn : [153894.367911] Oops: 0000 [#1] PREEMPT SMP PTI
kern :warn : [153894.367914] CPU: 3 PID: 240328 Comm: modprobe Tainted: G C OE 5.7.2-arch1-1-surface #1
kern :warn : [153894.367916] Hardware name: Microsoft Corporation Surface Book/Surface Book, BIOS 92.3192.768 03.24.2020
kern :warn : [153894.367922] RIP: 0010:memcpy_erms+0x6/0x10
kern :warn : [153894.367923] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1
```
I sometimes find that v5.4 kernels don't boot on SB1; it turned out that the `tpm_tis` module causes a kernel oops. dmesg log:
```
#
# sudo modprobe tpm_tis
#
kern :debug : [ 347.275763] acpi IFX0562:00: GPIO: looking up 0 in _CRS
kern :info : [ 347.281797] tpm_tis IFX0562:00: 2.0 TPM (device-id 0x1A, rev-id 16)
kern :alert : [ 347.302914] BUG: unable to handle page fault for address: ffffb14b40372000
kern :alert : [ 347.302930] #PF: supervisor read access in kernel mode
kern :alert : [ 347.302937] #PF: error_code(0x0000) - not-present page
kern :info : [ 347.302944] PGD 45d561067 P4D 45d561067 PUD 45d564067 PMD 45cdcc067 PTE 0
kern :warn : [ 347.302959] Oops: 0000 [#1] PREEMPT SMP PTI
kern :warn : [ 347.302970] CPU: 1 PID: 8555 Comm: modprobe Tainted: G C 5.4.43-04966-g44ecd9e48613 #21
kern :warn : [ 347.302980] Hardware name: Microsoft Corporation Surface Book/Surface Book, BIOS 92.3192.768 03.24.2020
kern :warn : [ 347.302998] RIP: 0010:tpm_read_log_efi+0x17c/0x1b8 [tpm]
kern :warn : [ 347.303008] Code: e8 d8 50 a4 c3 eb 41 48 89 83 68 07 00 00 41 8b 55 04 4c 01 f0 41 bc 02 00 00 00 48 89 c7 48 63 0d 02 db 31 c5 48 8d 74 15 10 a4 48 63 05 f4 da 31 c5 49 01 c6 4c 03 b3 68 07 00 00 4c 89 b3
kern :warn : [ 347.303025] RSP: 0018:ffffb14b6a813b00 EFLAGS: 00010286
kern :warn : [ 347.303033] RAX: ffff9c79aa0ec7a9 RBX: ffff9c789fd99000 RCX: fffffffffffff0a1
kern :warn : [ 347.303042] RDX: 0000000000000a19 RSI: ffffb14b40372000 RDI: ffff9c79aa0ecd80
kern :warn : [ 347.303050] RBP: ffffb14b40371000 R08: 0000000000000000 R09: 8000000000000063
kern :warn : [ 347.303059] R10: 000ffffffffff000 R11: 0000000000000001 R12: 0000000000000002
kern :warn : [ 347.303067] R13: ffff9c7904d98018 R14: 00000000000047a9 R15: ffff9c7c311f2328
kern :warn : [ 347.303076] FS: 00007ead3aa90740(0000) GS:ffff9c7cdf280000(0000) knlGS:0000000000000000
kern :warn : [ 347.303086] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern :warn : [ 347.303093] CR2: ffffb14b40372000 CR3: 0000000102a04004 CR4: 00000000003606e0
kern :warn : [ 347.303101] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kern :warn : [ 347.303110] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kern :warn : [ 347.303117] Call Trace:
kern :warn : [ 347.303133] tpm_bios_log_setup+0x67/0x16e [tpm]
kern :warn : [ 347.303146] tpm_chip_register+0x8d/0x248 [tpm]
kern :warn : [ 347.303159] tpm_tis_core_init+0x5b5/0x601 [tpm_tis_core]
kern :warn : [ 347.303176] tpm_tis_plat_probe+0xa0/0xbe [tpm_tis]
kern :warn : [ 347.303191] platform_drv_probe+0x44/0x85
kern :warn : [ 347.303203] really_probe+0x1bb/0x3d4
kern :warn : [ 347.303214] driver_probe_device+0xd5/0x10a
kern :warn : [ 347.303226] device_driver_attach+0x3c/0x55
kern :warn : [ 347.303236] __driver_attach+0x110/0x119
kern :warn : [ 347.303246] ? device_driver_attach+0x55/0x55
kern :warn : [ 347.303255] bus_for_each_dev+0x73/0xa9
kern :warn : [ 347.303266] bus_add_driver+0x12c/0x1de
kern :warn : [ 347.303276] driver_register+0x9e/0xd7
kern :warn : [ 347.303285] ? 0xffffffffc0856000
kern :warn : [ 347.303296] init_tis+0x8d/0x1000 [tpm_tis]
kern :warn : [ 347.303309] ? kernel_read+0x59/0x66
kern :warn : [ 347.303320] ? ___might_sleep+0x47/0x146
kern :warn : [ 347.303334] do_one_initcall+0xa4/0x1c9
kern :warn : [ 347.303346] ? slab_pre_alloc_hook+0x31/0x43
kern :warn : [ 347.303356] ? kmem_cache_alloc_trace+0xf4/0x106
kern :warn : [ 347.303369] do_init_module+0x5b/0x204
kern :warn : [ 347.303380] __do_sys_finit_module+0xb4/0xdb
kern :warn : [ 347.303391] do_syscall_64+0x4b/0x59
kern :warn : [ 347.303401] entry_SYSCALL_64_after_hwframe+0x44/0xa9
kern :warn : [ 347.303410] RIP: 0033:0x7ead3a5d3199
kern :warn : [ 347.303419] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9f 9c 2b 00 f7 d8 64 89 01 48
kern :warn : [ 347.303435] RSP: 002b:00007fff1e6a0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
kern :warn : [ 347.303446] RAX: ffffffffffffffda RBX: 000057863ff80cb0 RCX: 00007ead3a5d3199
kern :warn : [ 347.303454] RDX: 0000000000000000 RSI: 000057863e7ece58 RDI: 0000000000000003
kern :warn : [ 347.303461] RBP: 00007fff1e6a04d0 R08: 0000000000000000 R09: 000057863ff804b0
kern :warn : [ 347.303469] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
kern :warn : [ 347.303477] R13: 000057863ff80b80 R14: 000057863e7ece58 R15: 0000000000000000
kern :warn : [ 347.303487] Modules linked in: tpm_tis(+) nls_iso8859_1 nls_cp437 vfat fat snd_seq_dummy snd_seq snd_seq_device veth bridge stp llc tun nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp esp6 ah6 ip6t_REJECT ip6t_ipv6header cmac rfcomm uinput xt_MASQUERADE fuse usbhid iio_trig_sysfs hid_sensor_cros_compat hid_sensor_iio_common industrialio_triggered_buffer kfifo_buf industrialio surface_sam_sid_gpelid tpm_crb surfacepro3_button soc_button_array surface_sam_sid tpm_tis_core btusb btrtl btbcm btintel bluetooth ecdh_generic ecc lzo_rle lzo_compress snd_hda_codec_hdmi zram snd_hda_codec_realtek snd_soc_skl snd_hda_codec_generic joydev snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_hda_intel ipts snd_intel_dspcfg snd_hda_codec ipu3_cio2 ipu3_imgu(C) mwifiex_pcie snd_hwdep snd_hda_core v4l2_fwnode mwifiex mei_me videobuf2_dma_sg videobuf2_memops videobuf2_v4l2 videobuf2_common cfg80211 tpm_vtpm_proxy tablet_mode_switch tpm i915
kern :warn : [ 347.303603] CR2: ffffb14b40372000
kern :warn : [ 347.303611] ---[ end trace 5675b83dc3b769d4 ]---
kern :warn : [ 347.359703] RIP: 0010:tpm_read_log_efi+0x17c/0x1b8 [tpm]
kern :warn : [ 347.359708] Code: e8 d8 50 a4 c3 eb 41 48 89 83 68 07 00 00 41 8b 55 04 4c 01 f0 41 bc 02 00 00 00 48 89 c7 48 63 0d 02 db 31 c5 48 8d 74 15 10 a4 48 63 05 f4 da 31 c5 49 01 c6 4c 03 b3 68 07 00 00 4c 89 b3
kern :warn : [ 347.359713] RSP: 0018:ffffb14b6a813b00 EFLAGS: 00010286
kern :warn : [ 347.359716] RAX: ffff9c79aa0ec7a9 RBX: ffff9c789fd99000 RCX: fffffffffffff0a1
kern :warn : [ 347.359719] RDX: 0000000000000a19 RSI: ffffb14b40372000 RDI: ffff9c79aa0ecd80
kern :warn : [ 347.359721] RBP: ffffb14b40371000 R08: 0000000000000000 R09: 8000000000000063
kern :warn : [ 347.359724] R10: 000ffffffffff000 R11: 0000000000000001 R12: 0000000000000002
kern :warn : [ 347.359727] R13: ffff9c7904d98018 R14: 00000000000047a9 R15: ffff9c7c311f2328
kern :warn : [ 347.359730] FS: 00007ead3aa90740(0000) GS:ffff9c7cdf280000(0000) knlGS:0000000000000000
kern :warn : [ 347.359733] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern :warn : [ 347.359735] CR2: ffffb14b40372000 CR3: 0000000102a04004 CR4: 00000000003606e0
kern :warn : [ 347.359738] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kern :warn : [ 347.359740] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kern :err : [ 347.359743] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1494
kern :err : [ 347.359747] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 8555, name: modprobe
kern :warn : [ 347.359751] CPU: 1 PID: 8555 Comm: modprobe Tainted: G D C 5.4.43-04966-g44ecd9e48613 #21
kern :warn : [ 347.359754] Hardware name: Microsoft Corporation Surface Book/Surface Book, BIOS 92.3192.768 03.24.2020
kern :warn : [ 347.359757] Call Trace:
kern :warn : [ 347.359762] dump_stack+0x50/0x63
kern :warn : [ 347.359767] ___might_sleep+0x12f/0x146
kern :warn : [ 347.359772] down_read+0x1c/0x25
kern :warn : [ 347.359776] __blocking_notifier_call_chain+0x32/0x63
kern :warn : [ 347.359779] do_exit+0x37/0x9c2
kern :warn : [ 347.359783] rewind_stack_do_exit+0x17/0x20
kern :warn : [ 347.359786] RIP: 0033:0x7ead3a5d3199
kern :warn : [ 347.359789] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9f 9c 2b 00 f7 d8 64 89 01 48
kern :warn : [ 347.359794] RSP: 002b:00007fff1e6a0488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
kern :warn : [ 347.359797] RAX: ffffffffffffffda RBX: 000057863ff80cb0 RCX: 00007ead3a5d3199
kern :warn : [ 347.359800] RDX: 0000000000000000 RSI: 000057863e7ece58 RDI: 0000000000000003
kern :warn : [ 347.359802] RBP: 00007fff1e6a04d0 R08: 0000000000000000 R09: 000057863ff804b0
kern :warn : [ 347.359805] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
kern :warn : [ 347.359807] R13: 000057863ff80b80 R14: 000057863e7ece58 R15: 0000000000000000
```
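Since these logs are long, a small filter can pull out just the kernel-mode faulting location. This is only a sketch: the function name `oops_rip` is made up, and it assumes the `RIP: 0010:` prefix seen in the logs above marks the kernel-mode instruction pointer (the `RIP: 0033:` lines are the userspace side of the `modprobe` process).

```bash
# Hypothetical filter: read a dmesg-style log on stdin and print each
# distinct kernel-mode faulting location (symbol+offset/size) once.
oops_rip() {
    sed -n 's/.*RIP: 0010:\([^ ]*\).*/\1/p' | sort -u
}

# Example using two lines copied from the log above.
oops_rip <<'EOF'
kern :warn : [ 347.302998] RIP: 0010:tpm_read_log_efi+0x17c/0x1b8 [tpm]
kern :warn : [ 347.303410] RIP: 0033:0x7ead3a5d3199
EOF
```

On an affected machine, the same filter could be fed from `dmesg` (which may need root) instead of the here-document.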
With the normal Chrome OS kernel config, an oops is treated as a panic, and when a panic happens, Chrome OS reboots automatically because of the following config: