Open 0xba472d opened 1 year ago
Describe the bug
The instance restarted after a kernel NULL pointer dereference.
Info:
mount options:
defaults,noatime,nodiratime,allocsize=256m,logbufs=8,logbsize=256k
kernel command line:
ro console=tty0 console=ttyS0,115200n8 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 security=selinux quiet selinux=0 preempt=full processor.max_cstate=1 intel_idle.max_cstate=1
disk mount tree:
NAME                                 MAJ:MIN RM    SIZE RO TYPE MOUNTPOINTS
nvme1n1                              259:0    0      5T  0 disk
└─nvme1n1p1                          259:7    0      5T  0 part
  └─storage-data_corig               253:2    0      5T  0 lvm
    └─storage-data                   253:3    0      5T  0 lvm  /data
nvme2n1                              259:1    0      1T  0 disk
└─nvme2n1p1                          259:3    0   1024G  0 part
  ├─storage-data_cache_cpool_cdata   253:0    0 1023.8G  0 lvm
  │ └─storage-data                   253:3    0      5T  0 lvm  /data
  └─storage-data_cache_cpool_cmeta   253:1    0     92M  0 lvm
    └─storage-data                   253:3    0      5T  0 lvm  /data
nvme0n1                              259:2    0     20G  0 disk
├─nvme0n1p1                          259:4    0     20G  0 part /
├─nvme0n1p127                        259:5    0      1M  0 part
└─nvme0n1p128                        259:6    0     10M  0 part
xfs info:
meta-data=/dev/mapper/storage-data isize=512    agcount=40, agsize=33554432 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=1342176256, imaxpct=5
         =                       sunit=512    swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
To Reproduce
Additional context
Console output:
Amazon Linux 2023 Kernel 6.1.41-63.114.amzn2023.x86_64 on an x86_64 (-)
[189819.791492] BUG: kernel NULL pointer dereference, address: 0000000000000082
[189819.792315] #PF: supervisor read access in kernel mode
[189819.792909] #PF: error_code(0x0000) - not-present page
[189819.793559] PGD 1314dd067 P4D 1314dd067 PUD 124a6f067 PMD 0
[189819.794232] Oops: 0000 [#1] PREEMPT SMP NOPTI
[189819.794743] CPU: 2 PID: 98840 Comm: geth Not tainted 6.1.41-63.114.amzn2023.x86_64 #1
[189819.795624] Hardware name: Amazon EC2 m7i.2xlarge/, BIOS 1.0 10/16/2017
[189819.796370] RIP: 0010:next_uptodate_page+0x45/0x1f0
[189819.796944] Code: 0f 84 4b 01 00 00 48 81 ff 06 04 00 00 0f 84 bf 00 00 00 48 81 ff 02 04 00 00 0f 84 42 01 00 00 40 f6 c7 01 0f 85 a8 00 00 00 <48> 8b 07 a8 01 0f 85 9d 00 00 00 8b 47 34 85 c0 0f 84 92 00 00 00
[189819.799027] RSP: 0000:ffffb322cee83ca8 EFLAGS: 00010246
[189819.799665] RAX: 0000000000000082 RBX: ffffb322cee83cf8 RCX: 0000000000003bbe
[189819.800587] RDX: ffffb322cee83cf8 RSI: ffff8941ad7276b0 RDI: 0000000000000082
[189819.801492] RBP: ffff8941ad7276b0 R08: 0000000000000402 R09: 0000000000003bbe
[189819.802419] R10: ffff8941ae70d950 R11: 0000000000003baf R12: 0000000000003bbe
[189819.803421] R13: 0000000000003baf R14: 0000000000003baf R15: ffff8941ad7276b0
[189819.804251] FS: 00007f833a0d6700(0000) GS:ffff89487e480000(0000) knlGS:0000000000000000
[189819.805158] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[189819.805812] CR2: 0000000000000082 CR3: 00000001acbbe005 CR4: 0000000000770ee0
[189819.806615] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[189819.807415] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[189819.808215] PKRU: 55555554
[189819.808546] Call Trace:
[189819.808862] <TASK>
[189819.809137] ? show_trace_log_lvl+0x1c4/0x2d2
[189819.809657] ? show_trace_log_lvl+0x1c4/0x2d2
[189819.810184] ? filemap_map_pages+0xad/0x4a0
[189819.810680] ? __die_body.cold+0x8/0xd
[189819.811127] ? page_fault_oops+0xac/0x150
[189819.811606] ? do_user_addr_fault+0x61/0x5a0
[189819.812111] ? kvm_read_and_reset_apf_flags+0x45/0x60
[189819.812694] ? exc_page_fault+0x62/0x140
[189819.813157] ? asm_exc_page_fault+0x22/0x30
[189819.813652] ? next_uptodate_page+0x45/0x1f0
[189819.814153] ? do_set_pte+0x106/0x220
[189819.814590] filemap_map_pages+0xad/0x4a0
[189819.815058] xfs_filemap_map_pages+0x42/0x70
[189819.815564] do_read_fault+0xd8/0x190
[189819.816023] do_fault+0xbe/0x4a0
[189819.816419] __handle_mm_fault+0x513/0x5e0
[189819.816907] handle_mm_fault+0xc5/0x2b0
[189819.817368] do_user_addr_fault+0x1af/0x5a0
[189819.817861] exc_page_fault+0x62/0x140
[189819.818373] asm_exc_page_fault+0x22/0x30
[189819.818952] RIP: 0033:0x4240ef
[189819.819441] Code: 24 58 48 83 c4 60 c3 49 ff c1 d1 ea 49 83 c0 08 49 83 f9 08 7d bd 0f 1f 44 00 00 4c 39 c3 76 b3 0f ba e2 00 73 e1 4d 8d 14 00 <4d> 8b 12 4d 85 d2 74 d5 4c 89 4c 24 40 4c 89 54 24 38 4c 89 44 24
[189819.821487] RSP: 002b:00007f833a0d6078 EFLAGS: 00010207
[189819.822093] RAX: 0000000003fb0140 RBX: 0000000000040000 RCX: 00007f83ce152d78
[189819.822896] RDX: 0000000000000055 RSI: 0000000000000000 RDI: 000000c000063c40
[189819.823698] RBP: 00007f833a0d60d0 R08: 0000000000000000 R09: 0000000000000000
[189819.824505] R10: 0000000003fb0140 R11: 0000000000000979 R12: 00007f833a0d61a0
[189819.825304] R13: 0000000000000000 R14: 000000c0130a0000 R15: 000000000001e430
[189819.826089] </TASK>
[189819.826366] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel nfs curve25519_x86_64 libcurve25519_generic libchacha lockd grace fscache sunrpc dm_cache_smq dm_cache dm_persistent_data dm_bio_prison dm_bufio ghash_clmulni_intel aesni_intel crypto_simd cryptd ena button tcp_bbr sch_fq_codel drm i2c_core drm_panel_orientation_quirks fuse backlight configfs dmi_sysfs crc32_pclmul crc32c_intel dm_mirror dm_region_hash dm_log dm_mod dax efivarfs
[189819.831513] CR2: 0000000000000082
[189819.831912] ---[ end trace 0000000000000000 ]---
[189819.859458] RIP: 0010:next_uptodate_page+0x45/0x1f0
[189819.860069] Code: 0f 84 4b 01 00 00 48 81 ff 06 04 00 00 0f 84 bf 00 00 00 48 81 ff 02 04 00 00 0f 84 42 01 00 00 40 f6 c7 01 0f 85 a8 00 00 00 <48> 8b 07 a8 01 0f 85 9d 00 00 00 8b 47 34 85 c0 0f 84 92 00 00 00
[189819.862117] RSP: 0000:ffffb322cee83ca8 EFLAGS: 00010246
[189819.862722] RAX: 0000000000000082 RBX: ffffb322cee83cf8 RCX: 0000000000003bbe
[189819.863535] RDX: ffffb322cee83cf8 RSI: ffff8941ad7276b0 RDI: 0000000000000082
[189819.864348] RBP: ffff8941ad7276b0 R08: 0000000000000402 R09: 0000000000003bbe
[189819.865179] R10: ffff8941ae70d950 R11: 0000000000003baf R12: 0000000000003bbe
[189819.865993] R13: 0000000000003baf R14: 0000000000003baf R15: ffff8941ad7276b0
[189819.866802] FS: 00007f833a0d6700(0000) GS:ffff89487e480000(0000) knlGS:0000000000000000
[189819.867709] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[189819.868373] CR2: 0000000000000082 CR3: 00000001acbbe005 CR4: 0000000000770ee0
[189819.869191] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[189819.870001] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[189819.870811] PKRU: 55555554
[189819.871143] Kernel panic - not syncing: Fatal exception
[189819.872379] Kernel Offset: 0x28000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Thanks for the bug report!
It has caught the attention of our kernel team. I don't have a timeline for any further information/questions/fix but wanted to ack that we have seen the report.