amazonlinux / amazon-linux-2023

Amazon Linux 2023
https://aws.amazon.com/linux/amazon-linux-2023/

[Bug] kernel 6.1 NULL pointer dereference #431

Open 0xba472d opened 1 year ago

0xba472d commented 1 year ago

Describe the bug

The instance restarted after the kernel hit a NULL pointer dereference and panicked (see the console output under "Additional context").

Info:

disk mount tree:

NAME                               MAJ:MIN RM    SIZE RO TYPE MOUNTPOINTS
nvme1n1                            259:0    0      5T  0 disk
└─nvme1n1p1                        259:7    0      5T  0 part
  └─storage-data_corig             253:2    0      5T  0 lvm
    └─storage-data                 253:3    0      5T  0 lvm  /data
nvme2n1                            259:1    0      1T  0 disk
└─nvme2n1p1                        259:3    0   1024G  0 part
  ├─storage-data_cache_cpool_cdata 253:0    0 1023.8G  0 lvm
  │ └─storage-data                 253:3    0      5T  0 lvm  /data
  └─storage-data_cache_cpool_cmeta 253:1    0     92M  0 lvm
    └─storage-data                 253:3    0      5T  0 lvm  /data
nvme0n1                            259:2    0     20G  0 disk
├─nvme0n1p1                        259:4    0     20G  0 part /
├─nvme0n1p127                      259:5    0      1M  0 part
└─nvme0n1p128                      259:6    0     10M  0 part

xfs info:

meta-data=/dev/mapper/storage-data isize=512    agcount=40, agsize=33554432 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=1342176256, imaxpct=5
         =                       sunit=512    swidth=512 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
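
As a quick sanity check on the geometry above, the sizes can be recomputed from the reported block counts. This is only a sketch: the variable names are mine, and the numbers are copied by hand from the xfs info output. It confirms that the data section is just under 5 TiB (matching the 5T logical volume) and that `agcount * agsize` slightly overshoots `blocks` because the last allocation group is short.

```python
# Values copied by hand from the xfs info output above.
BSIZE = 4096              # data bsize: bytes per filesystem block
DATA_BLOCKS = 1342176256  # data blocks
AGCOUNT = 40              # number of allocation groups
AGSIZE = 33554432         # blocks per (full) allocation group
LOG_BLOCKS = 521728       # internal log blocks

TIB = 1 << 40
GIB = 1 << 30

fs_bytes = DATA_BLOCKS * BSIZE
log_bytes = LOG_BLOCKS * BSIZE
# agcount * agsize is an upper bound; the final allocation group may be smaller.
ag_shortfall = AGCOUNT * AGSIZE - DATA_BLOCKS

print(f"data section: {fs_bytes} bytes ({fs_bytes / TIB:.4f} TiB)")
print(f"internal log: {log_bytes} bytes ({log_bytes / GIB:.2f} GiB)")
print(f"last AG is short by {ag_shortfall} blocks")
```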

To Reproduce

  1. Run geth and sync the Ethereum mainnet blockchain.

Additional context

Console output:


Amazon Linux 2023
Kernel 6.1.41-63.114.amzn2023.x86_64 on an x86_64 (-)

[189819.791492] BUG: kernel NULL pointer dereference, address: 0000000000000082
[189819.792315] #PF: supervisor read access in kernel mode
[189819.792909] #PF: error_code(0x0000) - not-present page
[189819.793559] PGD 1314dd067 P4D 1314dd067 PUD 124a6f067 PMD 0 
[189819.794232] Oops: 0000 [#1] PREEMPT SMP NOPTI
[189819.794743] CPU: 2 PID: 98840 Comm: geth Not tainted 6.1.41-63.114.amzn2023.x86_64 #1
[189819.795624] Hardware name: Amazon EC2 m7i.2xlarge/, BIOS 1.0 10/16/2017
[189819.796370] RIP: 0010:next_uptodate_page+0x45/0x1f0
[189819.796944] Code: 0f 84 4b 01 00 00 48 81 ff 06 04 00 00 0f 84 bf 00 00 00 48 81 ff 02 04 00 00 0f 84 42 01 00 00 40 f6 c7 01 0f 85 a8 00 00 00 <48> 8b 07 a8 01 0f 85 9d 00 00 00 8b 47 34 85 c0 0f 84 92 00 00 00
[189819.799027] RSP: 0000:ffffb322cee83ca8 EFLAGS: 00010246
[189819.799665] RAX: 0000000000000082 RBX: ffffb322cee83cf8 RCX: 0000000000003bbe
[189819.800587] RDX: ffffb322cee83cf8 RSI: ffff8941ad7276b0 RDI: 0000000000000082
[189819.801492] RBP: ffff8941ad7276b0 R08: 0000000000000402 R09: 0000000000003bbe
[189819.802419] R10: ffff8941ae70d950 R11: 0000000000003baf R12: 0000000000003bbe
[189819.803421] R13: 0000000000003baf R14: 0000000000003baf R15: ffff8941ad7276b0
[189819.804251] FS:  00007f833a0d6700(0000) GS:ffff89487e480000(0000) knlGS:0000000000000000
[189819.805158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[189819.805812] CR2: 0000000000000082 CR3: 00000001acbbe005 CR4: 0000000000770ee0
[189819.806615] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[189819.807415] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[189819.808215] PKRU: 55555554
[189819.808546] Call Trace:
[189819.808862]  <TASK>
[189819.809137]  ? show_trace_log_lvl+0x1c4/0x2d2
[189819.809657]  ? show_trace_log_lvl+0x1c4/0x2d2
[189819.810184]  ? filemap_map_pages+0xad/0x4a0
[189819.810680]  ? __die_body.cold+0x8/0xd
[189819.811127]  ? page_fault_oops+0xac/0x150
[189819.811606]  ? do_user_addr_fault+0x61/0x5a0
[189819.812111]  ? kvm_read_and_reset_apf_flags+0x45/0x60
[189819.812694]  ? exc_page_fault+0x62/0x140
[189819.813157]  ? asm_exc_page_fault+0x22/0x30
[189819.813652]  ? next_uptodate_page+0x45/0x1f0
[189819.814153]  ? do_set_pte+0x106/0x220
[189819.814590]  filemap_map_pages+0xad/0x4a0
[189819.815058]  xfs_filemap_map_pages+0x42/0x70
[189819.815564]  do_read_fault+0xd8/0x190
[189819.816023]  do_fault+0xbe/0x4a0
[189819.816419]  __handle_mm_fault+0x513/0x5e0
[189819.816907]  handle_mm_fault+0xc5/0x2b0
[189819.817368]  do_user_addr_fault+0x1af/0x5a0
[189819.817861]  exc_page_fault+0x62/0x140
[189819.818373]  asm_exc_page_fault+0x22/0x30
[189819.818952] RIP: 0033:0x4240ef
[189819.819441] Code: 24 58 48 83 c4 60 c3 49 ff c1 d1 ea 49 83 c0 08 49 83 f9 08 7d bd 0f 1f 44 00 00 4c 39 c3 76 b3 0f ba e2 00 73 e1 4d 8d 14 00 <4d> 8b 12 4d 85 d2 74 d5 4c 89 4c 24 40 4c 89 54 24 38 4c 89 44 24
[189819.821487] RSP: 002b:00007f833a0d6078 EFLAGS: 00010207
[189819.822093] RAX: 0000000003fb0140 RBX: 0000000000040000 RCX: 00007f83ce152d78
[189819.822896] RDX: 0000000000000055 RSI: 0000000000000000 RDI: 000000c000063c40
[189819.823698] RBP: 00007f833a0d60d0 R08: 0000000000000000 R09: 0000000000000000
[189819.824505] R10: 0000000003fb0140 R11: 0000000000000979 R12: 00007f833a0d61a0
[189819.825304] R13: 0000000000000000 R14: 000000c0130a0000 R15: 000000000001e430
[189819.826089]  </TASK>
[189819.826366] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel nfs curve25519_x86_64 libcurve25519_generic libchacha lockd grace fscache sunrpc dm_cache_smq dm_cache dm_persistent_data dm_bio_prison dm_bufio ghash_clmulni_intel aesni_intel crypto_simd cryptd ena button tcp_bbr sch_fq_codel drm i2c_core drm_panel_orientation_quirks fuse backlight configfs dmi_sysfs crc32_pclmul crc32c_intel dm_mirror dm_region_hash dm_log dm_mod dax efivarfs
[189819.831513] CR2: 0000000000000082
[189819.831912] ---[ end trace 0000000000000000 ]---
[189819.859458] RIP: 0010:next_uptodate_page+0x45/0x1f0
[189819.860069] Code: 0f 84 4b 01 00 00 48 81 ff 06 04 00 00 0f 84 bf 00 00 00 48 81 ff 02 04 00 00 0f 84 42 01 00 00 40 f6 c7 01 0f 85 a8 00 00 00 <48> 8b 07 a8 01 0f 85 9d 00 00 00 8b 47 34 85 c0 0f 84 92 00 00 00
[189819.862117] RSP: 0000:ffffb322cee83ca8 EFLAGS: 00010246
[189819.862722] RAX: 0000000000000082 RBX: ffffb322cee83cf8 RCX: 0000000000003bbe
[189819.863535] RDX: ffffb322cee83cf8 RSI: ffff8941ad7276b0 RDI: 0000000000000082
[189819.864348] RBP: ffff8941ad7276b0 R08: 0000000000000402 R09: 0000000000003bbe
[189819.865179] R10: ffff8941ae70d950 R11: 0000000000003baf R12: 0000000000003bbe
[189819.865993] R13: 0000000000003baf R14: 0000000000003baf R15: ffff8941ad7276b0
[189819.866802] FS:  00007f833a0d6700(0000) GS:ffff89487e480000(0000) knlGS:0000000000000000
[189819.867709] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[189819.868373] CR2: 0000000000000082 CR3: 00000001acbbe005 CR4: 0000000000770ee0
[189819.869191] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[189819.870001] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[189819.870811] PKRU: 55555554
[189819.871143] Kernel panic - not syncing: Fatal exception
[189819.872379] Kernel Offset: 0x28000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
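
For anyone triaging the oops above, the key fields can be pulled out mechanically. The sketch below is illustrative only: the helper names are hypothetical, and the excerpt is a shortened copy of the log (timestamps dropped, `Code:` line trimmed to its tail). In the `Code:` line, the byte wrapped in `<...>` is the first byte of the faulting instruction. Here those bytes are `48 8b 07`, which on x86-64 encodes `mov (%rdi),%rax`. Since RDI and CR2 both hold 0x82, the fault appears to be a read through a bogus pointer value of 0x82 at offset 0x45 of `next_uptodate_page`.

```python
import re

# Shortened excerpt of the console output above (timestamps removed).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000082
RIP: 0010:next_uptodate_page+0x45/0x1f0
Code: 40 f6 c7 01 0f 85 a8 00 00 00 <48> 8b 07 a8 01 0f 85 9d 00 00 00
RDX: ffffb322cee83cf8 RSI: ffff8941ad7276b0 RDI: 0000000000000082
CR2: 0000000000000082 CR3: 00000001acbbe005 CR4: 0000000000770ee0
"""


def register(name, text):
    """Return the value of a 64-bit register printed as 'NAME: <16 hex digits>'."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})", text)
    return int(match.group(1), 16)


def faulting_bytes(text, count=3):
    """Return `count` opcode bytes starting at the '<xx>' marker in the Code: line."""
    code = re.search(r"^Code: (.*)$", text, re.M).group(1).split()
    start = next(i for i, byte in enumerate(code) if byte.startswith("<"))
    return [byte.strip("<>") for byte in code[start:start + count]]


def rip_symbol(text):
    """Split 'symbol+0xOFF/0xSIZE' from the RIP line into (symbol, offset, size)."""
    match = re.search(r"^RIP: 0010:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text, re.M)
    return match.group(1), int(match.group(2), 16), int(match.group(3), 16)


symbol, offset, size = rip_symbol(OOPS)
print(f"fault in {symbol} at offset {offset:#x} of {size:#x}")
print("faulting instruction bytes:", " ".join(faulting_bytes(OOPS)))
print("RDI == CR2:", register("RDI", OOPS) == register("CR2", OOPS))
```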

stewartsmith commented 1 year ago

Thanks for the bug report!

It has caught the attention of our kernel team. I don't have a timeline for any further information/questions/fix but wanted to ack that we have seen the report.