netty / netty-incubator-transport-io_uring

Kernel warning and instability in 5.10 #166

Closed yuzawa-san closed 2 years ago

yuzawa-san commented 2 years ago

I am seeing unstable behavior on kernel 5.10. We upgraded to 5.15 and that seemed to fix things, but I am leaving this here for posterity in case anybody else runs into the same issue.

The IOUringEventLoopGroup appears to start and then fails to handle any tasks. The Java application continues to run without exceptions or failures. Successive runs cause the kernel to seize up and networking to become unresponsive in a cloud environment (AWS), rendering the instance wholly unusable. There also appeared to be occasional page faults. It looks like some internal kernel data structures are getting corrupted.

Is this just a buggy kernel version? Should this project increase the minimum supported kernel?

uname -a (this is Amazon Linux 2)

Linux ip-10-204-40-54.ec2.internal 5.10.118-111.515.amzn2.x86_64 #1 SMP Wed May 25 22:12:19 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
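For anyone stuck on an affected kernel, one possible application-side workaround is to check the kernel release before opting into io_uring. The following is only a minimal sketch: the class name `KernelVersionGuard`, its methods, and the 5.15 threshold are assumptions made for illustration (5.15 is simply the version that worked in this report, not a minimum documented by the project). It relies on the JVM reporting the kernel release through the `os.version` system property on Linux.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Netty: decides whether the running kernel
// is new enough to opt into io_uring, based on the kernel release string.
public final class KernelVersionGuard {
    // Matches the leading "major.minor" of a release such as
    // "5.10.118-111.515.amzn2.x86_64".
    private static final Pattern RELEASE = Pattern.compile("^(\\d+)\\.(\\d+)");

    // Assumed threshold: 5.15 is the version that behaved well in this report.
    public static final int MIN_MAJOR = 5;
    public static final int MIN_MINOR = 15;

    private KernelVersionGuard() {
    }

    // Returns true if the release string starts with a version >= major.minor.
    // Unparseable or null input is treated as "too old" to stay on the safe side.
    public static boolean isAtLeast(String release, int major, int minor) {
        if (release == null) {
            return false;
        }
        Matcher m = RELEASE.matcher(release);
        if (!m.find()) {
            return false;
        }
        int actualMajor = Integer.parseInt(m.group(1));
        int actualMinor = Integer.parseInt(m.group(2));
        return actualMajor > major || (actualMajor == major && actualMinor >= minor);
    }

    // Combines the OS name check with the assumed minimum kernel version.
    public static boolean shouldUseIoUring(String osName, String release) {
        return "Linux".equals(osName) && isAtLeast(release, MIN_MAJOR, MIN_MINOR);
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String release = System.getProperty("os.version");
        System.out.println("os=" + osName + " release=" + release
                + " useIoUring=" + shouldUseIoUring(osName, release));
    }
}
```

In an application, a check like this would presumably sit alongside the transport's own availability check (for example `IOUring.isAvailable()`) when choosing between IOUringEventLoopGroup and a fallback such as the epoll or NIO transport. It only avoids the code path on older kernels; it does not address the underlying kernel problem.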

/var/log/messages snippet:

Aug 15 19:56:11 ip-10-204-40-54 kernel: ------------[ cut here ]------------
Aug 15 19:56:11 ip-10-204-40-54 kernel: WARNING: CPU: 32 PID: 33572 at lib/iov_iter.c:1095 iov_iter_revert+0xb2/0x1d0
Aug 15 19:56:11 ip-10-204-40-54 kernel: Modules linked in: falcon_lsm_serviceable(PE) falcon_nf_netcontain(PE) falcon_kal(E) falcon_lsm_pinned_13804(E) binfmt_misc sunrpc dm_mirror dm_region_hash dm_log dm_mod dax crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper mousedev psmouse button ena crc32c_intel
Aug 15 19:56:11 ip-10-204-40-54 kernel: CPU: 32 PID: 33572 Comm: IOUringEventLoo Tainted: P      D W   E     5.10.118-111.515.amzn2.x86_64 #1
Aug 15 19:56:11 ip-10-204-40-54 kernel: Hardware name: Amazon EC2 c5.9xlarge/, BIOS 1.0 10/16/2017
Aug 15 19:56:11 ip-10-204-40-54 kernel: RIP: 0010:iov_iter_revert+0xb2/0x1d0
Aug 15 19:56:11 ip-10-204-40-54 kernel: Code: 4c 8d 40 01 48 83 c0 02 4c 89 47 20 48 39 ce 76 d6 48 83 ea 10 48 29 ce 8b 4a 08 48 89 47 20 48 83 c0 01 48 39 f1 72 e9 eb bd <0f> 0b c3 41 55 41 54 55 53 4c 8b 67 18 8b 5f 20 48 8b 57 08 41 8b
Aug 15 19:56:11 ip-10-204-40-54 kernel: RSP: 0018:ffffb9510874bbe0 EFLAGS: 00010202
Aug 15 19:56:11 ip-10-204-40-54 kernel: RAX: ffff8bb277fdbb88 RBX: 0000000000000008 RCX: 0000000000000000
Aug 15 19:56:11 ip-10-204-40-54 kernel: RDX: 0000000000000001 RSI: 0000744d88024480 RDI: ffffb9510874bc00
Aug 15 19:56:11 ip-10-204-40-54 kernel: RBP: ffff8ba1c6a7b600 R08: fffffffffffffff5 R09: ffff8ba1c456a500
Aug 15 19:56:11 ip-10-204-40-54 kernel: R10: 00007f71752c54b0 R11: 0000000000000000 R12: 0000000000000000
Aug 15 19:56:11 ip-10-204-40-54 kernel: R13: ffffb9510874bc00 R14: 0000000000000001 R15: ffffb9510874bc30
Aug 15 19:56:11 ip-10-204-40-54 kernel: FS:  00007f70207ca700(0000) GS:ffff8bb231e00000(0000) knlGS:0000000000000000
Aug 15 19:56:11 ip-10-204-40-54 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 15 19:56:11 ip-10-204-40-54 kernel: CR2: 00007f70207b5a20 CR3: 0000000f40ebe002 CR4: 00000000007706e0
Aug 15 19:56:11 ip-10-204-40-54 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 15 19:56:11 ip-10-204-40-54 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 15 19:56:11 ip-10-204-40-54 kernel: PKRU: 55555554
Aug 15 19:56:11 ip-10-204-40-54 kernel: Call Trace:
Aug 15 19:56:11 ip-10-204-40-54 kernel: io_read+0x35e/0x380
Aug 15 19:56:11 ip-10-204-40-54 kernel: ? kernel_init_free_pages+0x46/0x60
Aug 15 19:56:11 ip-10-204-40-54 kernel: ? prep_new_page+0x6c/0x80
Aug 15 19:56:11 ip-10-204-40-54 kernel: ? xas_alloc+0x9b/0xc0
Aug 15 19:56:11 ip-10-204-40-54 kernel: ? get_page_from_freelist+0x2e2/0x340
Aug 15 19:56:11 ip-10-204-40-54 kernel: io_issue_sqe+0x53c/0x980
Aug 15 19:56:11 ip-10-204-40-54 kernel: ? io_req_prep+0x70c/0xd50
Aug 15 19:56:11 ip-10-204-40-54 kernel: ? xas_alloc+0x9b/0xc0
Aug 15 19:56:11 ip-10-204-40-54 kernel: __io_queue_sqe+0x88/0x200
Aug 15 19:56:11 ip-10-204-40-54 kernel: io_submit_sqe+0x1e8/0x280
Aug 15 19:56:11 ip-10-204-40-54 kernel: io_submit_sqes+0x1c6/0x5a0
Aug 15 19:56:11 ip-10-204-40-54 kernel: __do_sys_io_uring_enter+0x237/0x3a0
Aug 15 19:56:11 ip-10-204-40-54 kernel: do_syscall_64+0x33/0x40
Aug 15 19:56:11 ip-10-204-40-54 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 15 19:56:11 ip-10-204-40-54 kernel: RIP: 0033:0x7f717ce692e9
Aug 15 19:56:11 ip-10-204-40-54 kernel: Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 67 cb 2b 00 f7 d8 64 89 01 48
Aug 15 19:56:11 ip-10-204-40-54 kernel: RSP: 002b:00007f70207c95c8 EFLAGS: 00000202 ORIG_RAX: 00000000000001aa
Aug 15 19:56:11 ip-10-204-40-54 kernel: RAX: ffffffffffffffda RBX: 00000000000000c6 RCX: 00007f717ce692e9
Aug 15 19:56:11 ip-10-204-40-54 kernel: RDX: 0000000000000001 RSI: 0000000000000002 RDI: 00000000000000c6
Aug 15 19:56:11 ip-10-204-40-54 kernel: RBP: 00007f70207c95e0 R08: 0000000000000000 R09: 0000000000000008
Aug 15 19:56:11 ip-10-204-40-54 kernel: R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000000001
Aug 15 19:56:11 ip-10-204-40-54 kernel: R13: 0000000000000002 R14: 0000000000000001 R15: 00007f7175b3d260
Aug 15 19:56:11 ip-10-204-40-54 kernel: ---[ end trace 310f2be91541cf09 ]---

Seems to be coming from https://elixir.bootlin.com/linux/v5.10.118/source/lib/iov_iter.c#L1095

Possible lead from https://man.archlinux.org/man/io_uring_setup.2.en

Before version 5.11 of the Linux kernel, to successfully use this feature, the application must register a set of files to be used for IO through io_uring_register(2) using the IORING_REGISTER_FILES opcode. Failure to do so will result in submitted IO being errored with EBADF. The presence of this feature can be detected by the IORING_FEAT_SQPOLL_NONFIXED feature flag. In version 5.11 and later, it is no longer necessary to register files to use this feature. 5.11 also allows using this as non-root, if the user has the CAP_SYS_NICE capability.

chrisvest commented 2 years ago

> Possible lead

We don't use SQPOLL, so that doesn't apply.

chrisvest commented 2 years ago

Closing this since I think it's a kernel bug.