tempesta-tech / tempesta

All-in-one solution for high performance web content delivery and advanced protection against DDoS and web attacks
https://tempesta-tech.com/
GNU General Public License v2.0

Kernel BUG under MTU 80 #2059

Open EvgeniiMekhanik opened 6 months ago

EvgeniiMekhanik commented 6 months ago

Start test: t_stress.test_stress.H2LoadStressMTU80.test
[ 256.266865] [tempesta fw] Open listen socket on: 0.0.0.0:443
[ 256.282750] [tempesta fw] Tempesta FW is ready
[ 256.923694] ------------[ cut here ]------------
[ 256.924052] kernel BUG at net/ipv4/tcp_input.c:1440!
[ 256.924390] invalid opcode: 0000 [#1] SMP NOPTI
[ 256.924710] CPU: 6 PID: 7478 Comm: h2load Tainted: G OE 5.10.35+ #273
[ 256.925254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 256.925861] RIP: 0010:tcp_shifted_skb+0x32a/0x3b0
[ 256.926213] Code: ff 0f 0b 48 c7 c7 50 c4 35 92 e8 ce a8 16 00 0f 0b e9 9e fd ff ff 4c 89 e7 e8 b2 8f ae ff 49 89 85 20 08 00 00 e9 16 fe ff ff <0f> 0b 48 c7 c7 40 c4 35 92 e8 a5 a8 16 00 48 c7 43 50 00 00 00 00
[ 256.927483] RSP: 0018:ffffa815001fc9a0 EFLAGS: 00010246
[ 256.927829] RAX: 0000000000000010 RBX: ffff8c3e72494000 RCX: 0000000000000004
[ 256.928298] RDX: 0000000000000000 RSI: 000000000f50d071 RDI: ffff8c3fb828d580
[ 256.928879] RBP: ffffa815001fc9d0 R08: 0000000000000001 R09: 0000000000000000
[ 256.929406] R10: 0000000000000000 R11: ffff8c3e72499800 R12: ffff8c3e72499800
[ 256.929931] R13: ffff8c3fb828d580 R14: 0000000000000004 R15: 0000000000000090
[ 256.930427] FS: 00007fe2f2fff640(0000) GS:ffff8c4077d80000(0000) knlGS:0000000000000000
[ 256.930978] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 256.931700] CR2: 0000562b5920e870 CR3: 00000001581be000 CR4: 0000000000750ee0
[ 256.932292] PKRU: 55555554
[ 256.932494] Call Trace:
[ 256.932669]
[ 256.932830] tcp_sacktag_walk+0x37d/0x570
[ 256.933139] tcp_sacktag_write_queue+0x6b6/0x1620
[ 256.933477] ? tcp_rack_mark_lost+0x65/0x100
[ 256.933824] ? tfw_h2_find_not_closed_stream+0x1b/0x60 [tempesta_fw]
[ 256.934284] ? bzero_fast+0xe/0x10 [tempesta_lib]
[ 256.934718] tcp_ack+0xd9b/0x1760
[ 256.934967] tcp_rcv_established+0x1a5/0x6b0
[ 256.935265] tcp_v4_do_rcv+0x140/0x200
[ 256.935572] tcp_v4_rcv+0xcd0/0xe20
[ 256.935864] ip_protocol_deliver_rcu+0x44/0x230
[ 256.936175] ip_local_deliver_finish+0x48/0x60
[ 256.936488] ip_local_deliver+0xf8/0x110
[ 256.936760] ? ip_protocol_deliver_rcu+0x230/0x230
[ 256.937092] ip_rcv_finish+0x87/0xa0
[ 256.937346] ip_rcv+0xce/0xe0
[ 256.937557] ? ip_rcv_finish_core.constprop.0+0x470/0x470
[ 256.937925] netif_receive_skb_one_core+0x86/0xa0
[ 256.938260] netif_receive_skb+0x18/0x60
[ 256.938557] process_backlog+0x9e/0x170
[ 256.938831] net_rx_action+0x13b/0x430

Reproduced while I was working on 2047. To reproduce, run http2_general.test_h2_headers.AddBackendLongHeaders.test and http2_general.test_h2_headers.TestIPv6.test_request_with_some_data, then run t_stress.test_stress.H2LoadStressMTU80.test.
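
The test order above can be sketched as a small Python helper. This is only a sketch under assumptions: the `./run_tests.py` runner path and passing one test name per invocation are guesses about how the tempesta-test checkout is launched, and the helper names (`build_repro_commands`, `run_repro`) are hypothetical. By default it only prints the commands rather than executing them.

```python
import shlex
import subprocess

# Test names and their order are taken from the report above.
WARMUP_TESTS = [
    "http2_general.test_h2_headers.AddBackendLongHeaders.test",
    "http2_general.test_h2_headers.TestIPv6.test_request_with_some_data",
]
CRASH_TEST = "t_stress.test_stress.H2LoadStressMTU80.test"


def build_repro_commands(runner="./run_tests.py"):
    """Return one argv list per test, in the order given in the report.

    `runner` is an assumption: adjust it to however tests are launched
    in your tempesta-test checkout.
    """
    return [[runner, name] for name in WARMUP_TESTS + [CRASH_TEST]]


def run_repro(runner="./run_tests.py", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_repro_commands(runner):
        print(shlex.join(argv))
        if not dry_run:
            # check=False: continue even if a warm-up test fails.
            subprocess.run(argv, check=False)


run_repro()  # dry run: prints the three commands without running them
```

Calling `run_repro(dry_run=False)` would actually launch the tests, so use it only on a disposable VM, since the last test is the one that triggered the kernel BUG.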

EvgeniiMekhanik commented 6 months ago

ee2d801f083ab66f0abeb88953dfe8363e0c9584 Tempesta FW
22b085391310464e2ea26d34723c0b7a3a807aa5 tempesta-test

krizhanovsky commented 6 months ago

Probably one more trace for this bug, or possibly a different one; it appears frequently on the webserver:

2024-03-06 17:19:29.257
[598756.759920] Modules linked in: tempesta_fw(O) tempesta_db(O) tempesta_tls(O) tempesta_lib(O) sg serio_raw virtio_balloon virtio_console button loop sch_fq_codel msr drm efi_pstore configfs ip_tables x_tables ext4 crc16 mbcache jbd2 efivars linear md_mod sr_mod cdrom virtio_net virtio_rng net_failover rng_core virtio_blk failover ahci libahci libata crct10dif_pclmul scsi_mod psmouse i2c_i801 lpc_ich i2c_smbus mfd_core virtio_pci virtio_ring virtio
[598756.761163] CPU: 19 PID: 0 Comm: swapper/19 Tainted: G        W  O      5.10.35-tfw-secperf+ #1
[598756.764441] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[598756.765299] RIP: 0010:tcp_sacktag_write_queue+0x71b/0x780
[598756.765962] Code: 68 3b 47 2c 79 55 48 83 c3 08 e9 58 fc ff ff 48 83 c7 08 48 8b 3f 48 85 ff 0f 85 b3 fe ff ff e9 c0 fe ff ff 48 83 c7 10 eb e9 <0f> 0b e9 81 fa ff ff 0f 0b e9 6e fa ff ff 0f 0b e9 58 fa ff ff 0f
[598756.766758] RSP: 0018:ffffad1a40458ca8 EFLAGS: 00010283
[598756.768229] RAX: 0000000000000000 RBX: ffff9cedccce7c84 RCX: 0000000000000030
[598756.768748] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff9cedcc1daf80
[598756.769699] RBP: ffff9cedccce7b00 R08: 0000000000000000 R09: ffffad1a40458d50
[598756.770373] R10: 0000000000000000 R11: 00000000ef3d3232 R12: 0000000000000003
[598756.770917] R13: ffff9cedcc1daf80 R14: 00000000ef3bb94e R15: 0000000000000003
[598756.771743] FS:  0000000000000000(0000) GS:ffff9cef2fcc0000(0000) knlGS:0000000000000000
[598756.772393] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[598756.773135] CR2: 00007f7057397f58 CR3: 0000000302364001 CR4: 0000000000770ee0
[598756.773830] PKRU: 55555554
[598756.774404] Call Trace:
2024-03-06 17:19:29.258
[598756.774869]  <IRQ>
[598756.775208]  ? free_unref_page_commit+0x86/0x110
[598756.775603]  tcp_ack+0x4a2/0x810
[598756.776003]  ? tcp_write_xmit+0x399/0x9c0
[598756.776527]  tcp_rcv_established+0x173/0x620
[598756.776899]  ? tfw_filter_check_ip+0x3a/0xa0 [tempesta_fw]
[598756.777407]  tcp_v4_do_rcv+0x126/0x1e0
[598756.778043]  tcp_v4_rcv+0xbb4/0xce0
[598756.778334]  ? tcp_v4_early_demux+0xa3/0x140
[598756.778815]  ? tcp_v4_early_demux+0xe7/0x140
[598756.779490]  ip_protocol_deliver_rcu+0x14/0x170
[598756.779835]  ip_local_deliver_finish+0x3f/0x50
[598756.780459]  __netif_receive_skb_one_core+0x40/0x50
[598756.780959]  process_backlog+0x83/0x120
[598756.781501]  napi_poll+0x8a/0x1b0
[598756.781822]  net_rx_action+0x95/0x180
[598756.782431]  __do_softirq+0xc0/0x26e
[598756.782817]  asm_call_irq_on_stack+0xf/0x20
[598756.783131]  </IRQ>
[598756.783676]  do_softirq_own_stack+0x32/0x40
[598756.783912]  irq_exit_rcu+0x83/0xb0
[598756.784412]  sysvec_call_function_single+0x2e/0x80
[598756.784850]  asm_sysvec_call_function_single+0x12/0x20
[598756.785434] RIP: 0010:default_idle+0xe/0x10
[598756.785864] Code: fb 65 48 8b 04 25 00 6d 01 00 f0 80 60 02 df c3 0f ae f0 0f ae 38 0f ae f0 eb b6 90 e9 07 00 00 00 0f 00 2d 16 c8 47 00 fb f4 <c3> cc 65 8b 15 c9 89 88 59 89 d2 48 8b 05 30 26 67 01 48 03 04 d5
[598756.786470] RSP: 0018:ffffad1a40103ef0 EFLAGS: 00000206
[598756.788155] RAX: ffff9cef2fceab40 RBX: ffff9cedc023d700 RCX: ffff9cef2fceab40
[598756.788704] RDX: 0000000000957312 RSI: 0000000000000013 RDI: ffff9cef2fcde180
[598756.789488] RBP: 0000000000000000 R08: 000000007c5145a0 R09: 0000000000000000
[598756.790194] R10: 000000000002e42e R11: 0000000000000000 R12: 0000000000000000
[598756.790869] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[598756.791681]  default_idle_call+0x2d/0xa0
[598756.792411]  cpuidle_idle_call+0x133/0x190
[598756.792901]  ? tsc_verify_tsc_adjust+0x2d/0xc0
[598756.793569]  do_idle+0x68/0xb0
[598756.793906]  cpu_startup_entry+0x14/0x20
[598756.794327]  secondary_startup_64_no_verify+0xb0/0xbb
[598756.794712] ---[ end trace d2a3a97994ee5a52 ]---
[598756.795228] ------------[ cut here ]------------
[598756.795801] WARNING: CPU: 19 PID: 0 at net/ipv4/tcp_input.c:1958 tcp_sacktag_write_queue+0x71b/0x780

EvgeniiMekhanik commented 6 months ago

2024-03-05 11:00:26.002 [489613.530884] WARNING: CPU: 18 PID: 0 at net/ipv4/tcp_input.c:2899 tcp_fastretrans_alert+0x5eb/0xa00

krizhanovsky commented 4 months ago

Both warnings (https://github.com/tempesta-tech/tempesta/issues/2059#issuecomment-1954544110 and https://github.com/tempesta-tech/tempesta/issues/2059#issuecomment-1983814692) are fixed in #2079.

The initial BUG_ON() isn't fixed yet and has not been reproduced again.