tempesta-tech / tempesta

All-in-one solution for high performance web content delivery and advanced protection against DDoS and web attacks
https://tempesta-tech.com/
GNU General Public License v2.0
616 stars 103 forks source link

Kernel crashes under concurrent connections tests #1703

Closed b3b closed 1 year ago

b3b commented 2 years ago

Kernel panic occurs when:

Example command to create a load: wrk -d 120 -t10 -c 100 http://127.0.0.1:80/

kernel BUG at net/core/skbuff.c:2169!
[  140.405287] [tempesta fw]   frang: check request for client 192.168.122.1, acc=000000005a34ff45
[  140.407579] [tempesta fw]   frang: check incomplete request headers for client 192.168.122.1, acc=000000005a34ff45
[  140.408795] [tempesta fw]   frang: check incomplete request body for client 192.168.122.1, acc=000000005a34ff45
[  140.409860] [tempesta fw]   frang: checks done for client 192.168.122.1
[  140.411014] [tempesta fw]   frang: check request for client 192.168.122.1, acc=000000005a34ff45
[  140.412054] [tempesta fw]   frang: check incomplete request headers for client 192.168.122.1, acc=000000005a34ff45
[  140.413124] [tempesta fw]   frang: check incomplete request body for client 192.168.122.1, acc=000000005a34ff45
[  140.414166] [tempesta fw]   frang: checks done for client 192.168.122.1
[  141.198111] ------------[ cut here ]------------
[  141.199423] kernel BUG at net/core/skbuff.c:2169!
[  141.200274] invalid opcode: 0000 [#1] SMP NOPTI
[  141.200969] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G           OE     5.10.35+ #1
[  141.202020] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[  141.203051] RIP: 0010:skb_put+0x1e/0x50
[  141.203786] Code: 41 bc f4 ff ff ff eb de 0f 1f 40 00 0f 1f 44 00 00 8b 87 b8 00 00 00 48 89 c2 48 03 87 c0 00 00 00 f6 87 83 00 00 00 01 74 02 <0f> 0b 8b 4f 74 85 c9 75 f7 01 f2 01 77 70 89 97 b8 00 00 00 3b 97
[  141.206610] RSP: 0018:ffffb722000c8850 EFLAGS: 00010202
[  141.207634] RAX: ffff8c626522d280 RBX: 00000000000005a8 RCX: 0000000000000000
[  141.209026] RDX: 0000000000000180 RSI: 00000000000005a8 RDI: ffff8c626522d000
[  141.210399] RBP: ffffb722000c8928 R08: 0000000000000042 R09: ffff8c626522d100
[  141.211763] R10: 00000000000007c0 R11: 0000000000000000 R12: 0000000000000000
[  141.213133] R13: 00000000000005a8 R14: ffff8c6261ece0e0 R15: ffff8c626522d000
[  141.214496] FS:  0000000000000000(0000) GS:ffff8c627bd00000(0000) knlGS:0000000000000000
[  141.215946] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  141.217217] CR2: 00007f1d3b6ca120 CR3: 0000000123f4c006 CR4: 0000000000770ee0
[  141.218581] PKRU: 55555554
[  141.219525] Call Trace:
[  141.220442]  
[  141.221770]  ? skb_segment+0x47c/0xdf0
[  141.222802]  tcp_gso_segment+0xf0/0x520
[  141.223813]  ? tcp_rtx_synack+0xa0/0xa0
[  141.224826]  tcp4_gso_segment+0x53/0xd0
[  141.225821]  inet_gso_segment+0x14d/0x3b0
[  141.226844]  ? tfw_http_parse_resp+0x966/0xfdc0 [tempesta_fw]
[  141.228021]  skb_mac_gso_segment+0x99/0x110
[  141.229035]  __skb_gso_segment+0xce/0x190
[  141.230030]  validate_xmit_skb+0x16d/0x370
[  141.231024]  validate_xmit_skb_list+0x4d/0x70
[  141.232066]  sch_direct_xmit+0x150/0x340
[  141.233037]  __dev_queue_xmit+0x5ad/0x910
[  141.234019]  ? update_load_avg+0x7c/0x630
[  141.234982]  dev_queue_xmit+0x10/0x20
[  141.235911]  ip_finish_output2+0x269/0x590
[  141.236873]  ? __cgroup_bpf_run_filter_skb+0x3c3/0x3d0
[  141.238189]  __ip_finish_output+0xd8/0x220
[  141.239149]  ip_finish_output+0x2d/0xb0
[  141.240077]  ip_output+0x7a/0x100
[  141.240843]  ip_local_out+0x3d/0x50
[  141.241582]  __ip_queue_xmit+0x17a/0x470
[  141.242327]  ip_queue_xmit+0x15/0x20
[  141.243038]  __tcp_transmit_skb+0xa20/0xbe0
[  141.243793]  tcp_write_xmit+0x453/0x1350
[  141.244529]  __tcp_push_pending_frames+0x37/0x100
[  141.245311]  tcp_push+0xfc/0x100
[  141.245994]  ss_tx_action+0x2f7/0x5f0 [tempesta_fw]
[  141.246789]  net_tx_action+0x9c/0x250
[  141.247480]  __do_softirq+0xd9/0x291
[  141.248178]  asm_call_irq_on_stack+0xf/0x20
[  141.248952]  
[  141.249555]  do_softirq_own_stack+0x3d/0x50
[  141.250424]  irq_exit_rcu+0xa4/0xb0
[  141.251898]  sysvec_call_function_single+0x3d/0x90
[  141.253864]  asm_sysvec_call_function_single+0x12/0x20
[  141.255526] RIP: 0010:native_safe_halt+0xe/0x10
[  141.256393] Code: 7b ff ff ff eb bd cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 26 bf 47 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 16 bf 47 00 fb f4  cc 0f 1f 44 00 00 55 48 89 e5 53 65 8b 15 9f 0d 48 79 0f 1f 44
[  141.259014] RSP: 0018:ffffb7220008be90 EFLAGS: 00000202
[  141.259956] RAX: ffffffff86b90630 RBX: 0000000000000001 RCX: ffff8c627bd2cdc0
[  141.261077] RDX: 0000000000a8908a RSI: 0000000000000087 RDI: 0000000000000087
[  141.262178] RBP: ffffb7220008be98 R08: 000009a8dd69e3ea R09: 00000020d9c873fc
[  141.263281] R10: 000000000000000f R11: 0000000000000000 R12: ffff8c62602abc80
[  141.264365] R13: ffff8c62602abc80 R14: 0000000000000000 R15: 0000000000000000
[  141.265467]  ? __cpuidle_text_start+0x8/0x8
[  141.266293]  ? default_idle+0xe/0x20
[  141.267057]  arch_cpu_idle+0x15/0x20
[  141.267839]  default_idle_call+0x38/0xc0
[  141.268726]  do_idle+0x1f8/0x260
[  141.270052]  ? do_idle+0xf/0x260
[  141.271156]  cpu_startup_entry+0x20/0x30
[  141.272364]  start_secondary+0x111/0x150
[  141.273569]  secondary_startup_64_no_verify+0xb0/0xbb
[  141.274649] Modules linked in: tempesta_fw(OE) tempesta_db(OE) sha256_ssse3 sha512_ssse3 tempesta_tls(OE) tempesta_lib(OE) nls_utf8 isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common kvm_intel kvm ppdev parport_pc joydev input_leds parport mac_hid serio_raw qemu_fw_cfg sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel bochs_drm crypto_simd drm_vram_helper drm_ttm_helper cryptd glue_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec psmouse drm virtio_net net_failover i2c_piix4 failover pata_acpi floppy

Related issues: #1377

Testing

Test to reproduce: stress.test_stress

b3b commented 2 years ago

Crash is also reproduced with HTTPS and HTTP/2 tests: stress.test_stress.H2CurlStress

A few more crashes occurs (not reproduced):

skb_copy_and_csum_bits ... kernel BUG at net/core/skbuff.c:2169
[   12.080038] ------------[ cut here ]------------
[   12.081919] kernel BUG at net/core/skbuff.c:2169!
[   12.082252] invalid opcode: 0000 [#1] SMP NOPTI
[   12.082615] CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G           OE     5.10.35+ #1
[   12.083198] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   12.083750] RIP: 0010:skb_put+0x1e/0x50
[   12.084009] Code: 41 bc f4 ff ff ff eb de 0f 1f 40 00 0f 1f 44 00 00 8b 87 b8 00 00 00 48 89 c2 48 03 87 c0 00 00 00 f6 87 83 00 00 00 01 74 02 <0f> 0b 8b 4f 74 85 c9 75 f7 01 f2 01 77 70 89 97 b8 00 00 00 3b 97
[   12.085230] RSP: 0018:ffffae8d400c85b0 EFLAGS: 00010202
[   12.085578] RAX: ffff898888996253 RBX: 00000000000005a8 RCX: 0000000000000000
[   12.086047] RDX: 0000000000000153 RSI: 00000000000005a8 RDI: ffff898888996000
[   12.086516] RBP: ffffae8d400c8688 R08: 0000000000000042 R09: ffff898888996100
[   12.086992] R10: 00000000000007c0 R11: 0000000000000000 R12: 0000000000000000
[   12.087461] R13: 00000000000005a8 R14: ffff898893e4e8e0 R15: ffff898888996000
[   12.087931] FS:  0000000000000000(0000) GS:ffff8989bbd00000(0000) knlGS:0000000000000000
[   12.088462] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.088841] CR2: 0000559c0b3221a0 CR3: 0000000125560002 CR4: 0000000000770ee0
[   12.089313] PKRU: 55555554
[   12.089497] Call Trace:
[   12.089670]  
[   12.089812]  ? skb_segment+0x47c/0xdf0
[   12.090065]  ? skb_copy_and_csum_bits+0x103/0x250
[   12.090385]  tcp_gso_segment+0xf0/0x520
[   12.090645]  ? tcp_rtx_synack+0xa0/0xa0
[   12.090907]  tcp4_gso_segment+0x53/0xd0
[   12.091165]  inet_gso_segment+0x14d/0x3b0
[   12.091435]  ? napi_alloc_frag+0x20/0x30
[   12.091699]  skb_mac_gso_segment+0x99/0x110
[   12.091982]  __skb_gso_segment+0xce/0x190
[   12.092251]  validate_xmit_skb+0x16d/0x370
[   12.092525]  validate_xmit_skb_list+0x4d/0x70
[   12.092818]  sch_direct_xmit+0x150/0x340
[   12.093081]  __qdisc_run+0x149/0x560
[   12.093330]  __dev_queue_xmit+0x5c4/0x910
[   12.093612]  ? tfw_tls_encrypt+0x560/0x850 [tempesta_fw]
[   12.093965]  dev_queue_xmit+0x10/0x20
[   12.094212]  ip_finish_output2+0x269/0x590
[   12.094489]  ? __cgroup_bpf_run_filter_skb+0x3c3/0x3d0
[   12.094881]  __ip_finish_output+0xd8/0x220
[   12.095158]  ip_finish_output+0x2d/0xb0
[   12.095852]  ip_output+0x7a/0x100
[   12.096457]  ip_local_out+0x3d/0x50
[   12.097178]  __ip_queue_xmit+0x17a/0x470
[   12.098042]  ip_queue_xmit+0x15/0x20
[   12.098622]  __tcp_transmit_skb+0xa20/0xbe0
[   12.099246]  tcp_write_xmit+0x453/0x1350
[   12.099838]  ? __kfree_skb+0x1b/0x20
[   12.100355]  __tcp_push_pending_frames+0x37/0x100
[   12.100931]  tcp_rcv_established+0x26c/0x670
[   12.101499]  ? sk_filter_trim_cap+0xde/0x240
[   12.102078]  tcp_v4_do_rcv+0x140/0x200
[   12.102593]  tcp_v4_rcv+0xcfd/0xe10
[   12.103156]  ip_protocol_deliver_rcu+0x30/0x1b0
[   12.103774]  ip_local_deliver_finish+0x48/0x60
[   12.104318]  ip_local_deliver+0x72/0x110
[   12.104804]  ? tcp_v4_early_demux+0xfa/0x160
[   12.105305]  ? ip_rcv_finish_core.isra.0+0x146/0x420
[   12.105852]  ip_rcv_finish+0x87/0xa0
[   12.106309]  ip_rcv+0xcc/0xe0
[   12.106728]  ? ip_rcv_finish_core.isra.0+0x420/0x420
[   12.107326]  __netif_receive_skb_one_core+0x88/0xa0
[   12.107912]  __netif_receive_skb+0x18/0x60
[   12.108441]  process_backlog+0xa9/0x160
[   12.108949]  net_rx_action+0x13e/0x390
[   12.109449]  ? vring_interrupt+0x5f/0xc0
[   12.109953]  __do_softirq+0xd9/0x291
[   12.110436]  asm_call_irq_on_stack+0xf/0x20
[   12.110966]  
[   12.111359]  do_softirq_own_stack+0x3d/0x50
[   12.111882]  irq_exit_rcu+0xa4/0xb0
[   12.112317]  sysvec_call_function_single+0x3d/0x90
[   12.112829]  asm_sysvec_call_function_single+0x12/0x20
[   12.113937] RIP: 0010:native_safe_halt+0xe/0x10
[   12.114453] Code: 7b ff ff ff eb bd cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 26 bf 47 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 16 bf 47 00 fb f4  cc 0f 1f 44 00 00 55 48 89 e5 53 65 8b 15 9f 0d c8 54 0f 1f 44
[   12.116195] RSP: 0018:ffffae8d4008be90 EFLAGS: 00000202
[   12.116826] RAX: ffffffffab390630 RBX: 0000000000000001 RCX: ffff8989bbd2cdc0
[   12.117534] RDX: 000000000002c8da RSI: 00000002ca0d613e RDI: 0000000000000082
[   12.118243] RBP: ffffae8d4008be98 R08: 000000cd42e4dffb R09: 00000002cb7b973e
[   12.119007] R10: 0000000000000002 R11: 000000000000002a R12: ffff8989a02a8000
[   12.119776] R13: ffff8989a02a8000 R14: 0000000000000000 R15: 0000000000000000
[   12.120571]  ? __cpuidle_text_start+0x8/0x8
[   12.121161]  ? default_idle+0xe/0x20
[   12.121713]  arch_cpu_idle+0x15/0x20
[   12.122253]  default_idle_call+0x38/0xc0
[   12.122832]  do_idle+0x1f8/0x260
[   12.123373]  ? complete+0x3f/0x50
[   12.123893]  cpu_startup_entry+0x20/0x30
[   12.124455]  start_secondary+0x111/0x150
[   12.125022]  secondary_startup_64_no_verify+0xb0/0xbb
[   12.125656] Modules linked in: tempesta_fw(OE) tempesta_db(OE) sha256_ssse3 sha512_ssse3 tempesta_tls(OE) tempesta_lib(OE) nls_utf8 isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common kvm_intel kvm ppdev joydev input_leds mac_hid parport_pc parport serio_raw qemu_fw_cfg sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec floppy aesni_intel crypto_simd cryptd glue_helper psmouse drm virtio_net net_failover failover i2c_piix4 pata_acpi
BUG: kernel NULL pointer dereference, address: 0000000000000030
[  130.651577] ------------[ cut here ]------------
[  130.652572] WARNING: CPU: 0 PID: 12 at /home/ubuntu/tempesta_dev/fw/http_frame.c:232 tfw_h2_context_clear+0x1b/0x30 [tempesta_fw]
[  130.654132] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) sha256_ssse3 sha512_ssse3 nls_utf8 isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common kvm_intel kvm ppdev input_leds joydev serio_raw mac_hid qemu_fw_cfg parport_pc parport sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm aesni_intel crypto_simd cryptd glue_helper psmouse virtio_net net_failover failover i2c_piix4 pata_acpi floppy [last unloaded: tempesta_lib]
[  130.662848] CPU: 0 PID: 12 Comm: ksoftirqd/0 Kdump: loaded Tainted: G    B   W  OE     5.10.35+ #1
[  130.663925] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[  130.665083] RIP: 0010:tfw_h2_context_clear+0x1b/0x30 [tempesta_fw]
[  130.666162] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 83 7f 38 00 48 89 e5 75 0b 48 83 c7 68 e8 e7 18 05 00 5d c3 <0f> 0b 48 83 c7 68 e8 da 18 05 00 5d c3 0f 1f 84 00 00 00 00 00 0f
[  130.669287] RSP: 0018:ffffa2a94006bc80 EFLAGS: 00010202
[  130.670129] RAX: 0000000000000003 RBX: ffffa2a94006ba80 RCX: 0000000000000000
[  130.671076] RDX: ffff94d8e45641c0 RSI: 000000000000000a RDI: ffff94d8e45648b0
[  130.672026] RBP: ffffa2a94006bc80 R08: 0000000000000000 R09: 0000000000000000
[  130.673008] R10: ffff94d8fffd3680 R11: 00000000fffffffc R12: ffff94d8e45641c0
[  130.673928] R13: ffff94d8e4564530 R14: ffff94d8e43f1c80 R15: ffff94d8fbc2fd80
[  130.674831] FS:  0000000000000000(0000) GS:ffff94d8fbc00000(0000) knlGS:0000000000000000
[  130.675838] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.676907] CR2: 0000557dd1f0d774 CR3: 0000000128c30006 CR4: 0000000000770ef0
[  130.677860] PKRU: 55555554
[  130.678846] Call Trace:
[  130.679501]  tfw_tls_conn_dtor+0x193/0x1a0 [tempesta_fw]
[  130.680395]  tfw_http_conn_msg_free+0x7f/0x90 [tempesta_fw]
[  130.681250]  tfw_http_conn_release+0x1d0/0x280 [tempesta_fw]
[  130.682092]  tfw_connection_release+0x2b/0x50 [tempesta_fw]
[  130.682922]  tfw_srv_conn_release+0x13/0xc0 [tempesta_fw]
[  130.683753]  tfw_sock_srv_connect_drop+0x176/0x180 [tempesta_fw]
[  130.684680]  ss_conn_drop_guard_exit+0x26/0x50 [tempesta_fw]
[  130.685493]  ss_tx_action+0x429/0x5f0 [tempesta_fw]
[  130.686204]  net_tx_action+0x9c/0x250
[  130.686838]  __do_softirq+0xd9/0x291
[  130.687460]  run_ksoftirqd+0x2b/0x40
[  130.688187]  smpboot_thread_fn+0xd0/0x170
[  130.688884]  kthread+0x114/0x150
[  130.689474]  ? sort_range+0x30/0x30
[  130.690081]  ? kthread_park+0x90/0x90
[  130.690746]  ret_from_fork+0x1f/0x30
[  130.691401] ---[ end trace c21f1098a59c39ed ]---
[  130.692205] [tempesta fw]     put client 00000000fb17ae58, users=148
[  130.693056] [tempesta fw]       connection lost: close client socket: sk=0000000015777868, conn=0000000019e1fdc8, client=00000000fb17ae58
[  130.695460] [tempesta fw]     put client 00000000fb17ae58, users=147
[  130.696422] [tempesta fw]       connection lost: close client socket: sk=00000000c3d2ff40, conn=00000000cf03db39, client=00000000fb17ae58
[  130.698260] [tempesta fw]     put client 00000000fb17ae58, users=146
[  130.699048] [tempesta fw]       connection lost: close client socket: sk=0000000035fe1062, conn=00000000f59db90e, client=00000000fb17ae58
[  130.700979] [tempesta fw]     put client 00000000fb17ae58, users=145
[  130.701795] [tempesta fw]       connection lost: close client socket: sk=0000000068d3266a, conn=00000000e1bd3324, client=00000000fb17ae58
[  130.703704] [tempesta fw]     put client 00000000fb17ae58, users=144
[  130.704610] [tempesta fw]       connection lost: close client socket: sk=0000000055c81db7, conn=0000000071341ccb, client=00000000fb17ae58
[  130.706326] [tempesta fw]     put client 00000000fb17ae58, users=143
[  130.707111] [tempesta fw]       connection lost: close client socket: sk=0000000037c628b1, conn=00000000a2bec1a7, client=00000000fb17ae58
[  130.708945] [tempesta fw]     put client 00000000fb17ae58, users=142
[  130.709746] [tempesta fw]       connection lost: close client socket: sk=0000000082ffca74, conn=00000000b2e8da5f, client=00000000fb17ae58
[  130.712025] [tempesta fw]     put client 00000000fb17ae58, users=141
[  130.712902] [tempesta fw]       connection lost: close client socket: sk=00000000014cb842, conn=00000000c3058d07, client=00000000fb17ae58
[  130.714660] [tempesta fw]     put client 00000000fb17ae58, users=140
[  130.715440] [tempesta fw]       connection lost: close client socket: sk=00000000e223cda6, conn=000000005c46652f, client=00000000fb17ae58
[  130.717309] [tempesta fw]     put client 00000000fb17ae58, users=139
[  130.718160] [tempesta fw]       connection lost: close client socket: sk=00000000bb0a8040, conn=0000000052ce065a, client=00000000fb17ae58
[  130.720282] [tempesta fw]     put client 00000000fb17ae58, users=138
[  130.721255] [tempesta fw]       connection lost: close client socket: sk=00000000611f248f, conn=0000000057a78aa0, client=00000000fb17ae58
[  130.723156] [tempesta fw]     put client 00000000fb17ae58, users=137
[  130.724062] [tempesta fw]       connection lost: close client socket: sk=00000000c1bf01ef, conn=00000000ece05dff, client=00000000fb17ae58
[  130.726143] [tempesta fw]     put client 00000000fb17ae58, users=136
[  130.727139] [tempesta fw]       connection lost: close client socket: sk=0000000079232a86, conn=000000005a972383, client=00000000fb17ae58
[  130.729201] [tempesta fw]     put client 00000000fb17ae58, users=135
[  130.730099] [tempesta fw]       connection lost: close client socket: sk=00000000aa7a5fc0, conn=00000000953d3207, client=00000000fb17ae58
[  130.732287] [tempesta fw]     put client 00000000fb17ae58, users=134
[  130.733206] [tempesta fw]       connection lost: close client socket: sk=00000000fa7b8b49, conn=0000000052e75f35, client=00000000fb17ae58
[  130.735291] [tempesta fw]     put client 00000000fb17ae58, users=133
[  130.736346] [tempesta fw]       connection lost: close client socket: sk=00000000bec5e9fe, conn=0000000085a80cd1, client=00000000fb17ae58
[  130.738476] [tempesta fw]     put client 00000000fb17ae58, users=132
[  130.739384] [tempesta fw]       connection lost: close client socket: sk=000000003b46704d, conn=000000008e5adb1e, client=00000000fb17ae58
[  130.741726] [tempesta fw]     put client 00000000fb17ae58, users=131
[  130.743149] [tempesta fw]       connection lost: close client socket: sk=00000000675790d8, conn=00000000390186fd, client=00000000fb17ae58
[  130.745340] [tempesta fw]     put client 00000000fb17ae58, users=130
[  130.746286] [tempesta fw]       connection lost: close client socket: sk=00000000a24e5023, conn=00000000f0236005, client=00000000fb17ae58
[  130.748595] [tempesta fw]     put client 00000000fb17ae58, users=129
[  130.749544] [tempesta fw]       connection lost: close client socket: sk=00000000531c6ccb, conn=00000000c8ede0d3, client=00000000fb17ae58
[  130.751879] [tempesta fw]     put client 00000000fb17ae58, users=128
[  130.752958] [tempesta fw]       connection lost: close client socket: sk=000000005a041a89, conn=000000008b800f6d, client=00000000fb17ae58
[  130.755084] [tempesta fw]     put client 00000000fb17ae58, users=127
[  130.756073] [tempesta fw]       connection lost: close client socket: sk=00000000d0498e1c, conn=000000002b75d891, client=00000000fb17ae58
[  130.758403] [tempesta fw]     put client 00000000fb17ae58, users=126
[  130.759406] [tempesta fw]       connection lost: close client socket: sk=00000000506a1f38, conn=000000003304283b, client=00000000fb17ae58
[  130.761528] [tempesta fw]     put client 00000000fb17ae58, users=125
[  130.762421] [tempesta fw]       connection lost: close client socket: sk=000000003f005dd1, conn=00000000d46e7b2f, client=00000000fb17ae58
[  130.764509] [tempesta fw]     put client 00000000fb17ae58, users=124
[  130.765402] [tempesta fw]       connection lost: close client socket: sk=000000006a39f602, conn=0000000070990bc1, client=00000000fb17ae58
[  130.767437] [tempesta fw]     put client 00000000fb17ae58, users=123
[  130.768435] [tempesta fw]       connection lost: close client socket: sk=000000008685bed5, conn=0000000014829d25, client=00000000fb17ae58
[  130.770816] [tempesta fw]     put client 00000000fb17ae58, users=122
[  130.771732] [tempesta fw]       connection lost: close client socket: sk=00000000f50d08c2, conn=00000000dafd4199, client=00000000fb17ae58
[  130.773819] [tempesta fw]     put client 00000000fb17ae58, users=121
[  130.775219] [tempesta fw]       connection lost: close client socket: sk=00000000cf65405d, conn=0000000056d4cb14, client=00000000fb17ae58
[  130.777318] [tempesta fw]     put client 00000000fb17ae58, users=120
[  130.778243] [tempesta fw]       connection lost: close client socket: sk=0000000094e18a3c, conn=000000003126ff6e, client=00000000fb17ae58
[  130.780487] [tempesta fw]     put client 00000000fb17ae58, users=119
[  130.781386] [tempesta fw]       connection lost: close client socket: sk=00000000b0ed02cb, conn=00000000f73b1418, client=00000000fb17ae58
[  130.783408] [tempesta fw]     put client 00000000fb17ae58, users=118
[  130.784397] [tempesta fw]       connection lost: close client socket: sk=00000000888555cb, conn=0000000055d77e97, client=00000000fb17ae58
[  130.786437] [tempesta fw]     put client 00000000fb17ae58, users=117
[  130.787322] [tempesta fw]       connection lost: close client socket: sk=00000000e38a497a, conn=0000000033bf7499, client=00000000fb17ae58
[  130.789397] [tempesta fw]     put client 00000000fb17ae58, users=116
[  130.790347] [tempesta fw]       connection lost: close client socket: sk=000000004e68bdf9, conn=00000000f6e38e13, client=00000000fb17ae58
[  130.792568] [tempesta fw]     put client 00000000fb17ae58, users=115
[  130.793465] [tempesta fw]       connection lost: close client socket: sk=000000006dd8da2b, conn=000000002235c1fa, client=00000000fb17ae58
[  130.795483] [tempesta fw]     put client 00000000fb17ae58, users=114
[  130.796483] [tempesta fw]       connection lost: close client socket: sk=000000008b659ad5, conn=0000000082a8676f, client=00000000fb17ae58
[  130.798519] [tempesta fw]     put client 00000000fb17ae58, users=113
[  130.799409] [tempesta fw]       connection lost: close client socket: sk=0000000041626c1f, conn=000000005f699bb3, client=00000000fb17ae58
[  130.801455] [tempesta fw]     put client 00000000fb17ae58, users=112
[  130.802340] [tempesta fw]       connection lost: close client socket: sk=000000004c5dd308, conn=00000000a3ae0d94, client=00000000fb17ae58
[  130.804520] [tempesta fw]     put client 00000000fb17ae58, users=111
[  130.805414] [tempesta fw]       connection lost: close client socket: sk=00000000915014a1, conn=00000000cdb258da, client=00000000fb17ae58
[  130.807925] [tempesta fw]     put client 00000000fb17ae58, users=110
[  130.808939] [tempesta fw]       connection lost: close client socket: sk=00000000fc4ab14a, conn=000000001c9bf48d, client=00000000fb17ae58
[  130.810998] [tempesta fw]     put client 00000000fb17ae58, users=109
[  130.811938] [tempesta fw]       connection lost: close client socket: sk=000000005a194038, conn=00000000c931f4a2, client=00000000fb17ae58
[  130.813989] [tempesta fw]     put client 00000000fb17ae58, users=108
[  130.814880] [tempesta fw]       connection lost: close client socket: sk=0000000042ee13cb, conn=000000008adde6d8, client=00000000fb17ae58
[  130.817028] [tempesta fw]     put client 00000000fb17ae58, users=107
[  130.817922] [tempesta fw]       connection lost: close client socket: sk=000000009c26283f, conn=000000007bab61f9, client=00000000fb17ae58
[  130.819959] [tempesta fw]     put client 00000000fb17ae58, users=106
[  130.820966] [tempesta fw]       connection lost: close client socket: sk=00000000f6abb91e, conn=00000000c1531316, client=00000000fb17ae58
[  130.823705] [tempesta fw]     put client 00000000fb17ae58, users=105
[  130.824749] [tempesta fw]       connection lost: close client socket: sk=000000000f8f54a6, conn=0000000030d16cbe, client=00000000fb17ae58
[  130.826807] [tempesta fw]     put client 00000000fb17ae58, users=104
[  130.827701] [tempesta fw]       connection lost: close client socket: sk=00000000d5930a87, conn=000000002708189e, client=00000000fb17ae58
[  130.829757] [tempesta fw]     put client 00000000fb17ae58, users=103
[  130.830649] [tempesta fw]       connection lost: close client socket: sk=0000000020b0489b, conn=0000000093aad9cb, client=00000000fb17ae58
[  130.832732] [tempesta fw]     put client 00000000fb17ae58, users=102
[  130.833623] [tempesta fw]       connection lost: close client socket: sk=00000000cffe50bb, conn=00000000881b77fb, client=00000000fb17ae58
[  130.835644] [tempesta fw]     put client 00000000fb17ae58, users=101
[  130.852187] [tempesta fw]     Stop scheduler 'ratio' for group 'default'
[  130.853205] [tempesta fw]     Remove backend server from group 'default'
[  130.880168] [tempesta fw]     mod_stop(): client
[  130.885285] [tdb] Close table 'client0.tdb'
[  130.886132] [tempesta fw]     mod_stop(): websocket
[  130.887292] [tempesta fw]     mod_stop(): http_sess
[  130.892762] [tdb] Close table 'sessions0.tdb'
[  130.893528] [tempesta fw]     mod_stop(): cache
[  130.894281] [tempesta fw]     mod_stop(): filter
[  130.928266] [tdb] Close table 'filter0.tdb'
[  130.929003] [tempesta fw]     mod_stop(): http
[  130.929704] [tempesta fw]     mod_stop(): tls
[  130.930379] [tempesta fw]     mod_stop(): vhost
[  130.931054] [tempesta fw]     mod_stop(): apm
[  130.931732] [tempesta fw]     mod_stop(): access_log
[  130.932535] [tempesta fw] modules are stopped
[  130.956152] [tempesta fw]     release group: 'default'
[  130.958478] [tempesta fw]   New configuration is cleaned.
[ 1396.194395] Start test: stress.test_stress.H2Stress.test_concurrent_connections
[ 1396.476870] [tempesta fw]     got state via sysctl: start
[ 1396.477751] [tempesta fw]     Prepare the configuration processing...
[ 1396.478666] [tempesta fw]     mod_cfgstart(): vhost
[ 1396.480136] [tempesta fw]     mod_cfgstart(): tls
[ 1396.480859] [tempesta fw]     mod_cfgstart(): http_sess
[ 1396.481598] [tempesta fw]     mod_cfgstart(): websocket
[ 1396.482405] [tempesta fw]     mod_cfgstart(): sock_srv
[ 1396.483594] [tempesta fw]   Create new server group: '__dfltopts__'
[ 1396.484582] [tempesta fw]   Create new server group: 'default'
[ 1396.485492] [tempesta fw]     mod_cfgstart(): http_tbl
[ 1396.486257] [tempesta fw]   Preparing for the configuration processing.
[ 1396.510062] [tempesta fw]   Add new server group: 'default'
[ 1396.511187] [tempesta fw]     Add new backend server to group 'default'
[ 1396.512201] [tempesta fw]     Completing the configuration processing...
[ 1396.512886] [tempesta fw]     mod_cfgend(): apm
[ 1396.513611] [tempesta fw]     mod_cfgend(): vhost
[ 1396.514587] [tempesta fw]     mod_cfgend(): tls
[ 1396.515134] [tempesta fw]     mod_cfgend(): websocket
[ 1396.515697] [tempesta fw]     mod_cfgend(): sock_srv
[ 1396.516286] [tempesta fw]     mod_cfgend(): sock_clnt
[ 1396.516868] [tempesta fw]   Checking backends and listeners
[ 1396.517620] [tempesta fw]       Checking server....
[ 1396.518184] [tempesta fw]       Iterating listener
[ 1396.538046] [tempesta fw]     mod_cfgend(): procfs
[ 1396.538581] [tempesta fw]     mod_cfgend(): http_tbl
[ 1396.539081] [tempesta fw] Configuration processing is completed.
[ 1396.539628] [tempesta fw]     starting modules...
[ 1396.540100] [tempesta fw]     mod_start(): vhost
[ 1396.554031] [tempesta fw]     mod_start(): tls
[ 1396.554529] [tempesta fw]     mod_start(): filter
[ 1396.576638] [tdb]   init db header: nwb=20480 db_size=16777216 rec_len=20
[ 1396.577759] [tdb] Opened table /opt/tempesta/db/filter0.tdb: size=16777216 rec_size=20 base=000000002282c18b
[ 1396.579062] [tempesta fw]     mod_start(): cache
[ 1396.579546] [tempesta fw]     mod_start(): http_sess
[ 1396.602252] [tdb]   init db header: nwb=20480 db_size=16777216 rec_len=312
[ 1396.603288] [tdb] Opened table /opt/tempesta/db/sessions0.tdb: size=16777216 rec_size=312 base=000000000417f987
[ 1396.604099] [tempesta fw]     mod_start(): websocket
[ 1396.604599] [tempesta fw]     mod_start(): client
[ 1396.629205] [tdb]   init db header: nwb=20480 db_size=16777216 rec_len=624
[ 1396.630279] [tdb] Opened table /opt/tempesta/db/client0.tdb: size=16777216 rec_size=624 base=00000000835c6dcf
[ 1396.631179] [tempesta fw]     mod_start(): sock_srv
[ 1396.632262] [tempesta fw]     Setup new group 'default' to use after reconfiguration
[ 1396.633315] [tempesta fw]   start server: 192.168.122.1:8000
[ 1396.642623] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.642625] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.642635] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.643392] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.644045] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.644741] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.645367] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.646071] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.646754] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.648137] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.649381] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.650077] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.650682] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.651343] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.651943] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.652599] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.653205] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.653847] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.654483] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.655111] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.655682] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.656303] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.656846] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.657443] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.658018] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.659727] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.660299] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.660837] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.661382] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.661918] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.662504] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.663042] [tempesta fw]   connected: 192.168.122.1:8000
[ 1396.663776] [tempesta fw]     Start scheduler 'ratio' for group 'default'
[ 1396.664838] [tempesta fw]     tfw_sched_ratio_add_grp_common: SG=[default]
[ 1396.674035] [tempesta fw]   Apply reconfig groups
[ 1396.694032] [tempesta fw]     Stop scheduler 'ratio' for group '__dfltopts__'
[ 1396.695100] [tempesta fw]     release group: '__dfltopts__'
[ 1396.695588] [tempesta fw]     mod_start(): sock_clnt
[ 1396.696253] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 1396.697281] #PF: supervisor read access in kernel mode
[ 1396.697717] #PF: error_code(0x0000) - not-present page
[ 1396.698172] PGD 0 P4D 0 
[ 1396.698484] Oops: 0000 [#1] SMP NOPTI
[ 1396.698856] CPU: 0 PID: 2331 Comm: sysctl Kdump: loaded Tainted: G    B   W  OE     5.10.35+ #1
[ 1396.699512] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1396.700198] RIP: 0010:memcmp+0x16/0x30
[ 1396.700571] Code: eb 09 48 83 c0 01 48 39 c1 74 07 0f b6 10 39 f2 75 f0 c3 90 48 85 d2 74 1d 31 c9 eb 09 48 83 c1 01 48 39 ca 74 0f 0f b6 04 0f <44> 0f b6 04 0e 44 29 c0 74 e9 c3 c3 31 c0 c3 66 66 2e 0f 1f 84 00
[ 1396.702010] RSP: 0018:ffffa2a940dd7c60 EFLAGS: 00010246
[ 1396.702496] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1396.703099] RDX: 0000000000000010 RSI: 0000000000000030 RDI: ffff94d8e5778d50
[ 1396.703703] RBP: ffffa2a940dd7c78 R08: 0000000000000000 R09: ffffffffc08fe740
[ 1396.704314] R10: ffffffffab06b740 R11: 0000000000000000 R12: ffff94d8e5778d20
[ 1396.704931] R13: ffff94d8e1fc3a98 R14: ffff94d8e1fc3a98 R15: ffffa2a940dd7cd0
[ 1396.705579] FS:  00007f2698cd2580(0000) GS:ffff94d8fbc00000(0000) knlGS:0000000000000000
[ 1396.706265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1396.706826] CR2: 0000000000000030 CR3: 00000001243c2006 CR4: 0000000000770ef0
[ 1396.707479] PKRU: 55555554
[ 1396.707885] Call Trace:
[ 1396.708298]  ? tfw_listen_socks_array_cmp+0x25/0x40 [tempesta_fw]
[ 1396.708914]  bsearch+0x63/0x90
[ 1396.709384]  ? tfw_sock_clnt_stop+0xa0/0xa0 [tempesta_fw]
[ 1396.709971]  tfw_sock_clnt_start+0x12e/0x340 [tempesta_fw]
[ 1396.710566]  ? tfw_cfgop_cleanup_srv_cfgs+0xaa/0xe0 [tempesta_fw]
[ 1396.711201]  ? tfw_sock_srv_start+0x2bd/0x380 [tempesta_fw]
[ 1396.711818]  tfw_ctlfn_state_io.cold+0x100/0x1d7 [tempesta_fw]
[ 1396.712503]  proc_sys_call_handler+0x13f/0x240
[ 1396.713423]  proc_sys_write+0x13/0x20
[ 1396.713963]  new_sync_write+0x117/0x1b0
[ 1396.714495]  vfs_write+0x185/0x250
[ 1396.715001]  ksys_write+0x67/0xe0
[ 1396.715505]  __x64_sys_write+0x1a/0x20
[ 1396.716047]  do_syscall_64+0x38/0x90
[ 1396.716575]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1396.717222] RIP: 0033:0x7f2698bed077
[ 1396.717769] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 1396.719545] RSP: 002b:00007fffc5661278 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1396.720352] RAX: ffffffffffffffda RBX: 000055b55a897150 RCX: 00007f2698bed077
[ 1396.721157] RDX: 0000000000000006 RSI: 000055b55a897190 RDI: 0000000000000004
[ 1396.721955] RBP: 000055b55a8992c0 R08: 0000000000000000 R09: 0000000000000006
[ 1396.722742] R10: 000055b559dc816c R11: 0000000000000246 R12: 0000000000000006
[ 1396.723524] R13: 0000000000000006 R14: 00007f2698cc7a20 R15: 00007f2698cc78a0
[ 1396.724309] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) sha256_ssse3 sha512_ssse3 nls_utf8 isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common kvm_intel kvm ppdev input_leds joydev serio_raw mac_hid qemu_fw_cfg parport_pc parport sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm aesni_intel crypto_simd cryptd glue_helper psmouse virtio_net net_failover failover i2c_piix4 pata_acpi floppy [last unloaded: tempesta_lib]
[ 1396.732219] CR2: 0000000000000030
kernel BUG at net/core/skbuff.c:4695!
[22071.268279] ------------[ cut here ]------------
[22071.270051] kernel BUG at net/core/skbuff.c:4695!
[22071.282467] invalid opcode: 0000 [#1] SMP NOPTI
[22071.284706] CPU: 1 PID: 20 Comm: ksoftirqd/1 Kdump: loaded Tainted: G    B   W  OE     5.10.35+ #1
[22071.286597] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[22071.288417] RIP: 0010:__skb_to_sgvec+0x260/0x270
[22071.290642] Code: 41 bd 01 00 00 00 e9 c1 fe ff ff 41 8d 55 ff 48 63 d2 48 c1 e2 05 41 f6 04 16 02 74 84 e9 d2 fe ff ff 85 db 0f 84 a2 fe ff ff <0f> 0b 0f 0b 4c 8b 0d c5 4f 0b 01 e9 f8 fe ff ff 0f 1f 44 00 00 55
[22071.292884] RSP: 0018:ffffb544c00b3790 EFLAGS: 00010206
[22071.295646] [tempesta fw]       tfw_match_ctext_vchar: str[0]=0x20 len=330 r=3
[22071.296678] RAX: 0000000000000076 RBX: 0000000000000009 RCX: 0000000000000001
[22071.297728] [tempesta fw]       tfw_match_ctext_vchar: str[0]=0x74 len=317 r=4
[22071.298534] RDX: ffff9d69b0f334c0 RSI: 00000000000002c0 RDI: ffff9d69b0f33000
[22071.300999] RBP: ffffb544c00b37c8 R08: 0000000000000000 R09: 0000000000000001
[22071.302843] R10: ffffd6d1c1d07800 R11: ffff9d69b0f33200 R12: 0000000000000076
[22071.302843] R13: 0000000000000002 R14: ffffb544c00b3890 R15: 0000000000000000
[22071.302844] FS:  0000000000000000(0000) GS:ffff9d6a7bd00000(0000) knlGS:0000000000000000
[22071.302845] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[22071.302846] CR2: 00007f0bac56bcf0 CR3: 00000001263ce004 CR4: 0000000000770ee0
[22071.305051] PKRU: 55555554
[22071.306920] Call Trace:
[22071.306929]  skb_to_sgvec+0x15/0x40
[22071.308493]  tfw_tls_encrypt+0x3b3/0x850 [tempesta_fw]
[22071.310721]  ? ip_finish_output2+0x1ab/0x590
[22071.313023]  ? __cgroup_bpf_run_filter_skb+0x3c3/0x3d0
[22071.315772]  ? ip_output+0x7a/0x100
[22071.315789]  ? lock_timer_base+0x72/0xa0
[22071.315792]  ? tcp_schedule_loss_probe+0x114/0x1b0
[22071.317341]  ? tcp_ack+0x10bf/0x1620
[22071.319558]  tcp_write_xmit+0x41d/0x1350
[22071.319561]  ? __kfree_skb+0x1b/0x20
[22071.319562]  __tcp_push_pending_frames+0x37/0x100
[22071.321698]  tcp_rcv_established+0x26c/0x670
[22071.323509]  ? sk_filter_trim_cap+0xde/0x240
[22071.323511]  tcp_v4_do_rcv+0x140/0x200
[22071.324971]  tcp_v4_rcv+0xcfd/0xe10
[22071.327097]  ip_protocol_deliver_rcu+0x30/0x1b0
[22071.327099]  ip_local_deliver_finish+0x48/0x60
[22071.329165]  ip_local_deliver+0x72/0x110
[22071.331686]  ? tcp_v4_early_demux+0xfa/0x160
[22071.331687]  ? ip_rcv_finish_core.isra.0+0x146/0x420
[22071.331688]  ip_rcv_finish+0x87/0xa0
[22071.331690]  ip_rcv+0xcc/0xe0
[22071.333093]  ? ip_rcv_finish_core.isra.0+0x420/0x420
[22071.335163]  __netif_receive_skb_one_core+0x88/0xa0
[22071.335164]  __netif_receive_skb+0x18/0x60
[22071.335165]  process_backlog+0xa9/0x160
[22071.336780]  net_rx_action+0x13e/0x390
[22071.338462]  ? rcu_core+0x3ff/0x500
[22071.339822]  __do_softirq+0xd9/0x291
[22071.341873]  run_ksoftirqd+0x2b/0x40
[22071.343835]  smpboot_thread_fn+0xd0/0x170
[22071.345531]  kthread+0x114/0x150
[22071.349113]  ? sort_range+0x30/0x30
[22071.349115]  ? kthread_park+0x90/0x90
[22071.349117]  ret_from_fork+0x1f/0x30
[22071.349118] Modules linked in:
[22071.350585]  tempesta_fw(OE)
[22071.352219]  tempesta_db(OE)
[22071.353931]  tempesta_tls(OE)
[22071.355854]  tempesta_lib(OE)
[22071.357822]  sha256_ssse3
[22071.359656]  sha512_ssse3
[22071.361492]  nls_utf8
[22071.364087]  isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common kvm_intel kvm ppdev parport_pc joydev input_leds parport mac_hid serio_raw qemu_fw_cfg sch_fq_codel
[22071.365545]  ipmi_devintf
[22071.367329]  ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic
[22071.369158]  zstd_compress
[22071.371294]  raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c
[22071.372916]  raid1
[22071.374933]  raid0 multipath linear crct10dif_pclmul crc32_pclmul
[22071.376422]  ghash_clmulni_intel bochs_drm drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops aesni_intel cec drm crypto_simd cryptd glue_helper i2c_piix4 psmouse virtio_net net_failover failover pata_acpi floppy [last unloaded: tempesta_lib]
b3b commented 2 years ago

It turned out that special network conditions are required to reproduce crashes. Testing was conducted under high TCP fragmentations conditions. Fragmentation seems to have a big impact on Tempesta performance. When running in local testing mode, errors can be reproduced by setting the low MTU: ip link set lo mtu 80

b3b commented 2 years ago

Crash when running stress.test_stress.WrkStress with MTU = 80

kernel BUG at net/core/skbuff.c:111!
[  350.647003] [tempesta fw] Tempesta FW is ready
[  350.774350] [tempesta fw]   new client: cli=000000003ea793a3
[  350.774677] [tempesta fw]   client address: 127.0.0.1
[  350.774928] [tempesta fw]   client was found in tdb
[  350.778774] [tempesta fw]   client was found in tdb
[  350.779027] [tempesta fw]   client was found in tdb
[  350.780162] [tempesta fw]   client was found in tdb
[  350.780372] [tempesta fw]   client was found in tdb
[  350.781217] [tempesta fw]   client was found in tdb
[  350.781396] [tempesta fw]   client was found in tdb
[  350.782050] [tempesta fw]   client was found in tdb
[  350.782215] [tempesta fw]   client was found in tdb
[  350.782805] [tempesta fw]   client was found in tdb
[  350.782970] [tempesta fw]   client was found in tdb
[  350.783524] [tempesta fw]   client was found in tdb
[  350.783699] [tempesta fw]   client was found in tdb
[  350.784230] [tempesta fw]   client was found in tdb
[  350.784393] [tempesta fw]   client was found in tdb
[  350.784926] [tempesta fw]   client was found in tdb
[  350.785089] [tempesta fw]   client was found in tdb
[  350.785314] [tempesta fw]   client was found in tdb
[  350.785473] [tempesta fw]   client was found in tdb
[  350.787612] skbuff: skb_under_panic: text:ffffffff89212c3c len:64 put:20 head:ffff904deb8c1e00 data:ffff904deb8c1df0 tail:0x30 end:0xc0 dev:
[  350.787981] ------------[ cut here ]------------
[  350.788284] kernel BUG at net/core/skbuff.c:111!
[  350.788643] invalid opcode: 0000 [#1] SMP PTI
[  350.788950] CPU: 0 PID: 1692 Comm: python3 Kdump: loaded Tainted: G           OE     5.10.35+ #1
[  350.789282] Hardware name: Vultr VC2, BIOS  
[  350.789650] RIP: 0010:skb_panic+0x4c/0x4e
[  350.789982] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 58 24 c7 89 e8 77 50 fb ff <0f> 0b 48 8b 55 08 48 c7 c1 00 ac 92 89 e8 a2 ff ff ff 0f 1f 44 00
[  350.790745] RSP: 0000:ffffafb800003cf0 EFLAGS: 00010246
[  350.791134] RAX: 0000000000000086 RBX: 0000000000004500 RCX: 0000000000000000
[  350.791537] RDX: 0000000000000000 RSI: ffff904e3bc18a40 RDI: ffff904e3bc18a40
[  350.791956] RBP: ffffafb800003d10 R08: ffff904e3bc18a40 R09: ffffafb800003ac8
[  350.792432] R10: 0000000000000001 R11: 0000000000000001 R12: ffff904deb8c1ce0
[  350.792892] R13: ffff904dc3209c80 R14: 0000000000000000 R15: ffffffff8a4d8b40
[  350.793300] FS:  00007f4fd58a5700(0000) GS:ffff904e3bc00000(0000) knlGS:0000000000000000
[  350.793768] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  350.794182] CR2: 00007f912f0667c0 CR3: 000000013091e005 CR4: 00000000005706f0
[  350.794643] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  350.795056] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  350.795466] PKRU: 55555554
[  350.795882] Call Trace:
[  350.796286]  
[  350.796705]  skb_push.cold+0x10/0x10
[  350.797127]  __ip_queue_xmit+0x6c/0x470
[  350.797586]  ? lock_timer_base+0x72/0xa0
[  350.797959] [tempesta fw]   client was found in tdb
[  350.798020]  ip_queue_xmit+0x15/0x20
[  350.798589] [tempesta fw]   client was found in tdb
[  350.799024]  __tcp_transmit_skb+0xa20/0xbe0
[  350.799026]  tcp_write_xmit+0x453/0x1350
[  350.799028]  __tcp_push_pending_frames+0x37/0x100
[  350.799030]  tcp_push+0xfc/0x100
[  350.799086]  ss_tx_action+0x2f7/0x5f0 [tempesta_fw]
[  350.799090]  net_tx_action+0x9c/0x250
[  350.800060]  __do_softirq+0xd9/0x291
[  350.800989]  asm_call_irq_on_stack+0x12/0x20
[  350.801959]  
[  350.803012]  do_softirq_own_stack+0x3d/0x50
[  350.803015]  irq_exit_rcu+0xa4/0xb0
[  350.803019]  sysvec_call_function_single+0x3d/0x90
[  350.803021]  ? asm_sysvec_call_function_single+0xa/0x20
[  350.804103]  asm_sysvec_call_function_single+0x12/0x20
[  350.805230] RIP: 0033:0x5e46a0
[  350.806389] Code: e6 47 0a 00 e9 a3 fe ff ff 90 f3 0f 1e fa 41 55 48 8d 47 05 41 54 55 53 48 89 fb 48 83 ec 08 48 3d 05 01 00 00 77 2b 8d 6f 05 <4c> 63 e5 49 c1 e4 05 49 81 c4 60 8d 95 00 49 83 04 24 01 48 83 c4
[  350.807596] RSP: 002b:00007f4fd58a4510 EFLAGS: 00000297
[  350.808752] RAX: 0000000000000014 RBX: 000000000000000f RCX: 000000008000000f
[  350.809899] RDX: 0000000000000001 RSI: 00007f4fd58a455e RDI: 000000000000000f
[  350.809900] RBP: 0000000000000014 R08: 0000000000000018 R09: 00007f4fd58c8a08
[  350.811150] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000001
[  350.811151] R13: 0000000000000003 R14: 0000000000638cf0 R15: 00007f4fd7f340e0
[  350.811154] Modules linked in: tempesta_fw(OE) tempesta_db(OE) sha256_ssse3 sha512_ssse3 tempesta_tls(OE) tempesta_lib(OE) xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat br_netfilter
[  350.812355]  bridge
[  350.814181]  stp
[  350.815426]  llc
[  350.816630]  overlay
[  350.817852]  dm_multipath
[  350.819104]  scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common isst_if_common nfit rapl input_leds serio_raw snd_hda_codec_generic ledtrig_audio joydev snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore mac_hid
[  350.820410]  qemu_fw_cfg
[  350.822332]  nf_log_ipv6
[  350.823695]  ip6t_REJECT
[  350.825099]  nf_reject_ipv6
[  350.826498]  xt_hl
[  350.828010]  ip6t_rt
[  350.829716]  nf_log_ipv4
[  350.832086]  nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype tcp_bbr sch_fq xt_tcpudp efi_pstore xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ramoops reed_solomon ip6table_filter ip6_tables iptable_filter bpfilter ipmi_msghandler msr virtio_rng ip_tables x_tables autofs4 btrfs
[  350.833800]  blake2b_generic
[  350.835456]  zstd_compress
[  350.837156]  raid10
[  350.838889]  raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mpt3sas raid_class scsi_transport_sas megaraid_sas hid_generic
[  350.840580]  usbhid
[  350.842280]  hid
[  350.843980]  crct10dif_pclmul
[  350.846799]  crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ahci libahci psmouse i2c_i801
[  350.848608]  i2c_smbus
[  350.850407]  lpc_ich
[  350.852170]  bochs_drm
[  350.854933]  drm_vram_helper virtio_net drm_ttm_helper ttm drm_kms_helper net_failover syscopyarea sysfillrect sysimgblt fb_sys_fops cec virtio_blk drm failover
RomanBelozerov commented 2 years ago

Tempesta fails with MTU 80 and concurrent connections. Test to reproduce. Request body does not affect fail.

DMESG ``` ------------[ cut here ]------------ [12463.168200] WARNING: CPU: 2 PID: 15701 at /root/tempesta/fw/http.c:6500 tfw_http_msg_process_generic+0x409/0x670 [tempesta_fw] [12463.168201] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) nfnetlink_queue nfnetlink_log nfnetlink cfg80211 sha256_ssse3 sha512_ssse3 rfcomm bnep vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock nls_iso8859_1 intel_rapl_msr intel_rapl_common crct10dif_pclmul ghash_clmulni_intel aesni_intel snd_ens1371 snd_ac97_codec crypto_simd gameport cryptd ac97_bus glue_helper snd_pcm btusb btrtl snd_seq_midi btbcm snd_seq_midi_event btintel snd_rawmidi bluetooth snd_seq snd_seq_device snd_timer joydev input_leds snd ecdh_generic ecc serio_raw vmw_balloon vmw_vmci soundcore sch_fq_codel vmwgfx mac_hid ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport pstore_blk ramoops reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid mptspi mptscsih mptbase psmouse ahci crc32_pclmul e1000 libahci scsi_transport_spi i2c_piix4 [12463.168255] pata_acpi [last unloaded: tempesta_lib] [12463.168259] CPU: 2 PID: 15701 Comm: curl Kdump: loaded Tainted: G W OE 5.10.35+ #1 [12463.168260] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [12463.168269] RIP: 0010:tfw_http_msg_process_generic+0x409/0x670 [tempesta_fw] [12463.168271] Code: 98 00 00 00 44 8b 95 70 ff ff ff a8 10 0f 84 72 fd ff ff e9 08 ff ff ff 4c 89 63 10 4d 89 24 24 4d 89 64 24 08 e9 6d fc ff ff <0f> 0b 41 ba ea ff ff ff e9 4e fd ff ff e8 45 89 ff ff 49 89 47 40 [12463.168272] RSP: 0018:ffffbc7640514a50 EFLAGS: 00010246 [12463.168274] RAX: 0000000000000000 RBX: ffffa04df1395280 RCX: 0000000000000000 [12463.168275] RDX: ffffa04e644d4c00 RSI: 0000000000000000 RDI: ffffa04df1395280 [12463.168276] RBP: ffffbc7640514af0 R08: ffffa04e644d4c00 R09: 000000000000000e [12463.168277] R10: ffffa04e644d4151 R11: 000000000000000e R12: 00000000ffffffff [12463.168278] R13: ffffa04df1395970 R14: 0000000000000001 R15: 0000000000000000 [12463.168279] FS: 00007feb2ec49f80(0000) GS:ffffa04ff5e80000(0000) knlGS:0000000000000000 [12463.168280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [12463.168281] CR2: 00005616bcfb0e38 CR3: 00000000193bc006 CR4: 0000000000770ee0 [12463.168307] PKRU: 55555554 [12463.168308] Call Trace: [12463.168310] [12463.168319] ? ss_skb_process+0x7d/0x140 [tempesta_fw] [12463.168326] ? tfw_h2_conn_terminate_close+0x40/0x40 [tempesta_fw] [12463.168333] tfw_h2_frame_process+0x308/0x440 [tempesta_fw] [12463.168341] tfw_http_msg_process+0x42/0x50 [tempesta_fw] [12463.168347] tfw_connection_recv+0x59/0xa0 [tempesta_fw] [12463.168353] tfw_tls_connection_recv+0x19d/0x380 [tempesta_fw] [12463.168361] ss_tcp_process_data+0x1e7/0x3f0 [tempesta_fw] [12463.168369] ss_tcp_data_ready+0x44/0xb0 [tempesta_fw] [12463.168372] tcp_data_ready+0x2b/0xd0 [12463.168374] tcp_rcv_established+0x5a7/0x670 [12463.168376] tcp_v4_do_rcv+0x140/0x200 [12463.168377] tcp_v4_rcv+0xcfd/0xe10 [12463.168380] ip_protocol_deliver_rcu+0x30/0x1b0 [12463.168381] ip_local_deliver_finish+0x48/0x60 [12463.168383] ip_local_deliver+0x72/0x110 [12463.168390] ? tfw_ipv4_nf_hook+0xb9/0x130 [tempesta_fw] [12463.168391] ip_rcv_finish+0x87/0xa0 [12463.168393] ip_rcv+0xcc/0xe0 [12463.168394] ? ip_rcv_finish_core.isra.0+0x420/0x420 [12463.168397] __netif_receive_skb_one_core+0x88/0xa0 [12463.168398] __netif_receive_skb+0x18/0x60 [12463.168400] process_backlog+0xa9/0x160 [12463.168401] net_rx_action+0x13e/0x390 [12463.168405] __do_softirq+0xd9/0x291 [12463.168407] asm_call_irq_on_stack+0xf/0x20 [12463.168408] [12463.168411] do_softirq_own_stack+0x3d/0x50 [12463.168413] do_softirq.part.0+0x46/0x50 [12463.168415] __local_bh_enable_ip+0x50/0x60 [12463.168416] ip_finish_output2+0x1ab/0x590 [12463.168418] ip_do_fragment+0x1ee/0x510 [12463.168420] ? __ip_flush_pending_frames.isra.0+0x90/0x90 [12463.168421] ip_fragment.constprop.0+0x48/0x90 [12463.168423] __ip_finish_output+0x1d7/0x220 [12463.168424] ip_finish_output+0x2d/0xb0 [12463.168426] ip_output+0x7a/0x100 [12463.168427] ? ip_finish_output+0x2d/0xb0 [12463.168429] ip_local_out+0x3d/0x50 [12463.168430] __ip_queue_xmit+0x17a/0x470 [12463.168433] ? get_page_from_freelist+0xed8/0x1100 [12463.168435] ip_queue_xmit+0x15/0x20 [12463.168436] __tcp_transmit_skb+0xa94/0xc90 [12463.168439] tcp_write_xmit+0x453/0x1350 [12463.168441] __tcp_push_pending_frames+0x37/0x100 [12463.168442] tcp_push+0xfc/0x100 [12463.168444] tcp_sendmsg_locked+0xd36/0xe70 [12463.168446] tcp_sendmsg+0x2d/0x50 [12463.168448] inet_sendmsg+0x43/0x70 [12463.168451] sock_sendmsg+0x5e/0x70 [12463.168452] sock_write_iter+0x93/0xf0 [12463.168455] new_sync_write+0x192/0x1b0 [12463.168458] vfs_write+0x1ca/0x280 [12463.168460] ksys_write+0xb1/0xe0 [12463.168461] __x64_sys_write+0x1a/0x20 [12463.168463] do_syscall_64+0x38/0x90 [12463.168465] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [12463.168466] RIP: 0033:0x7feb3006c2f7 [12463.168468] Code: 75 05 48 83 c4 58 c3 e8 f7 33 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [12463.168469] RSP: 002b:00007fff54679508 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [12463.168470] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007feb3006c2f7 [12463.168471] RDX: 0000000000000064 RSI: 00005616bcfacd83 RDI: 0000000000000005 [12463.168472] RBP: 00005616bcf94ad0 R08: 0000000000000001 R09: 0000000000000008 [12463.168473] R10: 00005616bcfacd85 R11: 0000000000000246 R12: 00005616bcfacd83 [12463.168474] R13: 0000000000000064 R14: 00005616bcf92e20 R15: 0000000000000000 [12463.168476] ---[ end trace 50b1b7ebb7b42b44 ]--- [12463.201161] [tempesta fw] Warning: HTTP/2 request dropped: state=RGen_HdrOtherN input=0x0(''), len=1, off=0 [12463.201168] [tempesta fw] Warning: failed to parse request: 127.0.0.1 [12463.201178] ------------[ cut here ]------------ [12463.201199] WARNING: CPU: 2 PID: 15701 at /root/tempesta/fw/http_stream.c:89 tfw_h2_stream_fsm+0x12d/0x2b0 [tempesta_fw] [12463.201200] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) nfnetlink_queue nfnetlink_log nfnetlink cfg80211 sha256_ssse3 sha512_ssse3 rfcomm bnep vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock nls_iso8859_1 intel_rapl_msr intel_rapl_common crct10dif_pclmul ghash_clmulni_intel aesni_intel snd_ens1371 snd_ac97_codec crypto_simd gameport cryptd ac97_bus glue_helper snd_pcm btusb btrtl snd_seq_midi btbcm snd_seq_midi_event btintel snd_rawmidi bluetooth snd_seq snd_seq_device snd_timer joydev input_leds snd ecdh_generic ecc serio_raw vmw_balloon vmw_vmci soundcore sch_fq_codel vmwgfx mac_hid ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport pstore_blk ramoops reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid mptspi mptscsih mptbase psmouse ahci crc32_pclmul e1000 libahci scsi_transport_spi i2c_piix4 [12463.201302] pata_acpi [last unloaded: tempesta_lib] [12463.201307] CPU: 2 PID: 15701 Comm: curl Kdump: loaded Tainted: G W OE 5.10.35+ #1 [12463.201308] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [12463.201321] RIP: 0010:tfw_h2_stream_fsm+0x12d/0x2b0 [tempesta_fw] [12463.201323] Code: 89 a0 01 00 00 0f 0b 83 f8 06 74 6b 83 f8 07 75 f4 45 31 c0 41 80 fd 02 0f 84 56 ff ff ff 41 80 fd 09 0f 85 46 ff ff ff eb 8c <0f> 0b 41 b8 03 00 00 00 e9 3d ff ff ff 41 b8 03 00 00 00 e9 3c ff [12463.201325] RSP: 0018:ffffbc7640514530 EFLAGS: 00010293 [12463.201327] RAX: 0000000000000002 RBX: ffffa04dd4cc02c0 RCX: 0000000000000001 [12463.201328] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffa04dd4cc02f0 [12463.201329] RBP: ffffbc7640514560 R08: ffffbc764051457c R09: 0000000000000000 [12463.201330] R10: 0000000000000002 R11: 0000000000000000 R12: ffffa04dd4cc02f0 [12463.201331] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000001 [12463.201333] FS: 00007feb2ec49f80(0000) GS:ffffa04ff5e80000(0000) knlGS:0000000000000000 [12463.201334] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [12463.201335] CR2: 00005616bd03f080 CR3: 00000000193bc006 CR4: 0000000000770ee0 [12463.201345] PKRU: 55555554 [12463.201346] Call Trace: [12463.201348] [12463.201358] tfw_h2_stream_id_close+0xe7/0x110 [tempesta_fw] [12463.201367] tfw_h2_prep_resp+0x95/0x310 [tempesta_fw] [12463.201372] ? alloc_pages_current+0x87/0xe0 [12463.201381] tfw_h2_send_resp+0x5c/0xb0 [tempesta_fw] [12463.201388] tfw_h2_send_err_resp+0x9a/0xc0 [tempesta_fw] [12463.201399] tfw_http_cli_error_resp_and_log+0x3cd/0x440 [tempesta_fw] [12463.201407] ? tfw_http_parse_req+0xe190/0xe190 [tempesta_fw] [12463.201414] tfw_http_req_process+0xf6/0x960 [tempesta_fw] [12463.201421] ? __tfw_pool_new+0x2d/0x70 [tempesta_fw] [12463.201423] ? bzero_fast+0xe/0x10 [tempesta_lib] [12463.201425] ? bzero_fast+0xe/0x10 [tempesta_lib] [12463.201432] ? tfw_http_init_parser_req+0x22/0x50 [tempesta_fw] [12463.201439] ? __tfw_pool_new+0x2d/0x70 [tempesta_fw] [12463.201445] tfw_http_msg_process_generic+0x197/0x670 [tempesta_fw] [12463.201449] ? pg_skb_alloc+0x43b/0x480 [12463.201451] ? memcpy_fast+0xe/0x10 [tempesta_lib] [12463.201458] tfw_h2_frame_process+0x308/0x440 [tempesta_fw] [12463.201465] tfw_http_msg_process+0x42/0x50 [tempesta_fw] [12463.201472] tfw_connection_recv+0x59/0xa0 [tempesta_fw] [12463.201477] tfw_tls_connection_recv+0x19d/0x380 [tempesta_fw] [12463.201485] ss_tcp_process_data+0x1e7/0x3f0 [tempesta_fw] [12463.201492] ss_tcp_data_ready+0x44/0xb0 [tempesta_fw] [12463.201495] tcp_data_ready+0x2b/0xd0 [12463.201496] tcp_rcv_established+0x5a7/0x670 [12463.201499] tcp_v4_do_rcv+0x140/0x200 [12463.201500] tcp_v4_rcv+0xcfd/0xe10 [12463.201503] ip_protocol_deliver_rcu+0x30/0x1b0 [12463.201504] ip_local_deliver_finish+0x48/0x60 [12463.201505] ip_local_deliver+0x72/0x110 [12463.201512] ? tfw_ipv4_nf_hook+0xb9/0x130 [tempesta_fw] [12463.201514] ip_rcv_finish+0x87/0xa0 [12463.201515] ip_rcv+0xcc/0xe0 [12463.201517] ? ip_rcv_finish_core.isra.0+0x420/0x420 [12463.201519] __netif_receive_skb_one_core+0x88/0xa0 [12463.201521] __netif_receive_skb+0x18/0x60 [12463.201522] process_backlog+0xa9/0x160 [12463.201524] net_rx_action+0x13e/0x390 [12463.201527] __do_softirq+0xd9/0x291 [12463.201530] asm_call_irq_on_stack+0xf/0x20 [12463.201531] [12463.201534] do_softirq_own_stack+0x3d/0x50 [12463.201536] do_softirq.part.0+0x46/0x50 [12463.201538] __local_bh_enable_ip+0x50/0x60 [12463.201539] ip_finish_output2+0x1ab/0x590 [12463.201541] ip_do_fragment+0x1ee/0x510 [12463.201542] ? __ip_flush_pending_frames.isra.0+0x90/0x90 [12463.201544] ip_fragment.constprop.0+0x48/0x90 [12463.201545] __ip_finish_output+0x1d7/0x220 [12463.201547] ip_finish_output+0x2d/0xb0 [12463.201548] ip_output+0x7a/0x100 [12463.201550] ip_local_out+0x3d/0x50 [12463.201552] __ip_queue_xmit+0x17a/0x470 [12463.201553] ip_queue_xmit+0x15/0x20 [12463.201555] __tcp_transmit_skb+0xa94/0xc90 [12463.201557] tcp_write_xmit+0x453/0x1350 [12463.201559] __tcp_push_pending_frames+0x37/0x100 [12463.201560] tcp_rcv_established+0x337/0x670 [12463.201562] ? tcp_sendmsg_locked+0x26c/0xe70 [12463.201564] tcp_v4_do_rcv+0x140/0x200 [12463.201566] __release_sock+0x7b/0xc0 [12463.201567] release_sock+0x30/0xa0 [12463.201569] tcp_sendmsg+0x38/0x50 [12463.201570] inet_sendmsg+0x43/0x70 [12463.201572] sock_sendmsg+0x5e/0x70 [12463.201574] sock_write_iter+0x93/0xf0 [12463.201577] new_sync_write+0x192/0x1b0 [12463.201579] vfs_write+0x1ca/0x280 [12463.201580] ksys_write+0xb1/0xe0 [12463.201582] __x64_sys_write+0x1a/0x20 [12463.201584] do_syscall_64+0x38/0x90 [12463.201585] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [12463.201587] RIP: 0033:0x7feb3006c2f7 [12463.201589] Code: 75 05 48 83 c4 58 c3 e8 f7 33 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [12463.201590] RSP: 002b:00007fff54679618 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [12463.201591] RAX: ffffffffffffffda RBX: 0000000000000426 RCX: 00007feb3006c2f7 [12463.201592] RDX: 0000000000000426 RSI: 00005616bd03ec93 RDI: 0000000000000009 [12463.201593] RBP: 00005616bd0e8810 R08: 0000000000000001 R09: 0000000000000008 [12463.201594] R10: 00005616bd03ec95 R11: 0000000000000246 R12: 00005616bd03ec93 [12463.201595] R13: 0000000000000426 R14: 00005616bd072460 R15: 0000000000000000 [12463.201597] ---[ end trace 50b1b7ebb7b42b45 ]--- ```
b3b commented 2 years ago

Related to the previous comment, HTTP/2 requests hangs forever with large output (sometimes) or when MTU 80 is set (reproducible). Can be reproduced with stress.test_stress.H2CurlStress and MTU 80.

curl -k -v https://127.0.0.1
*   Trying 127.0.0.1:443...
* TCP_NODELAY set
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connected to 127.0.0.1 (127.0.0.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.3 (IN), TLS handshake, Server hello (2):
{ [100 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [620 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [148 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=Washington; L=Seattle; O=Tempesta Technologies Inc.; OU=Testing; CN=tempesta-tech.com; emailAddress=info@tempesta-tech.com
*  start date: Oct  5 13:57:29 2022 GMT
*  expire date: Oct  6 13:57:29 2023 GMT
*  issuer: C=US; ST=Washington; L=Seattle; O=Tempesta Technologies Inc.; OU=Testing; CN=tempesta-tech.com; emailAddress=info@tempesta-tech.com
*  SSL certificate verify result: self signed certificate (18), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
} [5 bytes data]
* Using Stream ID: 1 (easy handle 0x55e208a498c0)
} [5 bytes data]
> GET / HTTP/2
> Host: 127.0.0.1
> user-agent: curl/7.68.0
> accept: */*
> 
{ [5 bytes data]
* Connection state changed (MAX_CONCURRENT_STREAMS == 4294967295)!
} [5 bytes data]
  0     0    0     0    0     0      0      0 --:--:--  0:00:26 --:--:--     0

DMESG shows WARNING: CPU: 1 PID: 4234 at http.c:6500 tfw_http_msg_process_generic+0x409/0x670, as in https://github.com/tempesta-tech/tempesta/issues/1703#issuecomment-1268475093

b3b commented 1 year ago

Another way to reproduce crashes when running in local testing mode, is to run Tempesta and backend server on different network interfaces, which has different MTU. Can be reproduced with the test t_stress.test_wordpress : multiple network interfaces involved because backend server is running in Docker.

RomanBelozerov commented 1 year ago

I received kernel panic for t_stress.test_wordpress.H2WordpressStress.test_get_large_images test. Test config:

concurrent_connections = 10
stress_requests_count = 100
dmesg
[  407.944774] Start test: t_stress.test_wordpress.H2WordpressStress.test_get_large_images
[  408.201592] docker0: port 1(vetha62f49a) entered blocking state
[  408.202945] docker0: port 1(vetha62f49a) entered disabled state
[  408.204298] device vetha62f49a entered promiscuous mode
[  408.408844] eth0: renamed from veth8a7ecf1
[  408.432298] IPv6: ADDRCONF(NETDEV_CHANGE): vetha62f49a: link becomes ready
[  408.433951] docker0: port 1(vetha62f49a) entered blocking state
[  408.435279] docker0: port 1(vetha62f49a) entered forwarding state
[  408.607581] [tdb] Start Tempesta DB
[  408.634059] [tempesta fw] Initializing Tempesta FW kernel module...
[  408.797725] [tempesta fw] Configuration processing is completed.
[  408.828404] [tdb] Opened table /opt/tempesta/db/filter0.tdb: size=16777216 rec_size=20 base=000000007b6f539b
[  408.836938] [tdb] Opened table /opt/tempesta/db/sessions0.tdb: size=16777216 rec_size=312 base=00000000efb1c861
[  408.845214] [tdb] Opened table /opt/tempesta/db/client0.tdb: size=16777216 rec_size=624 base=00000000902fdb8e
[  408.935098] [tempesta fw] Open listen socket on: 0.0.0.0:443
[  409.006290] [tempesta fw] Tempesta FW is ready
[  415.627516] curl: page allocation failure: order:1, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=user.slice,mems_allowed=0
[  415.628406] [tempesta fw] Warning: Cannot grow HTTP data string
[  415.630354] CPU: 5 PID: 5299 Comm: curl Tainted: G           OE     5.10.35+ #1
[  415.631764] [tempesta fw] Warning: response dropped: processing error: 127.0.0.1
[  415.633321] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  415.633322] Call Trace:
[  415.633326]  
[  415.633334]  dump_stack+0x6b/0x83
[  415.633337]  warn_alloc.cold+0x7b/0xdf
[  415.633342]  __alloc_pages_slowpath.constprop.0+0xc29/0xc60
[  415.633344]  __alloc_pages_nodemask+0x2da/0x310
[  415.633363]  ss_skb_to_sgvec_with_new_pages+0x8b/0x280 [tempesta_fw]
[  415.633375]  tfw_tls_encrypt+0x351/0x720 [tempesta_fw]
[  415.633380]  ? netif_rx_internal+0xe0/0x100
[  415.633384]  ? ip_finish_output2+0x1ab/0x590
[  415.646578]  ? skb_split+0x62/0x2c0
[  415.647368]  tcp_write_xmit+0x3e5/0x1240
[  415.648277]  __tcp_push_pending_frames+0x32/0xf0
[  415.649310]  tcp_rcv_established+0x27d/0x680
[  415.650288]  tcp_v4_do_rcv+0x131/0x1f0
[  415.651131]  tcp_v4_rcv+0xc2f/0xd80
[  415.651915]  ip_protocol_deliver_rcu+0x2b/0x1b0
[  415.652956]  ip_local_deliver_finish+0x44/0x50
[  415.653937]  __netif_receive_skb_one_core+0x87/0xa0
[  415.655009]  process_backlog+0x96/0x160
[  415.655853]  net_rx_action+0x145/0x3e0
[  415.656701]  __do_softirq+0xcf/0x284
[  415.657509]  asm_call_irq_on_stack+0x12/0x20
[  415.658441]  
[  415.658915]  do_softirq_own_stack+0x37/0x40
[  415.659843]  do_softirq+0x5e/0x70
[  415.660609]  __local_bh_enable_ip+0x4b/0x50
[  415.661538]  ip_finish_output2+0x1ab/0x590
[  415.662439]  ? ipv4_link_failure+0x1b0/0x1b0
[  415.663367]  __ip_queue_xmit+0x180/0x410
[  415.664234]  __tcp_transmit_skb+0xa0e/0xbc0
[  415.665147]  tcp_recvmsg+0x275/0xae0
[  415.666045]  ? tcp_v4_do_rcv+0x131/0x1f0
[  415.666907]  ? _cond_resched+0x16/0x40
[  415.667740]  inet_recvmsg+0x5e/0x120
[  415.668528]  ? sock_recvmsg+0x1d/0x70
[  415.669338]  sock_read_iter+0x92/0xf0
[  415.670167]  new_sync_read+0x18a/0x1a0
[  415.671065]  vfs_read+0xf4/0x180
[  415.671783]  ksys_read+0xa7/0xe0
[  415.672515]  ? exit_to_user_mode_prepare+0x32/0x120
[  415.673605]  do_syscall_64+0x33/0x80
[  415.674438]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  415.675571] RIP: 0033:0x7f54206f704e
[  415.676393] Code: 0f 1f 40 00 48 8b 15 79 af 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
[  415.680556] RSP: 002b:00007ffd1af6d778 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  415.682267] RAX: ffffffffffffffda RBX: 0000000000004018 RCX: 00007f54206f704e
[  415.683968] RDX: 0000000000004018 RSI: 0000562776d2c7d8 RDI: 0000000000000010
[  415.685551] RBP: 0000562776d2c7d8 R08: 0000000000000001 R09: 00007ffd1af6d8b8
[  415.687089] R10: 0000000000040000 R11: 0000000000000246 R12: 0000562776ca6e90
[  415.688633] R13: 0000000000000000 R14: 0000000000004018 R15: 0000562776ca7b20
[  415.690422] Mem-Info:
[  415.690977] active_anon:393 inactive_anon:527221 isolated_anon:0
[  415.690977]  active_file:141591 inactive_file:299055 isolated_file:0
[  415.690977]  unevictable:0 dirty:15719 writeback:0
[  415.690977]  slab_reclaimable:21888 slab_unreclaimable:24631
[  415.690977]  mapped:178848 shmem:8372 pagetables:7326 bounce:0
[  415.690977]  free:189917 free_pcp:2420 free_cma:0
[  415.698318] Node 0 active_anon:1572kB inactive_anon:2108884kB active_file:566364kB inactive_file:1196220kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:715392kB dirty:63048kB writeback:0kB shmem:33488kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 632832kB writeback_tmp:0kB kernel_stack:12896kB all_unreclaimable? yes
[  415.704864] Node 0 DMA free:15872kB min:132kB low:164kB high:196kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  415.710954] lowmem_reserve[]: 0 2952 7858 7858 7858
[  415.712098] Node 0 DMA32 free:710704kB min:82344kB low:88676kB high:95008kB reserved_highatomic:0KB active_anon:0kB inactive_anon:43532kB active_file:504kB inactive_file:5056kB unevictable:0kB writepending:32kB present:3129216kB managed:3063680kB mlocked:0kB pagetables:9400kB bounce:0kB free_pcp:1468kB local_pcp:964kB free_cma:0kB
[  415.718768] lowmem_reserve[]: 0 0 4905 4905 4905
[  415.719839] Node 0 Normal free:32352kB min:81020kB low:91544kB high:102068kB reserved_highatomic:47104KB active_anon:1572kB inactive_anon:2065360kB active_file:565860kB inactive_file:1191004kB unevictable:0kB writepending:63184kB present:5242880kB managed:5036796kB mlocked:0kB pagetables:19904kB bounce:0kB free_pcp:8504kB local_pcp:1388kB free_cma:0kB
[  415.726961] lowmem_reserve[]: 0 0 0 0 0
[  415.727842] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15872kB
[  415.730836] Node 0 DMA32: 177726*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 710904kB
[  415.733508] Node 0 Normal: 2776*4kB (UMH) 70*8kB (MEH) 86*16kB (MEH) 193*32kB (MEH) 118*64kB (MEH) 14*128kB (MEH) 3*256kB (E) 6*512kB (MH) 0*1024kB 0*2048kB 0*4096kB = 32400kB
[  415.737131] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  415.739194] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  415.741088] 445689 total pagecache pages
[  415.742056] 0 pages in swap cache
[  415.742819] Swap cache stats: add 0, delete 0, find 0/0
[  415.744014] Free swap  = 998396kB
[  415.744758] Total swap = 998396kB
[  415.745626] 2097021 pages RAM
[  415.746328] 0 pages HighMem/MovableOnly
[  415.747185] 67926 pages reserved
[  415.747911] 0 pages hwpoisoned
[  415.756218] [tempesta fw] HTTP/2 connection is closed by client with error code: 6, ID of last processed stream: 0
[  415.758313] [tempesta fw] ERROR: cannot establish a new client connection
[  415.759595] ------------[ cut here ]------------
[  415.760533] kernel BUG at mm/slub.c:304!
[  415.761396] invalid opcode: 0000 [#1] SMP NOPTI
[  415.762430] CPU: 3 PID: 5299 Comm: curl Tainted: G           OE     5.10.35+ #1
[  415.764013] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  415.766424] RIP: 0010:kmem_cache_free+0x3ea/0x410
[  415.767579] Code: 00 00 00 e9 73 fe ff ff 4c 8b 0c 24 48 8b 4c 24 08 48 89 da 48 89 ee 41 b8 01 00 00 00 4c 89 f7 e8 7b c4 ff ff e9 e6 fe ff ff <0f> 0b 48 8b 05 6d dc 37 01 e9 54 fd ff ff e8 83 55 62 00 48 8b 05
[  415.772090] RSP: 0018:ffffb7c500518cd8 EFLAGS: 00010246
[  415.773313] RAX: ffff967b4bfc2968 RBX: ffff967b4bfc2968 RCX: ffff967b4bfc2d90
[  415.774942] RDX: 0000000000000006 RSI: fffff0ff40000000 RDI: ffff967b4bfc2968
[  415.776572] RBP: fffff0ff482ff000 R08: 0000000000000000 R09: ffffb7c500518b60
[  415.778197] R10: ffffb7c500518b58 R11: ffff967b7fec4228 R12: ffff967b4bfc2968
[  415.779807] R13: 0000000000000000 R14: ffff967a69f50800 R15: fffff0ff482ff000
[  415.781941] FS:  00007f5420152b80(0000) GS:ffff967b75ec0000(0000) knlGS:0000000000000000
[  415.783374] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  415.784413] CR2: 00007f0c506e0390 CR3: 00000002130b8003 CR4: 0000000000370ee0
[  415.785690] Call Trace:
[  415.786180]  
[  415.786595]  ? tfw_sock_clnt_new+0x27b/0x2f0 [tempesta_fw]
[  415.787597]  tfw_sock_clnt_new+0x27b/0x2f0 [tempesta_fw]
[  415.788558]  ss_tcp_state_change+0xd4/0x220 [tempesta_fw]
[  415.789540]  tcp_rcv_state_process+0xb39/0xf30
[  415.790369]  ? tcp_check_req+0x187/0x550
[  415.791102]  tcp_child_process+0x94/0x180
[  415.791835]  tcp_v4_rcv+0x9b7/0xd80
[  415.792478]  ip_protocol_deliver_rcu+0x2b/0x1b0
[  415.793323]  ip_local_deliver_finish+0x44/0x50
[  415.794155]  __netif_receive_skb_one_core+0x87/0xa0
[  415.794987]  process_backlog+0x96/0x160
[  415.795647]  net_rx_action+0x145/0x3e0
[  415.796314]  ? try_to_wake_up+0x1d1/0x4f0
[  415.797006]  __do_softirq+0xcf/0x284
[  415.797624]  asm_call_irq_on_stack+0x12/0x20
[  415.798389]  
[  415.798779]  do_softirq_own_stack+0x37/0x40
[  415.799493]  do_softirq+0x5e/0x70
[  415.800076]  __local_bh_enable_ip+0x4b/0x50
[  415.800809]  ip_finish_output2+0x1ab/0x590
[  415.801495]  ? ipv4_link_failure+0x1b0/0x1b0
[  415.802266]  __ip_queue_xmit+0x180/0x410
[  415.802929]  ? lock_timer_base+0x61/0x80
[  415.803585]  __tcp_transmit_skb+0xa0e/0xbc0
[  415.804299]  tcp_rcv_state_process+0xca1/0xf30
[  415.805043]  tcp_v4_do_rcv+0xb8/0x1f0
[  415.805663]  __release_sock+0x73/0xb0
[  415.806317]  release_sock+0x2b/0x90
[  415.807098]  inet_stream_connect+0x43/0x50
[  415.808004]  __sys_connect+0x9b/0xd0
[  415.808795]  ? do_fcntl+0x1ff/0x6b0
[  415.809560]  __x64_sys_connect+0x16/0x20
[  415.810438]  do_syscall_64+0x33/0x80
[  415.811222]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  415.812314] RIP: 0033:0x7f54206f7223
[  415.813085] Code: 8b 15 a9 ad 00 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 18 89 54 24 0c 48
[  415.817109] RSP: 002b:00007ffd1af6d7e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
[  415.818787] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54206f7223
[  415.820306] RDX: 0000000000000010 RSI: 00007ffd1af6d9a0 RDI: 0000000000000007
[  415.821845] RBP: 0000562776d6ecc0 R08: 000000000000019f R09: 00312e302e302e37
[  415.823404] R10: 00007ffd1afc8080 R11: 0000000000000246 R12: 0000562776d6d2d0
[  415.824929] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
[  415.826513] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) tcp_diag(E) inet_diag(E) sha256_ssse3(E) sha512_ssse3(E) sha512_generic(E) veth(E) uinput(E) xt_nat(E) xt_tcpudp(E) xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack_netlink(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) xfrm_user(E) xfrm_algo(E) nft_counter(E) xt_addrtype(E) nft_compat(E) nf_tables(E) libcrc32c(E) nfnetlink(E) br_netfilter(E) bridge(E) stp(E) llc(E) rfcomm(E) bnep(E) btusb(E) btrtl(E) btbcm(E) btintel(E) bluetooth(E) jitterentropy_rng(E) drbg(E) snd_ens1371(E) snd_ac97_codec(E) ac97_bus(E) gameport(E) snd_rawmidi(E) intel_rapl_msr(E) intel_rapl_common(E) vsock_loopback(E) intel_pmc_core(E) vmw_vsock_virtio_transport_common(E) aes_generic(E) snd_seq_device(E) ghash_clmulni_intel(E) overlay(E) aesni_intel(E) crypto_simd(E) snd_pcm(E) vmw_vsock_vmci_transport(E) cryptd(E) snd_timer(E) glue_helper(E) vmw_balloon(E) vsock(E) snd(E) ansi_cprng(E)
[  415.826558]  joydev(E) ecdh_generic(E) rfkill(E) ecc(E) libaes(E) sg(E) vmw_vmci(E) serio_raw(E) pcspkr(E) soundcore(E) evdev(E) ac(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc_t10dif(E) crct10dif_generic(E) hid_generic(E) usbhid(E) hid(E) sr_mod(E) cdrom(E) ata_generic(E) vmwgfx(E) ttm(E) crct10dif_pclmul(E) crct10dif_common(E) crc32_pclmul(E) drm_kms_helper(E) crc32c_intel(E) ata_piix(E) mptspi(E) cec(E) mptscsih(E) uhci_hcd(E) ehci_pci(E) mptbase(E) psmouse(E) ehci_hcd(E) libata(E) scsi_transport_spi(E) usbcore(E) drm(E) e1000(E) scsi_mod(E) i2c_piix4(E) button(E) [last unloaded: tempesta_lib]
[  415.859188] ---[ end trace 4df8dac09271f8cd ]---
[  415.860342] RIP: 0010:kmem_cache_free+0x3ea/0x410
[  415.861683] Code: 00 00 00 e9 73 fe ff ff 4c 8b 0c 24 48 8b 4c 24 08 48 89 da 48 89 ee 41 b8 01 00 00 00 4c 89 f7 e8 7b c4 ff ff e9 e6 fe ff ff <0f> 0b 48 8b 05 6d dc 37 01 e9 54 fd ff ff e8 83 55 62 00 48 8b 05
[  415.866251] RSP: 0018:ffffb7c500518cd8 EFLAGS: 00010246
[  415.867410] RAX: ffff967b4bfc2968 RBX: ffff967b4bfc2968 RCX: ffff967b4bfc2d90
[  415.868953] RDX: 0000000000000006 RSI: fffff0ff40000000 RDI: ffff967b4bfc2968
[  415.870496] RBP: fffff0ff482ff000 R08: 0000000000000000 R09: ffffb7c500518b60
[  415.872143] R10: ffffb7c500518b58 R11: ffff967b7fec4228 R12: ffff967b4bfc2968
[  415.873696] R13: 0000000000000000 R14: ffff967a69f50800 R15: fffff0ff482ff000
[  415.875340] FS:  00007f5420152b80(0000) GS:ffff967b75ec0000(0000) knlGS:0000000000000000
[  415.877129] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  415.878379] CR2: 00007f0c506e0390 CR3: 00000002130b8003 CR4: 0000000000370ee0
[  415.880028] Kernel panic - not syncing: Fatal exception in interrupt
[  416.067487] Kernel Offset: 0x39e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  416.075967] Rebooting in 1 seconds..
[  417.070387] ACPI MEMORY or I/O RESET_REG.
s0nx commented 1 year ago

Regarding this bug:

[  350.787612] skbuff: skb_under_panic: text:ffffffff89212c3c len:64 put:20 head:ffff904deb8c1e00 data:ffff904deb8c1df0 tail:0x30 end:0xc0 dev:
[  350.787981] ------------[ cut here ]------------
[  350.788284] kernel BUG at net/core/skbuff.c:111!

The problem occurs when we try to encode TCP header in the sk_buff which already carries encrypted payload. Here is the sk_buff dump before the encryption:

[  601.385906] [2] sk ffff888014acaf80 skb ffff8881278d4400: BEFORE ECNRYPT (AFTER FRAGMENT)
[  601.385906] skb len=56 headroom=36 headlen=56 tailroom=356
[  601.385907] mac=(-1,-1) net=(0,-1) trans=-1
[  601.385907] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[  601.385907] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
[  601.385907] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
[  601.385907] dev name=(efault) feat=0x(efault)
[  601.385908] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00
[  601.385908] skb headroom: 00000010: 45 00 00 58 cb 6f 00 00 40 06 b1 2e 7f 00 00 01
[  601.385908] skb headroom: 00000020: 7f 00 00 01
[  601.385909] skb linear:   00000000: 00 00 a5 01 04 00 00 00 01 88 61 1d 54 68 75 2c
[  601.385909] skb linear:   00000010: 20 32 36 20 4a 61 6e 20 32 30 32 33 20 31 30 3a
[  601.385909] skb linear:   00000020: 34 37 3a 30 34 20 47 4d 54 5f 09 74 65 78 74 2f
[  601.385910] skb linear:   00000030: 68 74 6d 6c 5c 03 31 36
[  601.385910] skb tailroom: 00000000: 81 88 ff ff 58 4d 8d 27 81 88 ff ff 00 00 00 00
[  601.385911] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385911] skb tailroom: 00000020: 00 00 00 00 0d e6 c6 55 f5 9a 28 a9 00 00 00 00
[  601.385911] skb tailroom: 00000030: 00 00 00 00 c0 90 ac 28 81 88 ff ff 00 00 00 00
[  601.385911] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385912] skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385912] skb tailroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385912] skb tailroom: 00000070: 00 00 00 00 d0 4d 8d 27 81 88 ff ff d0 4d 8d 27
[  601.385913] skb tailroom: 00000080: 81 88 ff ff e0 4d 8d 27 81 88 ff ff e0 4d 8d 27
[  601.385913] skb tailroom: 00000090: 81 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385913] skb tailroom: 000000a0: 00 00 00 00 30 45 d4 23 81 88 ff ff 00 00 00 00
[  601.385913] skb tailroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385914] skb tailroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385914] skb tailroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385914] skb tailroom: 000000e0: 00 00 00 00 02 88 00 00 03 00 00 00 00 00 00 00
[  601.385915] skb tailroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 58 4e 8d 27
[  601.385915] skb tailroom: 00000100: 81 88 ff ff 58 4e 8d 27 81 88 ff ff 00 00 00 00
[  601.385915] skb tailroom: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385915] skb tailroom: 00000120: 00 00 00 00 0d ac 4b 72 74 12 d4 56 00 00 00 00
[  601.385916] skb tailroom: 00000130: 00 00 00 00 c0 90 ac 28 81 88 ff ff 00 00 00 00
[  601.385916] skb tailroom: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385917] skb tailroom: 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.385917] skb tailroom: 00000160: 00 00 00 00
[  601.385945] [tempesta fw]       tfw_tls_encrypt: sk=ffff888014acaf80(snd_una=3240046344 snd_nxt=3240046344 limit=758)

It happens in __tcp_transmit_skb() when skb_push(skb, tcp_header_size) is called. For established connection we need 20 bytes for the actual header + 12 bytes for TCP options. By the time we get to __tcp_transmit_skb() we don't have enough headroom to accommodate the header, e.g.:

[  601.386116] skb len=69 headroom=23 headlen=69 tailroom=356
[  601.386117] mac=(-1,-1) net=(0,-1) trans=-1
[  601.386117] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[  601.386117] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
[  601.386117] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
[  601.386118] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00
[  601.386118] skb headroom: 00000010: 45 00 00 58 cb 6f 00
[  601.386118] skb linear:   00000000: 17 03 03 01 72 00 00 00 00 00 00 00 15 49 94 85
[  601.386119] skb linear:   00000010: 59 41 82 1a 5a ea 26 4a 67 10 24 c9 ca a7 99 f8
[  601.386119] skb linear:   00000020: e2 d3 6b cb e9 b7 ce 2d 37 24 31 94 88 c4 4c fe
[  601.386119] skb linear:   00000030: 61 dc 51 b8 a9 f9 f2 03 ac 62 31 14 66 6f 5d 69
[  601.386119] skb linear:   00000040: 96 f4 4f 3f a9
[  601.386120] skb tailroom: 00000000: 81 88 ff ff 58 4d 8d 27 81 88 ff ff 00 00 00 00
[  601.386120] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386120] skb tailroom: 00000020: 00 00 00 00 0d e6 c6 55 f5 9a 28 a9 00 00 00 00
[  601.386121] skb tailroom: 00000030: 00 00 00 00 c0 90 ac 28 81 88 ff ff 00 00 00 00
[  601.386121] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386121] skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386121] skb tailroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386122] skb tailroom: 00000070: 00 00 00 00 d0 4d 8d 27 81 88 ff ff d0 4d 8d 27
[  601.386122] skb tailroom: 00000080: 81 88 ff ff e0 4d 8d 27 81 88 ff ff e0 4d 8d 27
[  601.386122] skb tailroom: 00000090: 81 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386123] skb tailroom: 000000a0: 00 00 00 00 30 45 d4 23 81 88 ff ff 00 00 00 00
[  601.386123] skb tailroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386123] skb tailroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386123] skb tailroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386124] skb tailroom: 000000e0: 00 00 00 00 02 88 00 00 03 00 00 00 00 00 00 00
[  601.386124] skb tailroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 58 4e 8d 27
[  601.386124] skb tailroom: 00000100: 81 88 ff ff 58 4e 8d 27 81 88 ff ff 00 00 00 00
[  601.386125] skb tailroom: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386125] skb tailroom: 00000120: 00 00 00 00 0d ac 4b 72 74 12 d4 56 00 00 00 00
[  601.386126] skb tailroom: 00000130: 00 00 00 00 c0 90 ac 28 81 88 ff ff 00 00 00 00
[  601.386126] skb tailroom: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386126] skb tailroom: 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  601.386127] skb tailroom: 00000160: 00 00 00 00

Headroom has been decreased by 13 bytes after the encryption. We reserve 13 bytes (TLS header + IV) in ss_skb_expand_head_tail() inside tfw_tls_encrypt().

It's worth mentioning that the test uses lo iface with an MTU of 80.

After an investigation, I believe the problem is somewhere in http1.1 -> http2 message transformation code, e.g. tfw_h2_resp_adjust_fwd().

We do reuse sk_buffs which carry response from the backend and perform in-place http1.1 -> http2 transformation:

Parsing response from backend:
(skb ffff8881247b2800 was added to the msg)

[ 6162.583831] [tempesta fw]     Received 12 server data bytes on conn=00000000e54dc7d9 msg=00000000bd0572bd
[ 6162.583832] [tempesta fw]   parse 12 server data bytes (><title>404 ) on resp=00000000bd0572bd
[ 6162.583833] [tempesta fw]       enter FSM at state         pK-error
[ 6162.583834] [tempesta fw]       parser at RGen_BodyReadChunk: c=0x3e(>), p_off=0
[ 6162.583835] [tempesta fw]       store field chunk len=12 data=        pK-error(>) field=<        pK-error,0x0,13,        pK-error>
[ 6162.583836] [tempesta fw]     Response parsed: len=12 parsed=12 msg_len=180 ver=3 res=-1
[ 6162.583838] [tempesta fw]       GFSM move 0x3 -> 0x3: skb=        pK-error req=        pK-error resp=        pK-error
[ 6162.583839] [tempesta fw]       GFSM switch from fsm 0 at state 3 to fsm 7 at state 0xe00
[ 6162.583840] [tempesta fw]       GFSM exec fsm 7, state 0x2e00
[ 6162.583841] [tempesta fw]       GFSM move 0x2e00 -> 0xe1f: skb=        pK-error req=        pK-error resp=        pK-error
[ 6162.583842] [tempesta fw]       TFW_HTTP_FSM_RESP_CHUNK return code 0
[ 6162.583855] [tempesta fw]       [1]: ss_tcp_data_ready: sk=0000000078148f71 state=Established
[ 6162.583856] [tempesta fw]     Add skb ffff8881247b2800 to message 00000000bd0572bd
[ 6162.583858] skb len=24 headroom=68 headlen=24 tailroom=356
[ 6162.583858] mac=(2,14) net=(16,20) trans=36
[ 6162.583858] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[ 6162.583858] csum(0xd9e7d919 ip_summed=2 complete_sw=1 valid=1 level=0)
[ 6162.583858] hash(0xd5b285ba sw=0 l4=1) proto=0x0800 pkttype=0 iif=1
[ 6162.583860] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00
[ 6162.583861] skb headroom: 00000010: 45 00 00 58 3e 20 00 00 40 06 3e 7e 7f 00 00 01
[ 6162.583862] skb headroom: 00000020: 7f 00 00 01 1f 40 dc 0a ed a1 6f 10 5f 80 8b cd
[ 6162.583863] skb headroom: 00000030: 80 10 01 ff 3b 3b 00 00 01 01 08 0a 30 42 fe 6a
[ 6162.583864] skb headroom: 00000040: 30 42 fe 69
[ 6162.583865] skb linear:   00000000: 4e 6f 74 20 46 6f 75 6e 64 3c 2f 74 69 74 6c 65
[ 6162.583866] skb linear:   00000010: 3e 3c 2f 68 65 61 64 3e
[ 6162.583867] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583868] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583869] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583870] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583871] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583872] skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583873] skb tailroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583874] skb tailroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583875] skb tailroom: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583876] skb tailroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583877] skb tailroom: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583879] skb tailroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583879] skb tailroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583880] skb tailroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583881] skb tailroom: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583882] skb tailroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583883] skb tailroom: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583884] skb tailroom: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583885] skb tailroom: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583887] skb tailroom: 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583888] skb tailroom: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583889] skb tailroom: 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.583890] skb tailroom: 00000160: 00 00 00 00

...

Adding skb ffff8881247b2800 to sk->sk_write_queue, which carries transformed http/2 response to client:

[ 6162.584523] [tempesta fw]       [1]: ss_do_send: entail sk=ffff88815e2c3900 [#5] skb=ffff8881247b2800 tsize=1024 mark=0 tls_type=17
[ 6162.584525] skb len=24 headroom=68 headlen=24 tailroom=356
[ 6162.584525] mac=(-1,-1) net=(0,-1) trans=-1
[ 6162.584525] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[ 6162.584525] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
[ 6162.584525] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
[ 6162.584527] dev name=(efault) feat=0x(efault)
[ 6162.584528] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00
[ 6162.584529] skb headroom: 00000010: 45 00 00 58 3e 20 00 00 40 06 3e 7e 7f 00 00 01
[ 6162.584530] skb headroom: 00000020: 7f 00 00 01 1f 40 dc 0a ed a1 6f 10 5f 80 8b cd
[ 6162.584531] skb headroom: 00000030: 80 10 01 ff 3b 3b 00 00 01 01 08 0a 30 42 fe 6a
[ 6162.584532] skb headroom: 00000040: 30 42 fe 69
[ 6162.584533] skb linear:   00000000: 4e 6f 74 20 46 6f 75 6e 64 3c 2f 74 69 74 6c 65
[ 6162.584533] skb linear:   00000010: 3e 3c 2f 68 65 61 64 3e
[ 6162.584534] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584535] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584536] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584537] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584538] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584539] skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584540] skb tailroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584541] skb tailroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584542] skb tailroom: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584543] skb tailroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584544] skb tailroom: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584546] skb tailroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584546] skb tailroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584547] skb tailroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584548] skb tailroom: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584549] skb tailroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584550] skb tailroom: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584551] skb tailroom: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584552] skb tailroom: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584553] skb tailroom: 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584554] skb tailroom: 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584555] skb tailroom: 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6162.584556] skb tailroom: 00000160: 00 00 00 00

We might need to adjust this received skb somehow during response transformation.

s0nx commented 1 year ago

The issue was originally reproduced in a setup when both Tempesta and backend (nginx) are on the same machine and communication is being performed through loopback interface with MTU == 80.

I've tried to reproduce it with the backend on a different VM and there were no issues. Tempesta communicates with backend though the dedicated isolated network. The client (curl) communicates with Tempesta through the loopback.

     VM 1                                                 VM 2
 ┌────────────────────────────────────────┐       ┌───────────────────┐
 │  curl               Tempesta           │       │    Nginx          │
 │   │  ┌──────────────┐ │  │             │       │                   │
 │   └──┤ 'lo' (MTU 80)├─┘  │             │       │                   │
 │      └──────────────┘    │             │       │                   │
 │                     ┌────┴─────────────┤       ├───────────────────┤
 │                     │ eth0 (virtio-net)│       │  eth0 (virtio-net)│
 │                     │   MTU 80         │       │   MTU 80          │
 └─────────────────────┴───────────────┬──┘       └──┬────────────────┘
                                       │             │
                                   ┌───┴─────┬───────┴┐
                                   │ vnet0   │  vnet 1│
                                   │ MTU 80  │  MTU 80│
                                   ├──────────────────│
                                   │      virbr 1     │
                                   │       MTU 80     │
                                   └──────────────────┘
                                            HOST

The reason why we don't see a crash in this scenario is the way sk_buffs are allocated and reused when forming the reply to the client.

Just a quick recap of the data path when Nginx sends the reply (some irrelevant steps are omitted) and MTU is set to 80:

using loopback:

writev(...)                                       sendfile(...)
  do_writev()                                       . . .
    sock_write_iter()                                 sock_sendpage()
      sock_sendmsg()                                    kernel_sendpage()
        inet_sendmsg()                                    tcp_send_page()
          tcp_sendmsg()                                     tcp_sendpage_locked()
            tcp_sendmsg_locked()                              do_tcp_sendpages() ─┐
              tcp_push()     <────────────────────────────────────────────────────┘
                __tcp_push_pending_frames()
                  tcp_write_xmit()
                    tcp_transmit_skb()
                      __tcp_transmit_skb()
                        ip_queue_xmit()
                          __ip_queue_xmit()
                            ip_local_out()
                              dst_output()
                                __ip_finish_output()
                                      |
                                      V
  ┌─────────────────────────────────────────────────┐
  │ip_finish_output_gso():                          │
  ├─────────────────────────────────────────────────┤
  │ perform skb segmentation:                       │
  │     struct sk_buff *segs = skb_gso_segment()    │
  │                                  │              │
  │                                  V              │
  │                           __sbb_gso_segment()   │
  │                                  |              │
  │                           skb_max_gso_segment() │
  │                                   |             │
  │                           inet_gso_segment()    │
  │                                   │             │
  │                           ┌───────┴──────────┐  │
  │                           │tcp4_gso_segment()│  │
  │                           └──────────────────┘  │
  │ perform IP fragmentation:                       │
  │     for_each seg in (segs):                     │
  │           ip_fragment() -> ip_do_fragment()     │
  │                                   |             │
  │                           ┌───────────────────┐ │
  │                           │ip_finish_output2()│ │
  └───────────────────────────┴────────┬──────────┴─┘
         ┌─────────────────────────────┘
         V
  neigh_output() (include/net/neighbour.h)
    neigh_hh_output()
      neigh_direct_output() (net/core/neighbour.c)
        dev_queue_xmit()
          __dev_queue_xmit()
            dev_hard_start_xmit()
              xmit_one()
                netdev_start_xmit()                      sk_buff starts traversing
                  ops->ndo_start_xmit()                  the stack up from here
                    loopback_xmit()                              ^
                      netif_rx()  ───────────────────────────────┘

While sending out an sk_buff from tcp_sendmsg_locked() we can see that there is enough headroom (TCP_MAX_HEADER == 320) for all the headers. sk_buff allocation is happening inside the tcp_sendmsg_locked() via regular API:

. . .
            skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation,
                          first_skb);
. . .
sending the response via writev() (transport headers have not been encoded yet)

[39899.757009] [1] sk ffff88801a031a40 skb ffff888125b13000: TCP_SENDMSG_LOCKED
[39899.757009] skb len=236 headroom=320 headlen=0 tailroom=0
[39899.757009] mac=(-1,-1) net=(0,-1) trans=-1
[39899.757010] shinfo(txflags=0 nr_frags=1 gso(size=0 type=0 segs=0))
[39899.757010] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
[39899.757011] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
[39899.757012] flist 0000000000000000 next ffff88801a031ba8 prev ffff88801a031ba8
[39899.757012] head ffff888125b13200 data ffff888125b13340
[39899.757012] tail ffff888125b13340  end 0x140
[39899.757013] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757013] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757015] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757015] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757016] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757016] skb headroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757017] skb headroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757018] skb headroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757018] skb headroom: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757019] skb headroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757019] skb headroom: 000000a0: 00 00 00 00 00 00 00 00 28 aa d0 c8 28 aa d0 c8  ........(...(...
[39899.757020] skb headroom: 000000b0: 00 00 00 00 10 00 00 00 89 5e c0 0e 00 00 00 00  .........^......
[39899.757020] skb headroom: 000000c0: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757021] skb headroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757021] skb headroom: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757022] skb headroom: 000000f0: 20 00 00 00 00 00 00 00 0e 00 00 00 00 00 60 00   .............`.
[39899.757022] skb headroom: 00000100: e0 81 00 00 00 00 00 00 20 01 10 00 00 00 00 00  ........ .......
[39899.757023] skb headroom: 00000110: 01 00 00 00 bf 10 6d 4b 00 00 00 00 02 00 00 00  ......mK........
[39899.757023] skb headroom: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757024] skb headroom: 00000130: 08 00 20 01 0c 01 fe 00 40 01 00 00 40 01 00 00  .. .....@...@...
[39899.757024] frag # 0 >>
[39899.757025] skb frag pg: -00000000: 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
[39899.757025] skb frag pg: -00000010: 0a 53 65 72 76 65 72 3a 20 6e 67 69 6e 78 2f 31  .Server: nginx/1
[39899.757026] skb frag pg: -00000020: 2e 32 32 2e 30 0d 0a 44 61 74 65 3a 20 54 68 75  .22.0..Date: Thu
[39899.757027] skb frag pg: -00000030: 2c 20 30 39 20 46 65 62 20 32 30 32 33 20 30 39  , 09 Feb 2023 09
[39899.757027] skb frag pg: -00000040: 3a 30 33 3a 31 32 20 47 4d 54 0d 0a 43 6f 6e 74  :03:12 GMT..Cont
[39899.757028] skb frag pg: -00000050: 65 6e 74 2d 54 79 70 65 3a 20 74 65 78 74 2f 68  ent-Type: text/h
[39899.757028] skb frag pg: -00000060: 74 6d 6c 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e  tml..Content-Len
[39899.757029] skb frag pg: -00000070: 67 74 68 3a 20 36 34 0d 0a 4c 61 73 74 2d 4d 6f  gth: 64..Last-Mo
[39899.757029] skb frag pg: -00000080: 64 69 66 69 65 64 3a 20 57 65 64 2c 20 30 38 20  dified: Wed, 08
[39899.757030] skb frag pg: -00000090: 46 65 62 20 32 30 32 33 20 32 32 3a 33 34 3a 30  Feb 2023 22:34:0
[39899.757030] skb frag pg: -000000a0: 37 20 47 4d 54 0d 0a 43 6f 6e 6e 65 63 74 69 6f  7 GMT..Connectio
[39899.757031] skb frag pg: -000000b0: 6e 3a 20 6b 65 65 70 2d 61 6c 69 76 65 0d 0a 45  n: keep-alive..E
[39899.757031] skb frag pg: -000000c0: 54 61 67 3a 20 22 36 33 65 34 32 33 35 66 2d 34  Tag: "63e4235f-4
[39899.757032] skb frag pg: -000000d0: 30 22 0d 0a 41 63 63 65 70 74 2d 52 61 6e 67 65  0"..Accept-Range
[39899.757032] skb frag pg: -000000e0: 73 3a 20 62 79 74 65 73 0d 0a 0d 0a              s: bytes....

sending response body via sendfile()

[41942.037112] [1] sk ffff88801a034ec0 skb ffff88807389a380: DO_TCP_SENDPAGES
[41942.037112] skb len=84 headroom=320 headlen=0 tailroom=0
[41942.037113] mac=(-1,-1) net=(0,-1) trans=-1
[41942.037113] shinfo(txflags=32 nr_frags=2 gso(size=0 type=0 segs=0))
[41942.037113] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
[41942.037114] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
[41942.037114] flist 0000000000000000 next ffff88801a035028 prev ffff88801a035028
[41942.037114] head ffff88807389a580 data ffff88807389a6c0
[41942.037114] tail ffff88807389a6c0  end 0x140
[41942.037115] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037115] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037115] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037115] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037116] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037116] skb headroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037116] skb headroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037116] skb headroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037117] skb headroom: 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037117] skb headroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037117] skb headroom: 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037117] skb headroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037118] skb headroom: 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037118] skb headroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037118] skb headroom: 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037118] skb headroom: 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037119] skb headroom: 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037119] skb headroom: 00000110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037119] skb headroom: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037119] skb headroom: 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[41942.037120] frag # 0 >>
[41942.037120] skb frag pg: -00000000: 70 74 2d 52 61 6e 67 65 73 3a 20 62 79 74 65 73  pt-Ranges: bytes
[41942.037120] skb frag pg: -00000010: 0d 0a 0d 0a                                      ....
[41942.037120] frag # 1 >>
[41942.037120] skb frag pg: -00000000: 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61  aaaaaaaaaaaaaaaa
[41942.037121] skb frag pg: -00000010: 61 61 61 61 61 61 61 61 61 61 61 61 61 62 62 62  aaaaaaaaaaaaabbb
[41942.037121] skb frag pg: -00000020: 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62  bbbbbbbbbbbbbbbb
[41942.037121] skb frag pg: -00000030: 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62 0a  bbbbbbbbbbbbbbb.

By the time an SKB would make it to netif_rx() it would be transformed into multiple SKBs due to the low MTU, which results in small MSS value.

There are 3 main stages of packet segmentation along the way:

  1. tcp_write_xmit() Here we traverse sk->sk_write_queue and trying to send as much data as possible with respect to the available window size of the receiver and congestion window rules.

    We initially set tso_segs according to the current MSS, then we check whether congestion window rules allow us to send at least one segment (tcp_cwnd_test()). Inside tcp_snd_wnd_test() we do want to know if there is enough space is send window to send a segment.

    If segmentation is required (tso_segs > 1), tcp_mss_split_point() will tell us the exact amount of payload that would fit 'inside the window'. This payload might be composed of several MSS sized segments. We also account for TLS overhead (34 bytes) later and bail out if there is not enough space for the encrypted payload + TLS header. The resulting limit is used chop SKB payload into multiple chunks:

    if (skb->len > limit &&
       unlikely(tso_fragment(sk, skb, limit, mss_now, gfp)))
          break;

    tso_fragment() / tcp_fragment() do not alter the headroom in any way at this stage. They only work with linear data (skb->data) and fragments, e.g.:

    -> limit = mss_now;
    [419594.028060] [1] SECOND LIMIT sk ffff888125a98980 skb ffff8881263f6380 limit 36
    -> limit -= TLS_MAX_HEADER
    [419594.028068] [1] AFTER ADJUST sk ffff888125a98980 skb ffff8881263f6380 limit 2
    
    [419594.028068] [1] sk ffff888125a98980 skb ffff8881263f6380: BEFORE FRAGMENT
    [419594.028068] skb len=9 headroom=320 headlen=9 tailroom=119
    [419594.028069] mac=(-1,-1) net=(0,-1) trans=-1
    [419594.028069] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
    [419594.028069] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
    [419594.028069] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
    [419594.028069] flist 0000000000000000 next ffff888125a98ae8 prev ffff888125a98ae8
    [419594.028070] head ffff8881263f6480 data ffff8881263f65c0
    [419594.028070] tail ffff8881263f65c9  end 0x1c0
    [419594.028071] dev name=(efault) feat=0x(efault)
    [419594.028071] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028071] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028071] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028072] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028072] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028072] skb headroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028072] skb headroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028072] skb headroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028073] skb headroom: 00000080: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028073] skb headroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028073] skb headroom: 000000a0: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
    [419594.028073] skb headroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028074] skb headroom: 000000c0: 00 90 9c b7 81 88 ff ff 3f 00 00 00 00 00 00 00  ........?.......
    [419594.028074] skb headroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028074] skb headroom: 000000e0: 00 00 00 00 00 00 00 00 03 74 f1 6c 81 88 ff ff  .........t.l....
    [419594.028074] skb headroom: 000000f0: 38 00 00 00 00 00 00 00 0e 00 00 00 00 00 61 00  8.............a.
    [419594.028075] skb headroom: 00000100: 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028075] skb headroom: 00000110: 01 00 00 00 41 0f 0f e9 00 00 00 00 02 00 00 00  ....A...........
    [419594.028075] skb headroom: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028075] skb headroom: 00000130: 08 00 24 00 10 00 02 00 5c 00 00 00 c0 00 00 00  ..$.....\.......
    [419594.028075] skb linear:   00000000: 00 00 00 04 01 00 00 00 00                       .........
    [419594.028076] skb tailroom: 00000000: 66 3f 26 81 88 ff ff 00 03 00 00 01 00 00 00 00  f?&.............
    [419594.028076] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028076] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028076] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028077] skb tailroom: 00000040: 00 00 00 00 00 08 00 45 00 00 53 18 1a 00 00 40  .......E..S....@
    [419594.028077] skb tailroom: 00000050: 06 64 89 7f 00 00 01 7f 00 00 01 01 bb 8a 4e 5d  .d............N]
    [419594.028077] skb tailroom: 00000060: 7d d4 54 e7 f4 22 13 80 10 01 ff 29 5f 00 00 01  }.T..".....)_...
    [419594.028077] skb tailroom: 00000070: 01 08 0a 32 19 8a 59                             ...2..Y
    
    [419594.028103] [1] sk ffff888125a98980 skb ffff8881263f6380: BEFORE ENCRYPT
    [419594.028103] skb len=2 headroom=320 headlen=2 tailroom=126
    [419594.028103] mac=(-1,-1) net=(0,-1) trans=-1
    [419594.028103] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
    [419594.028104] csum(0x0 ip_summed=3 complete_sw=0 valid=0 level=0)
    [419594.028104] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
    [419594.028104] flist 0000000000000000 next ffff88816cfcf000 prev ffff888125a98ae8
    [419594.028104] head ffff8881263f6480 data ffff8881263f65c0
    [419594.028104] tail ffff8881263f65c2  end 0x1c0
    [419594.028105] dev name=(efault) feat=0x(efault)
    [419594.028105] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028105] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028105] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028106] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028106] skb headroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028106] skb headroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028106] skb headroom: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028107] skb headroom: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028107] skb headroom: 00000080: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028107] skb headroom: 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028107] skb headroom: 000000a0: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
    [419594.028107] skb headroom: 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028108] skb headroom: 000000c0: 00 90 9c b7 81 88 ff ff 3f 00 00 00 00 00 00 00  ........?.......
    [419594.028108] skb headroom: 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028108] skb headroom: 000000e0: 00 00 00 00 00 00 00 00 03 74 f1 6c 81 88 ff ff  .........t.l....
    [419594.028108] skb headroom: 000000f0: 38 00 00 00 00 00 00 00 0e 00 00 00 00 00 61 00  8.............a.
    [419594.028109] skb headroom: 00000100: 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028109] skb headroom: 00000110: 01 00 00 00 41 0f 0f e9 00 00 00 00 02 00 00 00  ....A...........
    [419594.028109] skb headroom: 00000120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028109] skb headroom: 00000130: 08 00 24 00 10 00 02 00 5c 00 00 00 c0 00 00 00  ..$.....\.......
    [419594.028109] skb linear:   00000000: 00 00                                            ..
    [419594.028110] skb tailroom: 00000000: 00 04 01 00 00 00 00 66 3f 26 81 88 ff ff 00 03  .......f?&......
    [419594.028110] skb tailroom: 00000010: 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028110] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028110] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [419594.028110] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00  ..............E.
    [419594.028111] skb tailroom: 00000050: 00 53 18 1a 00 00 40 06 64 89 7f 00 00 01 7f 00  .S....@.d.......
    [419594.028111] skb tailroom: 00000060: 00 01 01 bb 8a 4e 5d 7d d4 54 e7 f4 22 13 80 10  .....N]}.T.."...
    [419594.028111] skb tailroom: 00000070: 01 ff 29 5f 00 00 01 01 08 0a 32 19 8a 59        ..)_......2..Y

    After this split we do perform encryption sk->sk_write_xmit() == tfw_tls_encrypt() on original SKB with the limit == 2, so only 2 bytes of encrypted payload (+ TLS headers) would reside in this skb.

  2. tcp4_gso_segment() -> tcp_gso_segment() -> skb_segment() Perform TCP header fragmentation. skb_segment() is quite complicated and not much of an interest for us right now.

  3. ip_fragment() -> ip_do_fragment() Perform IP header fragmentation

Due to the reason mentioned above, some of the SKBs do not have a complete TCP header in them.

SKBs on netif_rx():

segment #1:
14b - ETH hdr
20b - IPv4 hdr
32b - TCP hdr + TCP opts
24b - part of http/1 response

[39899.757123] [1] sk ffff88801a031a40 skb ffff8881267b0b80: IN LOOPBACK_XMIT
[39899.757123] skb len=90 headroom=2 headlen=90 tailroom=100
[39899.757123] mac=(2,14) net=(16,20) trans=36
[39899.757124] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.757124] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[39899.757125] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=0
[39899.757125] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[39899.757126] head ffff8881267b0c80 data ffff8881267b0c82
[39899.757126] tail ffff8881267b0cdc  end 0xc0
[39899.757126] dev name=lo feat=0x0x00000510401d7c69
[39899.757127] sk family=2 type=1 proto=6
[39899.757127] skb headroom: 00000000: 00 00                                            ..
[39899.757128] skb linear:   00000000: 00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00  ..............E.
[39899.757128] skb linear:   00000010: 00 4c ac 43 20 00 40 06 b0 66 7f 00 00 01 7f 00  .L.C .@..f......
[39899.757129] skb linear:   00000020: 00 01 1f 40 97 04 c8 d0 aa 28 0e c0 5e 8f 80 10  ...@.....(..^...
[39899.757129] skb linear:   00000030: 01 ff 87 a3 00 00 01 01 08 0a ba 9d 27 90 ba 9d  ............'...
[39899.757130] skb linear:   00000040: 27 8f|48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f  '.HTTP/1.1 200 O
[39899.757130] skb linear:   00000050: 4b 0d 0a 53 65 72 76 65 72 3a|                   K..Server:
[39899.757131] skb tailroom: 00000000: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757132] skb tailroom: 00000010: 00 00 00 00 00 9a dc 06 00 ea ff ff 04 00 00 00  ................
[39899.757132] skb tailroom: 00000020: 8c 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .r..............
[39899.757133] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757133] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757134] skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757135] skb tailroom: 00000060: 00 00 00 00                                      ....

segment #2

[39899.757185] [1] sk ffff88801a031a40 skb ffff8881267b0380: IN LOOPBACK_XMIT
[39899.757185] skb len=46 headroom=2 headlen=46 tailroom=16
[39899.757185] mac=(2,14) net=(16,20) trans=36
[39899.757186] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.757186] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[39899.757187] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=0
[39899.757187] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[39899.757187] head ffff8881267b0480 data ffff8881267b0482
[39899.757188] tail ffff8881267b04b0  end 0x40
[39899.757188] dev name=lo feat=0x0x00000510401d7c69
[39899.757188] sk family=2 type=1 proto=6
[39899.757189] skb headroom: 00000000: 00 00                                            ..
[39899.757189] skb linear:   00000000: 00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00  ..............E.
[39899.757190] skb linear:   00000010: 00 20 ac 43 00 07 40 06 d0 8b 7f 00 00 01 7f 00  . .C..@.........
[39899.757190] skb linear:   00000020: 00 01 20 6e 67 69 6e 78 2f 31 2e 32 32 2e        .. nginx/1.22.
[39899.757191] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

segment #3

[39899.757341] [1] sk ffff88801a031a40 skb ffff8881254f5000: IN LOOPBACK_XMIT
[39899.757342] skb len=90 headroom=2 headlen=90 tailroom=100
[39899.757342] mac=(2,14) net=(16,20) trans=36
[39899.757342] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.757343] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[39899.757343] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=0
[39899.757343] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[39899.757344] head ffff8881254f5100 data ffff8881254f5102
[39899.757344] tail ffff8881254f515c  end 0xc0
[39899.757344] dev name=lo feat=0x0x00000510401d7c69
[39899.757344] sk family=2 type=1 proto=6
[39899.757345] skb headroom: 00000000: 00 00                                            ..
[39899.757346] skb linear:   00000000: 00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00  ..............E.
[39899.757346] skb linear:   00000010: 00 4c ac 44 20 00 40 06 b0 65 7f 00 00 01 7f 00  .L.D .@..e......
[39899.757347] skb linear:   00000020: 00 01 1f 40 97 04 c8 d0 aa 4c 0e c0 5e 8f 80 10  ...@.....L..^...
[39899.757347] skb linear:   00000030: 01 ff 3d 1f 00 00 01 01 08 0a ba 9d 27 90 ba 9d  ..=.........'...
[39899.757348] skb linear:   00000040: 27 8f 30 0d 0a 44 61 74 65 3a 20 54 68 75 2c 20  '.0..Date: Thu,
[39899.757348] skb linear:   00000050: 30 39 20 46 65 62 20 32 30 32                    09 Feb 202
[39899.757349] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757349] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757349] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757350] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757350] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757351] skb tailroom: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.757351] skb tailroom: 00000060: 00 00 00 00                                      ....

. . .

last segment met in netif_rx() (partially contains the body)

[39899.758525] [1] sk ffff88801a031a40 skb ffff888125b13380: IN LOOPBACK_XMIT
[39899.758525] skb len=90 headroom=2 headlen=90 tailroom=100
[39899.758525] mac=(2,14) net=(16,20) trans=36
[39899.758526] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.758526] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[39899.758527] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=0
[39899.758527] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[39899.758528] head ffff888125b13480 data ffff888125b13482
[39899.758528] tail ffff888125b134dc  end 0xc0
[39899.758529] dev name=lo feat=0x0x00000510401d7c69
[39899.758529] sk family=2 type=1 proto=6
[39899.758529] skb headroom: 00000000: 00 00                                            ..
[39899.758530] skb linear:   00000000: 00 00 00 00 00 00 00 00 00 00 00 00 08 00 45 00  ..............E.
[39899.758530] skb linear:   00000010: 00 4c ac 4f 20 00 40 06 b0 5a 7f 00 00 01 7f 00  .L.O .@..Z......
[39899.758530] skb linear:   00000020: 00 01 1f 40 97 04 c8 d0 ab 00 0e c0 5e 8f 80 10  ...@........^...
[39899.758531] skb linear:   00000030: 01 ff 1c 96 00 00 01 01 08 0a ba 9d 27 92 ba 9d  ............'...
[39899.758531] skb linear:   00000040: 27 91 70 74 2d 52 61 6e 67 65 73 3a 20 62 79 74  '.pt-Ranges: byt
[39899.758531] skb linear:   00000050: 65 73 0d 0a 0d 0a 61 61 61 61                    es....aaaa
[39899.758532] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.758532] skb tailroom: 00000010: 00 00 00 00 24 00 00 00 24 00 00 00 00 00 00 00  ....$...$.......
[39899.758533] skb tailroom: 00000020: 00 00 67 00 60 00 00 00 00 00 00 00 00 00 00 00  ..g.`...........
[39899.758533] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.758534] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[39899.758534] skb tailroom: 00000050: 00 00 00 00 00 00 ff ff 00 00 ff ff 40 01 00 00  ............@...
[39899.758534] skb tailroom: 00000060: 40 01 00 00                                      @...

As i understand it, incoming packets on the loopback iface avoid GRO engine completely:

netif_rx(skb)
  netif_rx_internal(...)
    enqueue_to_backlog()
      ____napi_schedule()

If we would examine the receive queue of the client socket (sk->sk_receive_queue) ontcp_data_ready() call, we would see the following:

RX data path(processing SKBs from backlog):

→ net_rx_action()
  → napi_poll()
    → process_backlog()
      → __netif_receive_skb_one_core()
        → ip_local_deliver_finish()
          → ip_protocol_deliver_rcu()
            → tcp_v4_rcv()
              → tcp_v4_do_rcv()
                → tcp_rcv_established()
                  → tcp_data_ready()
sk->sk_receive_queue contents:

[39899.758660] [1] #0 sk ffff88801a035780 skb ffff8881267b0380: RECV QUEUE
[39899.758661] skb len=36 headroom=68 headlen=24 tailroom=0
[39899.758661] mac=(2,14) net=(16,20) trans=36
[39899.758661] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.758662] csum(0x19989969 ip_summed=2 complete_sw=1 valid=1 level=0)
[39899.758662] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=1
[39899.758662] flist ffff8881b70ae000 next ffff8881254f5b00 prev ffff88801a035850
[39899.758663] head ffff8881267b0c80 data ffff8881267b0cc4
[39899.758663] tail ffff8881267b0cdc  end 0xc0
[39899.758663] sk family=2 type=1 proto=6
[39899.758664] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00  ................
[39899.758664] skb headroom: 00000010: 45 00 00 58 ac 43 00 00 40 06 d0 5a 7f 00 00 01  E..X.C..@..Z....
[39899.758665] skb headroom: 00000020: 7f 00 00 01 1f 40 97 04 c8 d0 aa 28 0e c0 5e 8f  .....@.....(..^.
[39899.758665] skb headroom: 00000030: 80 10 01 ff 87 a3 00 00 01 01 08 0a ba 9d 27 90  ..............'.
[39899.758665] skb headroom: 00000040: ba 9d 27 8f                                      ..'.
[39899.758666] skb linear:   00000000: 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
[39899.758666] skb linear:   00000010: 0a 53 65 72 76 65 72 3a                          .Server:
[39899.758666] skb fraglist[ 0]:
[39899.758667] skb len=12 headroom=36 headlen=12 tailroom=16
[39899.758667] mac=(2,14) net=(16,20) trans=36
[39899.758667] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.758668] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[39899.758668] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=1
[39899.758668] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[39899.758669] head ffff8881267b0480 data ffff8881267b04a4
[39899.758669] tail ffff8881267b04b0  end 0x40
[39899.758669] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00  ................
[39899.758670] skb headroom: 00000010: 45 00 00 20 ac 43 00 07 40 06 d0 8b 7f 00 00 01  E.. .C..@.......
[39899.758671] skb headroom: 00000020: 7f 00 00 01                                      ....
[39899.758671] skb linear:   00000000: 20 6e 67 69 6e 78 2f 31 2e 32 32 2e               nginx/1.22.
[39899.758671] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

[39899.758672] [1] #1 sk ffff88801a035780 skb ffff8881254f5b00: RECV QUEUE
[39899.758673] skb len=36 headroom=68 headlen=24 tailroom=0
[39899.758673] mac=(2,14) net=(16,20) trans=36
[39899.758673] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.758673] csum(0x851a2de7 ip_summed=2 complete_sw=1 valid=1 level=0)
[39899.758674] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=1
[39899.758674] flist ffff8881b70aee00 next ffff8881b57fc000 prev ffff8881267b0380
[39899.758675] head ffff8881254f5100 data ffff8881254f5144
[39899.758675] tail ffff8881254f515c  end 0xc0
[39899.758675] sk family=2 type=1 proto=6
[39899.758676] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00  ................
[39899.758676] skb headroom: 00000010: 45 00 00 58 ac 44 00 00 40 06 d0 59 7f 00 00 01  E..X.D..@..Y....
[39899.758676] skb headroom: 00000020: 7f 00 00 01 1f 40 97 04 c8 d0 aa 4c 0e c0 5e 8f  .....@.....L..^.
[39899.758677] skb headroom: 00000030: 80 10 01 ff 3d 1f 00 00 01 01 08 0a ba 9d 27 90  ....=.........'.
[39899.758678] skb headroom: 00000040: ba 9d 27 8f                                      ..'.
[39899.758678] skb linear:   00000000: 30 0d 0a 44 61 74 65 3a 20 54 68 75 2c 20 30 39  0..Date: Thu, 09
[39899.758678] skb linear:   00000010: 20 46 65 62 20 32 30 32                           Feb 202
[39899.758679] skb fraglist[ 0]:
[39899.758679] skb len=12 headroom=36 headlen=12 tailroom=16
[39899.758679] mac=(2,14) net=(16,20) trans=36
[39899.758679] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[39899.758680] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[39899.758680] hash(0x4b6d10bf sw=0 l4=1) proto=0x0800 pkttype=0 iif=1
[39899.758681] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[39899.758681] head ffff8881254f5c00 data ffff8881254f5c24
[39899.758681] tail ffff8881254f5c30  end 0x40
[39899.758682] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00  ................
[39899.758682] skb headroom: 00000010: 45 00 00 20 ac 44 00 07 40 06 d0 8a 7f 00 00 01  E.. .D..@.......
[39899.758682] skb headroom: 00000020: 7f 00 00 01                                      ....
[39899.758683] skb linear:   00000000: 33 20 30 39 3a 30 33 3a 31 32 20 47              3 09:03:12 G
[39899.758683] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

Since we do not allocate new SKB on RX when using loopback, we would end up with an SKB, which was previously crafted by the segmentation engines on TX path.

When performing in-place http/1 -> http/2 transformation we would reuse this SKB, but since the headroom is tailored to fit IPv4 + TCP only, we won't be able to push anything else in there.

When the backend is on the other VM (see the setup in the beginning), we should not hit this problem because SKBs are allocated in advance by the driver on the network adapter.

Example RX data path for virtio-net adapter:

asm_common_interrupt()
  common_interrupt()
    irq_exit_rcu()
      __irq_exit_rcu()
        invoke_softirq()
          do_softirq_own_stack()
            run_on_irqstack_cond()
              __run_on_irqstack()
                asm_call_on_stack()
                  __do_softirq()
                    net_rx_action()
                      napi_poll()
                        virtnet_poll()       drivers/net/virtio_net.c
                          virtnet_receive()
                            receive_buf()
                               ┌────────────────────┬────────────────────────┐
           enter GRO engine -> │ napi_gro_receive() ├─> dev_gro_receive()    │
                               ├────────────────────┘          │             │
                               │                               V             │
                               │                       inet_gro_receive()    │
                               │                               │             │
                               │                               v             │
                               │                       ┌───────────────────┐ │
                               │                       │ tcp4_gro_receive()│ │
                               │                       └───────────────────┘ │
                               └──────┬──────────────────────────────────────┘
                                      V
                                  napi_skb_finish()
                                    gro_normal_one()
                                      gro_normal_list()
                                        netif_receive_skb_internal() <- enqueue to backlog
Receiving parts of the response from the other VM:

segment #1

[54283.574253] [3] sk 0000000000000000 skb ffff88802909f000: VIRTIO
[54283.574253] skb len=76 headroom=78 headlen=76 tailroom=38
[54283.574254] mac=(64,-64) net=(0,-1) trans=-1
[54283.574254] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.574254] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[54283.574254] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
[54283.574255] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[54283.574255] head ffff88802909f100 data ffff88802909f14e
[54283.574255] tail ffff88802909f19a  end 0xc0
[54283.574256] dev name=enp10s0 feat=0x0x0000010000174a29
[54283.574256] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574256] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574257] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574257] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574258] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00        RT..j8RT.JN~..
[54283.574258] skb linear:   00000000: 45 00 00 4c d2 bc 20 00 40 06 a7 0b c0 a8 2f 9f  E..L.. .@...../.
[54283.574258] skb linear:   00000010: c0 a8 2f f4 1f 40 be 34 22 4c c0 ff e2 ab 98 02  ../..@.4"L......
[54283.574259] skb linear:   00000020: 80 10 01 ff 19 da 00 00 01 01 08 0a d7 06 6e 3f  ..............n?
[54283.574259] skb linear:   00000030: 45 6c 1f ae 48 54 54 50 2f 31 2e 31 20 32 30 30  El..HTTP/1.1 200
[54283.574260] skb linear:   00000040: 20 4f 4b 0d 0a 53 65 72 76 65 72 3a               OK..Server:
[54283.574261] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574261] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574261] skb tailroom: 00000020: 00 00 00 00 00 00                                ......

segment #2

[54283.574289] [3] sk 0000000000000000 skb ffff88802909fb00: VIRTIO
[54283.574290] skb len=32 headroom=78 headlen=32 tailroom=82
[54283.574290] mac=(64,-64) net=(0,-1) trans=-1
[54283.574290] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.574290] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[54283.574291] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
[54283.574291] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[54283.574291] head ffff88802909fc00 data ffff88802909fc4e
[54283.574292] tail ffff88802909fc6e  end 0xc0
[54283.574292] dev name=enp10s0 feat=0x0x0000010000174a29
[54283.574292] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574293] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574293] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574293] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574294] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00        RT..j8RT.JN~..
[54283.574294] skb linear:   00000000: 45 00 00 20 d2 bc 00 07 40 06 c7 30 c0 a8 2f 9f  E.. ....@..0../.
[54283.574294] skb linear:   00000010: c0 a8 2f f4 20 6e 67 69 6e 78 2f 31 2e 32 32 2e  ../. nginx/1.22.
[54283.574295] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574295] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574296] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574296] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574296] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574297] skb tailroom: 00000050: 00 00                                            ..

. . .

first segment with the body:

[54283.575385] [3] sk 0000000000000000 skb ffff88800d3e9700: VIRTIO
[54283.575385] skb len=32 headroom=78 headlen=32 tailroom=82
[54283.575385] mac=(64,-64) net=(0,-1) trans=-1
[54283.575386] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.575386] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[54283.575386] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
[54283.575387] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[54283.575387] head ffff88800d3e9800 data ffff88800d3e984e
[54283.575387] tail ffff88800d3e986e  end 0xc0
[54283.575388] dev name=enp10s0 feat=0x0x0000010000174a29
[54283.575388] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575388] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575389] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575389] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575390] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00        RT..j8RT.JN~..
[54283.575390] skb linear:   00000000: 45 00 00 20 d2 c2 00 07 40 06 c7 2a c0 a8 2f 9f  E.. ....@..*../.
[54283.575390] skb linear:   00000010: c0 a8 2f f4 71 71 71 71 71 71 71 71 71 71 71 71  ../.qqqqqqqqqqqq
[54283.575391] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575391] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575392] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575392] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575392] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575393] skb tailroom: 00000050: 00 00                                            ..

last two segments containing response body:

[54283.575605] [3] sk 0000000000000000 skb ffff88800d3d9b00: VIRTIO
[54283.575606] skb len=32 headroom=78 headlen=32 tailroom=82
[54283.575606] mac=(64,-64) net=(0,-1) trans=-1
[54283.575606] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.575607] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[54283.575607] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
[54283.575607] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[54283.575607] head ffff88800d3d9c00 data ffff88800d3d9c4e
[54283.575608] tail ffff88800d3d9c6e  end 0xc0
[54283.575608] dev name=enp10s0 feat=0x0x0000010000174a29
[54283.575608] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575609] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575609] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575609] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575610] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00        RT..j8RT.JN~..
[54283.575610] skb linear:   00000000: 45 00 00 20 d2 c5 00 07 40 06 c7 27 c0 a8 2f 9f  E.. ....@..'../.
[54283.575611] skb linear:   00000010: c0 a8 2f f4 77 77 77 77 77 77 77 77 77 77 77 77  ../.wwwwwwwwwwww
[54283.575611] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575611] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575612] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575612] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575612] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575613] skb tailroom: 00000050: 00 00                                            ..
[54283.575641] [3] sk 0000000000000000 skb ffff88800d3d9300: VIRTIO
[54283.575641] skb len=57 headroom=78 headlen=57 tailroom=57
[54283.575641] mac=(64,-64) net=(0,-1) trans=-1
[54283.575642] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.575642] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[54283.575642] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=0
[54283.575643] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[54283.575643] head ffff88800d3d9400 data ffff88800d3d944e
[54283.575644] tail ffff88800d3d9487  end 0xc0
[54283.575644] dev name=enp10s0 feat=0x0x0000010000174a29
[54283.575644] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575645] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575645] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575646] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575646] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00        RT..j8RT.JN~..
[54283.575646] skb linear:   00000000: 45 00 00 39 d2 c6 00 00 40 06 c7 14 c0 a8 2f 9f  E..9....@...../.
[54283.575647] skb linear:   00000010: c0 a8 2f f4 1f 40 be 34 22 4c c2 67 e2 ab 98 02  ../..@.4"L.g....
[54283.575647] skb linear:   00000020: 80 18 01 ff b3 a5 00 00 01 01 08 0a d7 06 6e 3f  ..............n?
[54283.575647] skb linear:   00000030: 45 6c 1f af 77 77 77 77 0a                       El..wwww.
[54283.575648] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575648] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575649] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.575649] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00                       .........

Contents of the receive queue of Tempesta server socket (sk->sk_receive_queue) on tcp_data_ready() -> ss_tcp_data_ready() call:

[54283.574559] [3] #0 sk ffff8881233671c0 skb ffff88802909fb00: RECV QUEUE
[54283.574559] skb len=36 headroom=130 headlen=24 tailroom=0
[54283.574560] mac=(64,14) net=(78,20) trans=98
[54283.574560] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.574560] csum(0x1546bbd8 ip_summed=2 complete_sw=1 valid=1 level=0)
[54283.574560] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=4
[54283.574561] flist ffff888120dcfe00 next ffff88802909f600 prev ffff888123367290
[54283.574561] head ffff88802909f100 data ffff88802909f182
[54283.574561] tail ffff88802909f19a  end 0xc0
[54283.574562] sk family=2 type=1 proto=6
[54283.574562] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574562] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574563] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574563] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574564] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00 45 00  RT..j8RT.JN~..E.
[54283.574564] skb headroom: 00000050: 00 58 d2 bc 00 00 40 06 c6 ff c0 a8 2f 9f c0 a8  .X....@...../...
[54283.574564] skb headroom: 00000060: 2f f4 1f 40 be 34 22 4c c0 ff e2 ab 98 02 80 10  /..@.4"L........
[54283.574565] skb headroom: 00000070: 01 ff 19 da 00 00 01 01 08 0a d7 06 6e 3f 45 6c  ............n?El
[54283.574565] skb headroom: 00000080: 1f ae                                            ..
[54283.574565] skb linear:   00000000: 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
[54283.574566] skb linear:   00000010: 0a 53 65 72 76 65 72 3a                          .Server:
[54283.574566] skb fraglist[ 0]:
[54283.574566] skb len=12 headroom=98 headlen=12 tailroom=82
[54283.574566] mac=(64,14) net=(78,20) trans=98
[54283.574567] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[54283.574567] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[54283.574567] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=4
[54283.574568] flist 0000000000000000 next 0000000000000000 prev 0000000000000000
[54283.574568] head ffff88802909fc00 data ffff88802909fc62
[54283.574568] tail ffff88802909fc6e  end 0xc0
[54283.574569] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574569] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574569] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574570] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574570] skb headroom: 00000040: 52 54 00 87 6a 38 52 54 00 4a 4e 7e 08 00 45 00  RT..j8RT.JN~..E.
[54283.574571] skb headroom: 00000050: 00 20 d2 bc 00 07 40 06 c7 30 c0 a8 2f 9f c0 a8  . ....@..0../...
[54283.574571] skb headroom: 00000060: 2f f4                                            /.
[54283.574571] skb linear:   00000000: 20 6e 67 69 6e 78 2f 31 2e 32 32 2e               nginx/1.22.
[54283.574572] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574572] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574572] skb tailroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574573] skb tailroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574573] skb tailroom: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[54283.574574] skb tailroom: 00000050: 00 00                                            ..

So in this scenario we might have enough headroom for in-place transformation, TLS headers.

I guess the actual window size doesn't play significant role here, since we mostly depend on available headroom, which is determined by GRO engine.

krizhanovsky commented 1 year ago

We're going to go upstream with our TLS, so it must work correctly on loopback with any MTU. We also should not to reject working on a small MTU, which allows at least 1 byte of payload since the TLS can carry various payloads.

Why not to make a slow path to allocate a new skb or rearrange the headers of an existing one, received from the tx loopback path, to fix the problem?

If we have too small MSS to issue a TLS record, then it seems we need to postpone the transmission. This case must also be tested.

s0nx commented 1 year ago

So almost all the problems reported here are caused by a lack of headroom space, as i've explained in the previous comment. I don't see some warnings, that were reported originally:

  1. [ 130.652572] WARNING: CPU: 0 PID: 12 at /home/ubuntu/tempesta_dev/fw/http_frame.c:232 tfw_h2_context_clear+0x1b/0x30 [tempesta_fw]
  2. WARNING: CPU: 2 PID: 15701 at /root/tempesta/fw/http.c:6500 tfw_http_msg_process_generic+0x409/0x670 [tempesta_fw] This is most likely because of these patches

This problem is caused by lack of memory and is not related to the aforementioned headroom issues.

krizhanovsky commented 1 year ago

We discussed today that for the skb issue with small MTU we should consider the changes in #1804 or, more probably, #1394, and more the skb scanning to the TLS encryption routine.

https://github.com/tempesta-tech/tempesta/wiki/Debugging-and-troubleshooting#kmemleak mentions Warning: Cannot grow HTTP data string . We discussed that the problem isn't in allocating TfwStr for each body chunks, but it still must be fixed - we should not allocate TfwStrs for the body chunks.

s0nx commented 1 year ago

I believe this problem is now fixed by PR 1837 by @const-t

s0nx commented 1 year ago

One problem remains here. When running the following test: curl --no-progress-meter --silent --show-error --output /dev/null --http2-prior-knowledge --insecure --parallel --parallel-immediate --parallel-max 10 --max-time 10 'https://127.0.0.1/[1-100]' on an iface with low MTU (80) and large enough payload (>= 64K), curl might hang with the following messages:

curl: (28) Operation timed out after 10000 milliseconds with 57322 out of 65537 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10001 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received
curl: (28) Operation timed out after 10000 milliseconds with 0 bytes received

I've debugged the issue, and it actually reveals itself when new sk_buff is being allocated inside tfw_tls_encrypt():

tfw_tls_encrypt()
    ss_skb_expand_head_tail(skb_tail->next, skb_tail, 0, TTLS_TAG_LEN)
        __new_pgfrag()
            __extend_pgfrags() -> ss_skb_alloc()

Since we're iterating sk->sk_write_queue when performing this allocation, we should fill some SKB fields here like seq numbers, CB, etc. We should also propagate correct seq numbers to all the SKBs after the newly added one.

This issue should be fixed by @EvgeniiMekhanik in #1845 i guess.