Closed by krizhanovsky 3 weeks ago
Found another suspicious KASAN bug:
[ 9877.219407] [tdb] Close table 'client0.tdb'
[ 9877.256909] [tdb] Close table 'sessions0.tdb'
[ 9877.344305] ==================================================================
[ 9877.346328] BUG: KASAN: use-after-free in _raw_spin_lock+0x6d/0xd0
[ 9877.347890] Write of size 4 at addr ffff88810349e318 by task apache2/24588
[ 9877.351382] CPU: 3 PID: 24588 Comm: apache2 Kdump: loaded Tainted: G OE 5.10.35+ #1
[ 9877.353120] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 9877.354943] Call Trace:
[ 9877.356671] <IRQ>
[ 9877.358487] dump_stack+0x96/0xc4
[ 9877.360693] print_address_description.constprop.0+0x21/0x220
[ 9877.362799] ? _raw_spin_lock_irqsave+0x8e/0xf0
[ 9877.364630] ? _raw_write_unlock_bh+0x30/0x30
[ 9877.366658] ? __kernel_text_address+0x12/0x40
[ 9877.368481] ? unwind_get_return_address+0x36/0x50
[ 9877.370231] ? _raw_spin_lock+0x6d/0xd0
[ 9877.372079] kasan_report.cold+0x20/0x37
[ 9877.373819] ? _raw_spin_lock+0x6d/0xd0
[ 9877.375573] check_memory_region+0x14d/0x1a0
[ 9877.377384] __kasan_check_write+0x14/0x20
[ 9877.379092] _raw_spin_lock+0x6d/0xd0
[ 9877.380781] ? _raw_write_lock_irqsave+0xe0/0xe0
[ 9877.382589] tdb_rec_get_alloc+0x95/0x210 [tempesta_db]
[ 9877.384449] ? sk_prot_alloc+0x4c/0x190
[ 9877.386106] ? sk_clone_lock+0x47/0x880
[ 9877.387724] ? tdb_close+0x70/0x70 [tempesta_db]
[ 9877.389390] ? ip6_input+0x15f/0x170
[ 9877.391026] ? ipv6_rcv+0x236/0x250
[ 9877.392798] ? __netif_receive_skb_one_core+0xeb/0x140
[ 9877.394445] ? __netif_receive_skb+0x26/0xb0
[ 9877.396035] ? __hash_calc+0x90/0x190 [tempesta_lib]
[ 9877.397667] tfw_client_obtain+0x266/0x360 [tempesta_fw]
[ 9877.399380] ? tfw_client_start+0x50/0x50 [tempesta_fw]
[ 9877.401174] ? __tcp_push_pending_frames+0x5f/0x1a0
[ 9877.402732] ? tcp_push+0x209/0x2b0
[ 9877.404274] ? tcp_sendmsg_locked+0x13e3/0x15e0
[ 9877.405808] ? _raw_spin_lock_irqsave+0x8e/0xf0
[ 9877.407333] ? tfw_client_obtain+0x360/0x360 [tempesta_fw]
[ 9877.408981] ? tfw_cfg_exit+0x10/0x10 [tempesta_fw]
[ 9877.410534] ? enqueue_timer+0xbd/0x1c0
[ 9877.412146] ? queued_spin_unlock+0x10/0x10 [tempesta_fw]
[ 9877.413666] ? __ipv6_addr_type+0xa5/0x140
[ 9877.415134] tfw_classify_conn_estab+0x17e/0x430 [tempesta_fw]
[ 9877.416579] ? __hrtimer_init+0xce/0x100
[ 9877.418017] ? frang_conn_limit+0x420/0x420 [tempesta_fw]
[ 9877.419418] ? memset+0x3c/0x50
[ 9877.421032] tempesta_new_clntsk+0x4d/0x80
[ 9877.422526] tcp_v6_syn_recv_sock+0x743/0xed0
[ 9877.424115] ? tcp_v6_do_rcv+0x780/0x780
[ 9877.425537] ? kfree_skbmem+0x4b/0x200
[ 9877.426906] ? tcp_parse_options+0x444/0x650
[ 9877.428250] tcp_check_req+0x27b/0x940
[ 9877.429571] ? tcp_child_process+0x2e0/0x2e0
[ 9877.430875] ? memmove+0x4e/0x60
[ 9877.432264] tcp_v6_rcv+0x10a1/0x1660
[ 9877.433593] ? tdb_init_mappings+0x151/0x190 [tempesta_db]
[ 9877.434820] ? tcp_v6_syn_recv_sock+0xed0/0xed0
[ 9877.436016] ip6_protocol_deliver_rcu+0x18b/0x880
[ 9877.437228] ip6_input+0x15f/0x170
[ 9877.438449] ? ip6_input_finish+0x20/0x20
[ 9877.439863] ? tfw_ipv6_nf_hook+0x1a3/0x230 [tempesta_fw]
[ 9877.441403] ? nf_hook_slow+0xb3/0xe0
[ 9877.442614] ipv6_rcv+0x236/0x250
[ 9877.443819] ? ip6_rcv_finish+0xf0/0xf0
[ 9877.445078] ? ip6_sublist_rcv+0x530/0x530
[ 9877.446310] ? try_to_wake_up+0x41e/0xa70
[ 9877.447560] ? ip6_rcv_finish+0xf0/0xf0
[ 9877.448793] __netif_receive_skb_one_core+0xeb/0x140
[ 9877.450058] ? napi_gro_receive+0x260/0x260
[ 9877.451656] ? __kasan_check_write+0x14/0x20
[ 9877.452891] ? _raw_spin_lock+0x7b/0xd0
[ 9877.454158] __netif_receive_skb+0x26/0xb0
[ 9877.456574] process_backlog+0xfe/0x290
[ 9877.457778] net_rx_action+0x287/0x6b0
[ 9877.459018] ? napi_complete_done+0x2c0/0x2c0
[ 9877.460295] ? switch_fpu_return+0x120/0x120
[ 9877.461919] __do_softirq+0xfe/0x383
[ 9877.463115] asm_call_irq_on_stack+0x12/0x20
[ 9877.464308] </IRQ>
[ 9877.465593] do_softirq_own_stack+0x3d/0x50
[ 9877.466886] do_softirq+0x72/0x90
[ 9877.468081] __local_bh_enable_ip+0x50/0x60
[ 9877.469391] ip6_finish_output2+0x593/0xaa0
[ 9877.470914] ? ip6_fraglist_init+0x360/0x360
[ 9877.472343] ? cgroup_bpf_release+0x400/0x400
[ 9877.473685] ? loopback_xmit+0x15e/0x1b0
[ 9877.474858] __ip6_finish_output+0x20e/0x380
[ 9877.476086] ip6_finish_output+0x2d/0xe0
[ 9877.477264] ip6_output+0x136/0x260
[ 9877.478420] ? ip6_finish_output+0xe0/0xe0
[ 9877.479604] ? netdev_core_pick_tx+0x120/0x120
[ 9877.480772] ? cgroup_bpf_release+0x400/0x400
[ 9877.482073] ip6_xmit+0x5d9/0xad0
[ 9877.483185] ? ip6_finish_output2+0xaa0/0xaa0
[ 9877.484298] ? ip6_dst_check+0xdd/0x1f0
[ 9877.485398] ? __sk_dst_check+0x6d/0xd0
[ 9877.486492] ? inet6_csk_route_socket+0x29e/0x3e0
[ 9877.487661] ? inet6_csk_addr2sockaddr+0xd0/0xd0
[ 9877.488790] ? __ip6_finish_output+0x20e/0x380
[ 9877.489892] inet6_csk_xmit+0x17e/0x220
[ 9877.490988] ? inet6_csk_route_socket+0x3e0/0x3e0
[ 9877.492118] ? __asan_loadN+0xf/0x20
[ 9877.493221] ? __tcp_select_window+0x147/0x370
[ 9877.494330] __tcp_transmit_skb+0xb81/0x1730
[ 9877.495445] ? __tcp_select_window+0x370/0x370
[ 9877.496553] ? __kasan_check_write+0x14/0x20
[ 9877.497676] ? _raw_write_unlock_bh+0x30/0x30
[ 9877.498768] ? tcp_tfw_sk_write_xmit+0x2b/0x1d0
[ 9877.499861] tcp_write_xmit+0x796/0x2860
[ 9877.500987] ? iov_iter_advance+0x12b/0x5f0
[ 9877.502127] ? bpf_skops_hdr_opt_len+0x210/0x210
[ 9877.503288] __tcp_push_pending_frames+0x5f/0x1a0
[ 9877.504384] tcp_push+0x209/0x2b0
[ 9877.505474] tcp_sendmsg_locked+0x13e3/0x15e0
[ 9877.506579] ? skb_do_copy_data_nocache+0x1b0/0x1b0
[ 9877.507690] ? __kasan_check_write+0x14/0x20
[ 9877.508808] ? _raw_spin_lock_bh+0x86/0xe0
[ 9877.509913] tcp_sendmsg+0x2d/0x50
[ 9877.511139] inet6_sendmsg+0x64/0xa0
[ 9877.512235] ? inet6_ioctl+0x1b0/0x1b0
[ 9877.513331] sock_sendmsg+0x80/0xa0
[ 9877.514424] sock_write_iter+0x14c/0x200
[ 9877.515680] ? sock_sendmsg+0xa0/0xa0
[ 9877.516771] ? sk_get_meminfo+0x1a0/0x1a0
[ 9877.517857] ? ptep_set_access_flags+0xd0/0xe0
[ 9877.518972] ? pgd_free+0x130/0x130
[ 9877.520156] new_sync_write+0x3aa/0x3c0
[ 9877.521242] ? new_sync_read+0x3c0/0x3c0
[ 9877.522324] ? __cgroup_bpf_run_filter_getsockopt+0x172/0x470
[ 9877.523511] ? __seccomp_filter+0x12e/0xad0
[ 9877.524636] ? rw_verify_area+0x7c/0x150
[ 9877.525743] vfs_write+0x311/0x3d0
[ 9877.526846] ksys_write+0x138/0x170
[ 9877.527940] ? __ia32_sys_read+0x50/0x50
[ 9877.529043] ? __secure_computing+0x74/0x150
[ 9877.530147] __x64_sys_write+0x42/0x50
[ 9877.531350] do_syscall_64+0x38/0x90
[ 9877.532456] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 9877.533620] RIP: 0033:0x7f3a02fc3887
[ 9877.534735] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77
51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 9877.537253] RSP: 002b:00007ffc0d84f728 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 9877.538623] RAX: ffffffffffffffda RBX: 00007f39ffee30a0 RCX: 00007f3a02fc3887
[ 9877.539966] RDX: 0000000000000064 RSI: 00007f39ffee3290 RDI: 000000000000000d
[ 9877.541282] RBP: 00007ffc0d84f760 R08: 0000000000000000 R09: 00007f3a02e139d8
[ 9877.542611] R10: 00007ffc0d84f714 R11: 0000000000000246 R12: 00007f39ffee3290
[ 9877.543917] R13: 00007f3a02d64048 R14: 000055fe2b1070f0 R15: 0000000000000000
[ 9877.546558] Allocated by task 24463:
[ 9877.547933] kasan_save_stack+0x23/0x50
[ 9877.549235] __kasan_kmalloc.constprop.0+0xcf/0xe0
[ 9877.550551] kasan_kmalloc+0x9/0x10
[ 9877.551918] kmem_cache_alloc_trace+0x132/0x240
[ 9877.553226] tdb_open+0x300/0x390 [tempesta_db]
[ 9877.554558] tfw_client_start+0x34/0x50 [tempesta_fw]
[ 9877.555908] tfw_ctlfn_state_io+0x3d5/0x4d0 [tempesta_fw]
[ 9877.557287] proc_sys_call_handler+0x227/0x390
[ 9877.558620] proc_sys_write+0x13/0x20
[ 9877.559941] new_sync_write+0x29d/0x3c0
[ 9877.561272] vfs_write+0x311/0x3d0
[ 9877.562580] ksys_write+0xcd/0x170
[ 9877.563877] __x64_sys_write+0x42/0x50
[ 9877.565172] do_syscall_64+0x38/0x90
[ 9877.566464] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 9877.569113] Freed by task 26066:
[ 9877.570441] kasan_save_stack+0x23/0x50
[ 9877.571768] kasan_set_track+0x20/0x30
[ 9877.573109] kasan_set_free_info+0x1f/0x40
[ 9877.574401] __kasan_slab_free+0x116/0x160
[ 9877.575691] kasan_slab_free+0xe/0x10
[ 9877.577029] slab_free_freelist_hook+0x6d/0x120
[ 9877.578349] kfree+0xbe/0x460
[ 9877.579647] tdb_close+0x69/0x70 [tempesta_db]
[ 9877.580990] tfw_client_stop+0x23/0x30 [tempesta_fw]
[ 9877.582343] tfw_mods_stop+0x51/0xe0 [tempesta_fw]
[ 9877.583736] tfw_ctlfn_state_io+0x24a/0x4d0 [tempesta_fw]
[ 9877.585068] proc_sys_call_handler+0x227/0x390
[ 9877.586413] proc_sys_write+0x13/0x20
[ 9877.587737] new_sync_write+0x29d/0x3c0
[ 9877.589036] vfs_write+0x311/0x3d0
[ 9877.590331] ksys_write+0xcd/0x170
[ 9877.591623] __x64_sys_write+0x42/0x50
[ 9877.592919] do_syscall_64+0x38/0x90
[ 9877.594213] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 9877.596865] The buggy address belongs to the object at ffff88810349e300
which belongs to the cache kmalloc-192 of size 192
[ 9877.599694] The buggy address is located 24 bytes inside of
192-byte region [ffff88810349e300, ffff88810349e3c0)
[ 9877.602303] The buggy address belongs to the page:
[ 9877.603605] page:00000000b431c515 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88810349e600 pfn:0x10349e
[ 9877.604957] flags: 0x17ffffc0000200(slab)
[ 9877.606217] raw: 0017ffffc0000200 ffffea00041ee588 ffffea000415a048 ffff888100043540
[ 9877.607553] raw: ffff88810349e600 000000000010000d 00000001ffffffff ffff88810ab3cc01
[ 9877.608876] page dumped because: kasan: bad access detected
[ 9877.610192] page->mem_cgroup:ffff88810ab3cc01
[ 9877.612805] Memory state around the buggy address:
[ 9877.614144] ffff88810349e200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 9877.615553] ffff88810349e280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 9877.616918] >ffff88810349e300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 9877.618236] ^
[ 9877.619544] ffff88810349e380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 9877.620900] ffff88810349e400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 9877.622259] ==================================================================
[ 9877.623627] Disabling lock debugging due to kernel taint
[ 9877.625243] BUG: unable to handle page fault for address: ffffed2da0420f9e
[ 9877.626408] #PF: supervisor read access in kernel mode
[ 9877.627535] #PF: error_code(0x0000) - not-present page
[ 9877.628887] PGD 13ffc9067 P4D 13ffc9067 PUD 0
[ 9877.630364] Oops: 0000 [#1] SMP KASAN NOPTI
[ 9877.631867] CPU: 3 PID: 24588 Comm: apache2 Kdump: loaded Tainted: G B OE 5.10.35+ #1
[ 9877.633383] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 9877.634964] RIP: 0010:check_memory_region+0xe3/0x1a0
[ 9877.636552] Code: 74 f2 41 bb 01 00 00 00 48 85 c0 75 5d 5b 44 89 d8 41 5c 41 5d 5d c3 48 85 d2 74 60 4c 01 e2 eb 09 48 83 c0 01 48 39 d0 74 52 <80> 38 00 74 f2 eb d1
41 bd 08 00 00 00 45 29 dd 4b 8d 54 25 00 eb
[ 9877.639969] RSP: 0018:ffff88810c9891b8 EFLAGS: 00010286
[ 9877.641744] RAX: ffffed2da0420f9e RBX: ffffed2da0420f9f RCX: ffffffffc1600aad
[ 9877.643519] RDX: ffffed2da0420f9f RSI: 0000000000000004 RDI: ffff896d02107cf0
[ 9877.645317] RBP: ffff88810c9891d0 R08: 0000000000000000 R09: ffff896d02107cf3
[ 9877.647169] R10: ffffed2da0420f9e R11: 0000000000000001 R12: ffffed2da0420f9e
[ 9877.648953] R13: ffff88810c989270 R14: ffff8880020f5e70 R15: ffff896d02107cf0
[ 9877.650728] FS: 00007f3a02e36780(0000) GS:ffff88810c980000(0000) knlGS:0000000000000000
[ 9877.652588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9877.654393] CR2: ffffed2da0420f9e CR3: 0000000102b3c000 CR4: 0000000000350ee0
[ 9877.656268] Call Trace:
[ 9877.658128] <IRQ>
[ 9877.659999] __asan_loadN+0xf/0x20
[ 9877.661875] tdb_htrie_descend+0x6d/0x130 [tempesta_db]
[ 9877.663746] tdb_htrie_lookup+0xbe/0x100 [tempesta_db]
[ 9877.665601] ? tdb_htrie_insert+0x790/0x790 [tempesta_db]
[ 9877.667475] ? _raw_spin_lock+0x6d/0xd0
[ 9877.669372] tdb_rec_get+0x85/0xf0 [tempesta_db]
[ 9877.671258] ? tdb_entry_get_room+0xb0/0xb0 [tempesta_db]
[ 9877.673123] ? _raw_write_lock_irqsave+0xe0/0xe0
[ 9877.674980] tdb_rec_get_alloc+0xb5/0x210 [tempesta_db]
[ 9877.676852] ? sk_prot_alloc+0x4c/0x190
[ 9877.678688] ? tdb_close+0x70/0x70 [tempesta_db]
[ 9877.680560] ? ip6_input+0x15f/0x170
[ 9877.682387] ? ipv6_rcv+0x236/0x250
[ 9877.684204] ? __netif_receive_skb_one_core+0xeb/0x140
[ 9877.685942] ? __netif_receive_skb+0x26/0xb0
[ 9877.687632] ? __hash_calc+0x90/0x190 [tempesta_lib]
[ 9877.689449] tfw_client_obtain+0x266/0x360 [tempesta_fw]
[ 9877.691179] ? tfw_client_start+0x50/0x50 [tempesta_fw]
[ 9877.692851] ? __tcp_push_pending_frames+0x5f/0x1a0
[ 9877.694519] ? tcp_push+0x209/0x2b0
[ 9877.696285] ? tcp_sendmsg_locked+0x13e3/0x15e0
[ 9877.697937] ? _raw_spin_lock_irqsave+0x8e/0xf0
[ 9877.699636] ? tfw_client_obtain+0x360/0x360 [tempesta_fw]
[ 9877.701432] ? tfw_cfg_exit+0x10/0x10 [tempesta_fw]
[ 9877.703094] ? enqueue_timer+0xbd/0x1c0
[ 9877.704750] ? queued_spin_unlock+0x10/0x10 [tempesta_fw]
[ 9877.706402] ? __ipv6_addr_type+0xa5/0x140
[ 9877.708045] tfw_classify_conn_estab+0x17e/0x430 [tempesta_fw]
[ 9877.709671] ? __hrtimer_init+0xce/0x100
[ 9877.711450] ? frang_conn_limit+0x420/0x420 [tempesta_fw]
[ 9877.713061] ? memset+0x3c/0x50
[ 9877.714658] tempesta_new_clntsk+0x4d/0x80
[ 9877.716252] tcp_v6_syn_recv_sock+0x743/0xed0
[ 9877.717877] ? tcp_v6_do_rcv+0x780/0x780
[ 9877.719485] ? kfree_skbmem+0x4b/0x200
[ 9877.721279] ? tcp_parse_options+0x444/0x650
[ 9877.722811] tcp_check_req+0x27b/0x940
[ 9877.724309] ? tcp_child_process+0x2e0/0x2e0
[ 9877.725677] ? memmove+0x4e/0x60
[ 9877.727097] tcp_v6_rcv+0x10a1/0x1660
[ 9877.728496] ? tdb_init_mappings+0x151/0x190 [tempesta_db]
[ 9877.729861] ? tcp_v6_syn_recv_sock+0xed0/0xed0
[ 9877.731295] ip6_protocol_deliver_rcu+0x18b/0x880
[ 9877.732676] ip6_input+0x15f/0x170
[ 9877.734050] ? ip6_input_finish+0x20/0x20
[ 9877.735396] ? tfw_ipv6_nf_hook+0x1a3/0x230 [tempesta_fw]
[ 9877.736704] ? nf_hook_slow+0xb3/0xe0
[ 9877.737966] ipv6_rcv+0x236/0x250
[ 9877.739201] ? ip6_rcv_finish+0xf0/0xf0
[ 9877.740448] ? ip6_sublist_rcv+0x530/0x530
[ 9877.741700] ? try_to_wake_up+0x41e/0xa70
[ 9877.743382] ? ip6_rcv_finish+0xf0/0xf0
[ 9877.744929] __netif_receive_skb_one_core+0xeb/0x140
[ 9877.746348] ? napi_gro_receive+0x260/0x260
[ 9877.747582] ? __kasan_check_write+0x14/0x20
[ 9877.748814] ? _raw_spin_lock+0x7b/0xd0
[ 9877.750069] __netif_receive_skb+0x26/0xb0
[ 9877.751282] process_backlog+0xfe/0x290
[ 9877.752483] net_rx_action+0x287/0x6b0
[ 9877.753673] ? napi_complete_done+0x2c0/0x2c0
[ 9877.754867] ? switch_fpu_return+0x120/0x120
[ 9877.756058] __do_softirq+0xfe/0x383
[ 9877.757251] asm_call_irq_on_stack+0x12/0x20
[ 9877.758441] </IRQ>
[ 9877.759624] do_softirq_own_stack+0x3d/0x50
[ 9877.760818] do_softirq+0x72/0x90
[ 9877.761995] __local_bh_enable_ip+0x50/0x60
[ 9877.763171] ip6_finish_output2+0x593/0xaa0
[ 9877.764340] ? ip6_fraglist_init+0x360/0x360
[ 9877.765507] ? cgroup_bpf_release+0x400/0x400
[ 9877.766677] ? loopback_xmit+0x15e/0x1b0
[ 9877.767840] __ip6_finish_output+0x20e/0x380
[ 9877.768999] ip6_finish_output+0x2d/0xe0
[ 9877.770151] ip6_output+0x136/0x260
[ 9877.771292] ? ip6_finish_output+0xe0/0xe0
[ 9877.772435] ? netdev_core_pick_tx+0x120/0x120
[ 9877.773573] ? cgroup_bpf_release+0x400/0x400
[ 9877.774850] ip6_xmit+0x5d9/0xad0
[ 9877.775970] ? ip6_finish_output2+0xaa0/0xaa0
[ 9877.777101] ? ip6_dst_check+0xdd/0x1f0
[ 9877.778201] ? __sk_dst_check+0x6d/0xd0
[ 9877.779295] ? inet6_csk_route_socket+0x29e/0x3e0
[ 9877.780437] ? inet6_csk_addr2sockaddr+0xd0/0xd0
[ 9877.781587] ? __ip6_finish_output+0x20e/0x380
[ 9877.782690] inet6_csk_xmit+0x17e/0x220
[ 9877.783787] ? inet6_csk_route_socket+0x3e0/0x3e0
[ 9877.784936] ? __asan_loadN+0xf/0x20
[ 9877.786040] ? __tcp_select_window+0x147/0x370
[ 9877.787149] __tcp_transmit_skb+0xb81/0x1730
[ 9877.788278] ? __tcp_select_window+0x370/0x370
[ 9877.789448] ? __kasan_check_write+0x14/0x20
[ 9877.790581] ? _raw_write_unlock_bh+0x30/0x30
[ 9877.791728] ? tcp_tfw_sk_write_xmit+0x2b/0x1d0
[ 9877.792968] tcp_write_xmit+0x796/0x2860
[ 9877.794075] ? iov_iter_advance+0x12b/0x5f0
[ 9877.795230] ? bpf_skops_hdr_opt_len+0x210/0x210
[ 9877.796321] __tcp_push_pending_frames+0x5f/0x1a0
[ 9877.797416] tcp_push+0x209/0x2b0
[ 9877.798506] tcp_sendmsg_locked+0x13e3/0x15e0
[ 9877.799606] ? skb_do_copy_data_nocache+0x1b0/0x1b0
[ 9877.800715] ? __kasan_check_write+0x14/0x20
[ 9877.801829] ? _raw_spin_lock_bh+0x86/0xe0
[ 9877.802936] tcp_sendmsg+0x2d/0x50
[ 9877.804031] inet6_sendmsg+0x64/0xa0
[ 9877.805124] ? inet6_ioctl+0x1b0/0x1b0
[ 9877.806255] sock_sendmsg+0x80/0xa0
[ 9877.807346] sock_write_iter+0x14c/0x200
[ 9877.808433] ? sock_sendmsg+0xa0/0xa0
[ 9877.809520] ? sk_get_meminfo+0x1a0/0x1a0
[ 9877.810623] ? ptep_set_access_flags+0xd0/0xe0
[ 9877.811713] ? pgd_free+0x130/0x130
[ 9877.812796] new_sync_write+0x3aa/0x3c0
[ 9877.813878] ? new_sync_read+0x3c0/0x3c0
[ 9877.814960] ? __cgroup_bpf_run_filter_getsockopt+0x172/0x470
[ 9877.816063] ? __seccomp_filter+0x12e/0xad0
[ 9877.817167] ? rw_verify_area+0x7c/0x150
[ 9877.818266] vfs_write+0x311/0x3d0
[ 9877.819361] ksys_write+0x138/0x170
[ 9877.820452] ? __ia32_sys_read+0x50/0x50
[ 9877.821545] ? __secure_computing+0x74/0x150
[ 9877.822640] __x64_sys_write+0x42/0x50
[ 9877.823732] do_syscall_64+0x38/0x90
[ 9877.824818] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 9877.825944] RIP: 0033:0x7f3a02fc3887
[ 9877.827050] Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77
51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 9877.829502] RSP: 002b:00007ffc0d84f728 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 9877.830777] RAX: ffffffffffffffda RBX: 00007f39ffee30a0 RCX: 00007f3a02fc3887
[ 9877.832068] RDX: 0000000000000064 RSI: 00007f39ffee3290 RDI: 000000000000000d
[ 9877.833367] RBP: 00007ffc0d84f760 R08: 0000000000000000 R09: 00007f3a02e139d8
[ 9877.834675] R10: 00007ffc0d84f714 R11: 0000000000000246 R12: 00007f39ffee3290
[ 9877.835979] R13: 00007f3a02d64048 R14: 000055fe2b1070f0 R15: 0000000000000000
[ 9877.837315] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) tcp_diag inet_diag veth tls sha256_ssse3 sha512_ssse3 btrfs blake2b_generic zstd_compress xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_counter nft_chain_nat nf_nat nf_c
onntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink overlay intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
ppdev parport_pc parport i2c_piix4 e1000 qemu_fw_cfg floppy binfmt_misc dm_multipath sch_fq_codel scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore msr ip_tables x_tables autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 multipath linear bochs_drm drm_vram_helper drm_ttm_he
lper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec aesni_intel glue_helper crypto_simd psmouse input_leds cryptd drm serio_raw
[ 9877.837435] pata_acpi
[ 9877.838240] [tdb] Close table 'cache0.tdb'
[ 9877.848914] mac_hid [last unloaded: tempesta_lib]
[ 9877.849020] CR2: ffffed2da0420f9e
@krizhanovsky The root cause is located.
The message send may be processed in a softirq on another CPU before the next statement referencing msg->ss_flags executes, so that statement can read an already freed message, triggering the KASAN report.
To reproduce this scenario, the patch below forces the send onto the next CPU and stalls the current CPU with a loop of MD5 hash computations, guaranteeing that the message is destroyed by the time it is referenced again.
diff --git a/fw/http.c b/fw/http.c
index 38750670..d6ba2995 100644
--- a/fw/http.c
+++ b/fw/http.c
@@ -997,6 +997,7 @@ tfw_http_resp_pair_free_and_put_conn(void *opaque_data)
BUG_ON(!req || !req->conn);
tfw_connection_put(req->conn);
tfw_http_resp_pair_free(req);
+ printk("--> msg_free\n");
}
/*
diff --git a/fw/sock.c b/fw/sock.c
index eaff7080..917f06f3 100644
--- a/fw/sock.c
+++ b/fw/sock.c
@@ -570,7 +570,7 @@ ss_send(struct sock *sk, struct sk_buff **skb_head, int flags)
* leakage, so we never use synchronous sending.
*/
sock_hold(sk);
- if (ss_wq_push(&sw, cpu)) {
+ if (ss_wq_push(&sw, (cpu+1)%4)) {
T_DBG2("Cannot schedule socket %p for transmission"
" (queue size %d)\n", sk,
tfw_wq_size(&per_cpu(si_wq, cpu)));
diff --git a/fw/tls.c b/fw/tls.c
index 51d9ebcb..afb7e047 100644
--- a/fw/tls.c
+++ b/fw/tls.c
@@ -787,6 +787,25 @@ tfw_tls_conn_drop(TfwConn *c)
tfw_conn_hook_call(TFW_FSM_HTTP, c, conn_drop);
}
+bool
+md5_hash(char *result, char *data, size_t len)
+{
+	struct crypto_shash *shash = crypto_alloc_shash("md5", 0, 0);
+	struct shash_desc *desc;
+
+	if (IS_ERR(shash))
+		return false;
+	desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(shash),
+		       GFP_KERNEL);
+	desc->tfm = shash;
+	crypto_shash_init(desc);
+	crypto_shash_update(desc, data, len);
+	crypto_shash_final(desc, result);
+	crypto_free_shash(shash);
+	kfree(desc);
+	return true;
+}
+
/**
* Send the @msg skbs as is - tcp_write_xmit() will care about encryption,
* but attach TLS alert message at the end of the skb list to notify the peer
@@ -798,6 +817,10 @@ tfw_tls_conn_send(TfwConn *c, TfwMsg *msg)
int r;
TlsCtx *tls = tfw_tls_context(c);
+ int i;
+ char data[100];
+ char result[100];
+
T_DBG("TLS %lu bytes (%u bytes)"
" are to be sent on conn=%pK/sk_write_xmit=%pK ready=%d\n",
msg->len, tls->io_out.msglen + TLS_HEADER_SIZE, c,
@@ -813,6 +836,12 @@ tfw_tls_conn_send(TfwConn *c, TfwMsg *msg)
if (r)
return r;
+ for (i = 0; i < 10000; i++) {
+ md5_hash(result, data, 100);
+ }
+
+ printk("---> tls_con_send ret\n");
+
/*
* We can not send the alert on conn_drop hook, because the hook
* is called on already closed socket.
// trailer.go
package main

import (
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/trailers", func(w http.ResponseWriter, req *http.Request) {
		w.Header().Set("Trailer", "foobar")
		w.Header().Add("Trailer", "X-Token1")
		w.Header().Set("Content-Type", "text/html")
		w.Header().Set("Server", "Deproxy Server")
		w.Header().Set("Foobbxxxx", "Foobbxxxx")
		w.WriteHeader(http.StatusOK)
		// Trailer values must be set after WriteHeader().
		w.Header()["Trailer:foobar"] = []string{"foobar"}
		w.Header().Set("X-Token1", "value")
	})
	log.Fatal(http.ListenAndServe(":8000", mux))
}
listen 443 proto=https,h2;
server 127.0.0.1:8000;
tls_certificate /home/kingluo/tempesta/server.crt;
tls_certificate_key /home/kingluo/tempesta/server.key;
tls_match_any_server_name;
cache 2;
cache_fulfill * *;
#server_forward_timeout 5;
cd tempesta
sudo -E TFW_CFG_PATH=$HOME/tempesta.conf ./scripts/tempesta.sh --restart
go run trailer.go &
curl https://127.0.0.1/trailers --http2 -k --raw -s -v
[ 389.799144] ---> tls_con_send ret
[ 389.961343] ---> tls_con_send ret
[ 390.008512] --> msg_free
[ 390.205745] ---> tls_con_send ret
[ 390.206767] ==================================================================
[ 390.208025] BUG: KASAN: use-after-free in _raw_spin_lock+0x6d/0xd0
[ 390.209337] Write of size 4 at addr ffff888106f758a8 by task swapper/0/0
[ 390.212064] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G W OE 5.10.35+ #1
[ 390.213460] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 390.214888] Call Trace:
[ 390.216428] <IRQ>
[ 390.217842] dump_stack+0x96/0xc4
[ 390.219261] print_address_description.constprop.0+0x21/0x220
[ 390.220702] ? _raw_spin_lock_irqsave+0x8e/0xf0
[ 390.222136] ? _raw_write_unlock_bh+0x30/0x30
[ 390.223592] ? tfw_http_sess_learn+0x12a/0x530 [tempesta_fw]
[ 390.225033] ? _raw_spin_lock+0x6d/0xd0
[ 390.226473] kasan_report.cold+0x20/0x37
[ 390.227944] ? _raw_spin_lock+0x6d/0xd0
[ 390.229374] check_memory_region+0x14d/0x1a0
[ 390.230848] __kasan_check_write+0x14/0x20
[ 390.232305] _raw_spin_lock+0x6d/0xd0
[ 390.233720] ? _raw_write_lock_irqsave+0xe0/0xe0
[ 390.235173] ? tfw_connection_send+0x59/0x60 [tempesta_fw]
[ 390.236636] tfw_cli_conn_send+0x2a/0x90 [tempesta_fw]
[ 390.238095] tfw_h2_resp_fwd+0x80/0x180 [tempesta_fw]
[ 390.239585] tfw_http_resp_cache_cb+0x8d/0x4f0 [tempesta_fw]
[ 390.241066] tfw_cache_add+0x185/0x380 [tempesta_fw]
[ 390.242549] ? tfw_http_req_cache_cb+0xbc0/0xbc0 [tempesta_fw]
[ 390.244173] ? __cache_add_node+0x1a0/0x1a0 [tempesta_fw]
[ 390.245674] ? _raw_spin_lock+0x7b/0xd0
[ 390.247578] ? frang_http_req_incomplete_hdrs_check.isra.0+0x290/0x290 [tempesta_fw]
[ 390.249148] ? tfw_http_req_zap_error+0x22/0x1c0 [tempesta_fw]
[ 390.250727] tfw_cache_process+0x491/0x620 [tempesta_fw]
[ 390.252320] ? tfw_http_req_cache_cb+0xbc0/0xbc0 [tempesta_fw]
[ 390.253911] ? tfw_wq_tasklet+0x200/0x200 [tempesta_fw]
[ 390.255514] ? tfw_gfsm_move+0xb7/0x300 [tempesta_fw]
[ 390.257106] tfw_http_resp_cache+0x2ff/0x400 [tempesta_fw]
[ 390.258703] ? tfw_http_resp_gfsm+0xf0/0xf0 [tempesta_fw]
[ 390.260368] ? tfw_gfsm_move+0xb7/0x300 [tempesta_fw]
[ 390.261963] tfw_http_msg_process_generic+0x9d9/0xbd0 [tempesta_fw]
[ 390.263500] ? tfw_http_req_process+0x1110/0x1110 [tempesta_fw]
[ 390.264980] ? __alloc_skb+0x272/0x330
[ 390.266447] ? __kasan_check_write+0x14/0x20
[ 390.267932] ? __coalesce_frag+0x136/0x370 [tempesta_fw]
[ 390.269430] tfw_http_msg_process+0x81/0xc0 [tempesta_fw]
[ 390.270940] tfw_connection_recv+0x188/0x240 [tempesta_fw]
[ 390.272461] ? tfw_connection_send+0x60/0x60 [tempesta_fw]
[ 390.274020] ss_tcp_process_data+0x45f/0x840 [tempesta_fw]
[ 390.275519] ? ss_do_close+0x810/0x810 [tempesta_fw]
[ 390.277057] ? __kasan_check_read+0x11/0x20
[ 390.278764] ? tcp_grow_window+0xce/0x330
[ 390.280317] ss_tcp_data_ready+0xa9/0x1b0 [tempesta_fw]
[ 390.281817] tcp_data_ready+0x73/0x210
[ 390.283311] tcp_data_queue+0x1085/0x1d80
[ 390.284766] ? tcp_validate_incoming+0x1fa/0xa30
[ 390.286215] ? tcp_data_ready+0x210/0x210
[ 390.287632] ? tcp_try_keep_open+0x130/0x130
[ 390.288994] ? __ip_finish_output+0x340/0x340
[ 390.290311] ? tempesta_sock_tcp_rcv+0x6a/0x80
[ 390.291646] ? recalibrate_cpu_khz+0x10/0x10
[ 390.293015] ? ktime_get+0x4f/0xb0
[ 390.294363] tcp_rcv_established+0x427/0xf50
[ 390.295697] ? tcp_parse_md5sig_option+0x20/0xb0
[ 390.297014] ? tcp_data_queue+0x1d80/0x1d80
[ 390.298320] ? __kasan_check_read+0x11/0x20
[ 390.299633] tcp_v4_do_rcv+0x25f/0x360
[ 390.300928] tcp_v4_rcv+0x1600/0x1770
[ 390.302219] ? tcp_v4_early_demux+0x2f0/0x2f0
[ 390.303555] ip_protocol_deliver_rcu+0x46/0x2e0
[ 390.304864] ip_local_deliver_finish+0xc6/0xe0
[ 390.306168] ip_local_deliver+0x1f5/0x210
[ 390.307454] ? ip_local_deliver_finish+0xe0/0xe0
[ 390.308904] ? nf_nat_inet_fn+0x2b3/0x440 [nf_nat]
[ 390.310321] ? ip_protocol_deliver_rcu+0x2e0/0x2e0
[ 390.311681] ? ip_rcv_finish_core.constprop.0+0x17c/0x890
[ 390.312965] ip_rcv_finish+0xcf/0xf0
[ 390.314238] ip_rcv+0x16d/0x180
[ 390.315534] ? ip_local_deliver+0x210/0x210
[ 390.316889] ? ip_rcv_finish_core.constprop.0+0x890/0x890
[ 390.318146] ? ttwu_do_activate+0xa4/0x170
[ 390.319399] ? ip_local_deliver+0x210/0x210
[ 390.320605] __netif_receive_skb_one_core+0x132/0x140
[ 390.321811] ? napi_gro_receive+0x260/0x260
[ 390.323008] ? __kasan_check_write+0x14/0x20
[ 390.324254] ? _raw_spin_lock+0x7b/0xd0
[ 390.325473] __netif_receive_skb+0x26/0xb0
[ 390.327116] process_backlog+0xfe/0x290
[ 390.328300] net_rx_action+0x287/0x6b0
[ 390.329475] ? napi_complete_done+0x2c0/0x2c0
[ 390.330623] ? switch_fpu_return+0x120/0x120
[ 390.331770] __do_softirq+0xfe/0x383
[ 390.332889] asm_call_irq_on_stack+0x12/0x20
[ 390.334051] </IRQ>
[ 390.335163] do_softirq_own_stack+0x3d/0x50
[ 390.336296] irq_exit_rcu+0xc7/0x120
[ 390.337410] sysvec_call_function_single+0x3a/0x90
[ 390.338525] asm_sysvec_call_function_single+0x12/0x20
[ 390.339789] RIP: 0010:native_safe_halt+0xe/0x10
[ 390.340919] Code: 49 8b 04 24 a8 08 0f 85 71 ff ff ff e9 0f ff ff ff cc cc cc cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 74 24 52 00 fb f4 <c3> 90 e9 07 00 00 00
0f 00 2d 64 24 52 00 f4 c3 cc cc 0f 1f 44 00
[ 390.343490] RSP: 0018:ffffffffb4e07dc8 EFLAGS: 00000202
[ 390.344741] RAX: ffffffffb3f36500 RBX: ffffffffb4e1b100 RCX: ffffffffb3f20d75
[ 390.346033] RDX: 00000000008841a2 RSI: 0000000000000004 RDI: ffff888108435040
[ 390.347351] RBP: ffffffffb4e07dd0 R08: 0000000000000001 R09: ffff888108435043
[ 390.348650] R10: ffffed1021086a08 R11: 0000000000000001 R12: 0000000000000000
[ 390.349937] R13: ffffffffb56e3020 R14: 0000000000000000 R15: 0000000000000000
[ 390.351228] ? __cpuidle_text_start+0x8/0x8
[ 390.352518] ? rcu_eqs_enter.constprop.0+0x85/0xa0
[ 390.353809] ? default_idle+0xe/0x20
[ 390.355194] arch_cpu_idle+0x15/0x20
[ 390.356494] default_idle_call+0x66/0x160
[ 390.357772] do_idle+0x36f/0x430
[ 390.359052] ? asm_sysvec_call_function_single+0x12/0x20
[ 390.360635] ? arch_cpu_idle_exit+0x40/0x40
[ 390.362262] cpu_startup_entry+0x20/0x30
[ 390.363638] rest_init+0xc8/0xce
[ 390.364938] arch_call_rest_init+0xe/0x1b
[ 390.366342] start_kernel+0x380/0x39d
[ 390.367675] x86_64_start_reservations+0x24/0x26
[ 390.368985] x86_64_start_kernel+0x7a/0x7e
[ 390.370282] secondary_startup_64_no_verify+0xb0/0xbb
[ 390.372969] Allocated by task 1583:
[ 390.374269] kasan_save_stack+0x23/0x50
[ 390.375568] __kasan_kmalloc.constprop.0+0xcf/0xe0
[ 390.376905] kasan_slab_alloc+0xe/0x10
[ 390.378197] kmem_cache_alloc+0xef/0x230
[ 390.379520] tfw_cli_conn_alloc+0x45/0x140 [tempesta_fw]
[ 390.380831] tfw_sock_clnt_new+0x1a9/0x450 [tempesta_fw]
[ 390.382131] ss_tcp_state_change+0x179/0x310 [tempesta_fw]
[ 390.383427] tcp_rcv_state_process+0x17b7/0x1f50
[ 390.384721] tcp_child_process+0x15c/0x2e0
[ 390.386018] tcp_v4_rcv+0x1179/0x1770
[ 390.387344] ip_protocol_deliver_rcu+0x46/0x2e0
[ 390.388632] ip_local_deliver_finish+0xc6/0xe0
[ 390.389926] ip_local_deliver+0x1f5/0x210
[ 390.391212] ip_rcv_finish+0xcf/0xf0
[ 390.392484] ip_rcv+0x16d/0x180
[ 390.393831] __netif_receive_skb_one_core+0x132/0x140
[ 390.395190] __netif_receive_skb+0x26/0xb0
[ 390.396475] process_backlog+0xfe/0x290
[ 390.397742] net_rx_action+0x287/0x6b0
[ 390.399020] __do_softirq+0xfe/0x383
[ 390.401525] Freed by task 20:
[ 390.402810] kasan_save_stack+0x23/0x50
[ 390.404059] kasan_set_track+0x20/0x30
[ 390.405297] kasan_set_free_info+0x1f/0x40
[ 390.406472] __kasan_slab_free+0x116/0x160
[ 390.407578] kasan_slab_free+0xe/0x10
[ 390.408665] slab_free_freelist_hook+0x6d/0x120
[ 390.409745] kmem_cache_free+0x105/0x410
[ 390.410869] tfw_cli_conn_free+0xe1/0x150 [tempesta_fw]
[ 390.411967] tfw_cli_conn_release+0x89/0xc0 [tempesta_fw]
[ 390.413062] tfw_tls_conn_dtor+0x255/0x310 [tempesta_fw]
[ 390.414166] tfw_sock_clnt_drop+0x211/0x220 [tempesta_fw]
[ 390.415266] ss_conn_drop_guard_exit+0x65/0xc0 [tempesta_fw]
[ 390.416359] tcp_done+0x14d/0x1e0
[ 390.417433] tcp_rcv_state_process+0x16c3/0x1f50
[ 390.418554] tcp_v4_do_rcv+0x164/0x360
[ 390.419648] tcp_v4_rcv+0x1600/0x1770
[ 390.420733] ip_protocol_deliver_rcu+0x46/0x2e0
[ 390.421817] ip_local_deliver_finish+0xc6/0xe0
[ 390.422921] ip_local_deliver+0x1f5/0x210
[ 390.423990] ip_rcv_finish+0xcf/0xf0
[ 390.425053] ip_rcv+0x16d/0x180
[ 390.426112] __netif_receive_skb_one_core+0x132/0x140
[ 390.427220] __netif_receive_skb+0x26/0xb0
[ 390.428323] process_backlog+0xfe/0x290
[ 390.429400] net_rx_action+0x287/0x6b0
[ 390.430464] __do_softirq+0xfe/0x383
[ 390.432576] The buggy address belongs to the object at ffff888106f75440
which belongs to the cache tfw_h2_conn_cache of size 2568
[ 390.434882] The buggy address is located 1128 bytes inside of
2568-byte region [ffff888106f75440, ffff888106f75e48)
[ 390.437144] The buggy address belongs to the page:
[ 390.438272] page:0000000019f7ee9c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x106f70
[ 390.439477] head:0000000019f7ee9c order:3 compound_mapcount:0 compound_pincount:0
[ 390.440693] flags: 0x17ffffc0010200(slab|head)
[ 390.441888] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888101e72f00
[ 390.443145] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
[ 390.444437] page dumped because: kasan: bad access detected
[ 390.446958] Memory state around the buggy address:
[ 390.448225] ffff888106f75780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 390.449578] ffff888106f75800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 390.450873] >ffff888106f75880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 390.452152] ^
[ 390.453469] ffff888106f75900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 390.454785] ffff888106f75980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 390.456096] ==================================================================
[ 390.457415] Disabling lock debugging due to kernel taint
Save msg->ss_flags for later access, in case of msg deallocation.
```diff
diff --git a/fw/tls.c b/fw/tls.c
index 51d9ebcb..8d1d484c 100644
--- a/fw/tls.c
+++ b/fw/tls.c
@@ -797,6 +797,7 @@ tfw_tls_conn_send(TfwConn *c, TfwMsg *msg)
 {
 	int r;
 	TlsCtx *tls = tfw_tls_context(c);
+	int ss_flags = msg->ss_flags;
 
 	T_DBG("TLS %lu bytes (%u bytes)"
 	      " are to be sent on conn=%pK/sk_write_xmit=%pK ready=%d\n",
@@ -817,8 +818,8 @@ tfw_tls_conn_send(TfwConn *c, TfwMsg *msg)
 	 * We can not send the alert on conn_drop hook, because the hook
 	 * is called on already closed socket.
 	 */
-	if (msg->ss_flags & SS_F_CONN_CLOSE) {
-		int close_type = msg->ss_flags & __SS_F_FORCE ?
+	if (ss_flags & SS_F_CONN_CLOSE) {
+		int close_type = ss_flags & __SS_F_FORCE ?
 				 TTLS_F_ST_CLOSE : TTLS_F_ST_SHUTDOWN;
 
 		spin_lock(&tls->lock);
```
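The fix boils down to one rule: copy any field you still need out of `msg` before `ss_send()` can hand the message to another context that may free it. A minimal userspace sketch of that pattern (all types and helper names here are illustrative, not Tempesta's real API):

```c
#include <stdlib.h>

/* Illustrative stand-in for TfwMsg; only the flags field matters here. */
struct fake_msg {
	int ss_flags;
};

enum { SS_F_CONN_CLOSE = 0x1 };

/*
 * Models ss_send(): on success the message is consumed by another
 * context and may be freed at any point afterwards.
 */
static int
fake_ss_send(struct fake_msg *msg)
{
	free(msg);
	return 0;
}

/*
 * Returns 1 if a close-notify alert should be sent. The flags are
 * copied to the stack first, so the (possibly freed) message is never
 * dereferenced after fake_ss_send().
 */
static int
send_and_check_close(struct fake_msg *msg)
{
	int ss_flags = msg->ss_flags;	/* copy before the message is consumed */

	if (fake_ss_send(msg))
		return -1;

	return !!(ss_flags & SS_F_CONN_CLOSE);
}
```

Reading `msg->ss_flags` instead of the stack copy after the send is exactly the use-after-free KASAN reported above.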
@krizhanovsky I'm afraid the story isn't over yet. Since we've assumed that if `ss_send` returns without error, the message is valid to access later (either from `tfw_tls_conn_send` or its caller, e.g. `tfw_cli_conn_send`), then we have another problem: `cli_conn` may also be invalid if:

1. `msg` is sent successfully and gets released like in this issue, which will reduce the `cli_conn` counter;
2. `msg` is the last request for this `cli_conn`, and the connection gets closed by `keepalive_timer_cb` or TCP FIN/RST from the peer after 1;
3. `cli_conn` is accessed again.

If this related bug is real, then we need to do a `tfw_connection_get` before the `msg` goes into `ss_send`, e.g.
```c
static int
tfw_tls_conn_send(TfwConn *c, TfwMsg *msg)
{
	int r;
	TlsCtx *tls = tfw_tls_context(c);

	T_DBG("TLS %lu bytes (%u bytes)"
	      " are to be sent on conn=%pK/sk_write_xmit=%pK ready=%d\n",
	      msg->len, tls->io_out.msglen + TLS_HEADER_SIZE, c,
	      c->sk->sk_write_xmit, ttls_xfrm_ready(tls));

	if (ttls_xfrm_ready(tls)) {
		msg->ss_flags |= SS_SKB_TYPE2F(TTLS_MSG_APPLICATION_DATA) |
				 SS_F_ENCRYPT;
	}

	// <---------------- must guard `cli_conn` here ---------------->
	r = ss_send(c->sk, &msg->skb_head,
		    msg->ss_flags & ~SS_F_CLOSE_FORCE);
	if (r)
		return r;

	/*
	 * We can not send the alert on conn_drop hook, because the hook
	 * is called on already closed socket.
	 */
	if (msg->ss_flags & SS_F_CONN_CLOSE) {
		int close_type = msg->ss_flags & __SS_F_FORCE ?
				 TTLS_F_ST_CLOSE : TTLS_F_ST_SHUTDOWN;

		spin_lock(&tls->lock); // <------------- `cli_conn` access
		r = ttls_close_notify(tls, close_type);
		spin_unlock(&tls->lock);
	}

	return r;
}
```
as well as in the callers that need to access `cli_conn` after `tfw_tls_conn_send`, e.g. `tfw_cli_conn_send`:
```c
int
tfw_cli_conn_send(TfwCliConn *cli_conn, TfwMsg *msg)
{
	int r;

	// <---------------- must guard `cli_conn` here ---------------->
	r = tfw_connection_send((TfwConn *)cli_conn, msg);

	/*
	 * The lock is needed because the timer deletion was moved from
	 * release() to drop(). While release() is called when there are no
	 * other users, there is no such luxury with drop() and the
	 * connection can still be used due to lingering threads.
	 */
	spin_lock(&cli_conn->timer_lock); // <------------- `cli_conn` access
	if (timer_pending(&cli_conn->timer))
		mod_timer(&cli_conn->timer,
			  jiffies +
			  msecs_to_jiffies((long)tfw_cli_cfg_ka_timeout * 1000));
	spin_unlock(&cli_conn->timer_lock);
```
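The proposed guard can be sketched in a small userspace model: take an extra reference on the connection before the send, so the post-send access (the timer re-arm above) cannot hit freed memory. `conn_get`/`conn_put` only model `tfw_connection_get`/the matching put; every name here is an illustrative assumption, not the real Tempesta code:

```c
#include <assert.h>

/* Illustrative connection with a refcounter; `freed` marks destruction. */
struct fake_conn {
	int refcnt;
	int freed;
};

static void
conn_get(struct fake_conn *c)
{
	c->refcnt++;
}

static void
conn_put(struct fake_conn *c)
{
	if (--c->refcnt == 0)
		c->freed = 1;	/* stands in for kmem_cache_free() */
}

/*
 * Models the send path: releasing the message drops the reference it
 * holds on its connection, which may be the last one.
 */
static int
fake_send(struct fake_conn *c)
{
	conn_put(c);
	return 0;
}

static int
guarded_send(struct fake_conn *c)
{
	int r;

	conn_get(c);		/* guard: keep c alive across the send */
	r = fake_send(c);
	/* safe post-send access, e.g. the timer re-arm under timer_lock */
	assert(!c->freed);
	conn_put(c);		/* drop the guard reference */
	return r;
}
```

Without the `conn_get`/`conn_put` pair, the last message's release would free the connection mid-function, which is the `c`-type bug described above.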
Functions in question:

```
tfw_h2_resp_fwd (m) ----> tfw_cli_conn_send (c) -> tfw_tls_conn_send (m/c)
__tfw_http_resp_fw (m) ---/
```

- `m` denotes that the function uses `msg` after a successful `ss_send`;
- `c` denotes that the function uses `cli_conn` after a successful `ss_send`.

As discussed, these derivative issues will be fixed at https://github.com/tempesta-tech/tempesta/pull/2182.
I looked at the function `tfw_cli_conn_send` and think we have no problem with it:

- `tfw_cli_conn_send` is called from `tfw_h2_resp_fwd`, where we take a reference to the connection before calling `tfw_cli_conn_send`;
- `tfw_cli_conn_send` is called from `__tfw_http_resp_fwd`, which is called only for HTTP/1, where we have no skb destructors, so the request always keeps the connection alive.

We may have a problem with `tfw_tls_conn_send`, because this function is called from `tfw_connection_send` in many places. What happens if we receive a FIN while we call this function? It seems we rely on the fact that we process the FIN on the same CPU, under the socket lock, as the `tfw_connection_send` call (@krizhanovsky please also check this carefully). The only place where we don't is `tfw_h2_error_resp`, which we call when we fail to process a RESPONSE, because we process the response from the backend on another CPU under the lock of the server socket!
@EvgeniiMekhanik @kingluo There are only 4 places where `tfw_connection_send()` is called:

```
$ grep 'tfw_connection_send(' *.[ch]
connection.c:tfw_connection_send(TfwConn *conn, TfwMsg *msg)
connection.h:int tfw_connection_send(TfwConn *conn, TfwMsg *msg);
http.c: if (!(r = tfw_connection_send((TfwConn *)srv_conn, (TfwMsg *)req)))
http.c: * requests in @fwd_queue. Typically tfw_connection_send() or its pure
http_frame.c: if ((r = tfw_connection_send((TfwConn *)conn, &msg)))
sock_clnt.c: r = tfw_connection_send((TfwConn *)cli_conn, msg);
websocket.c: if ((r = tfw_connection_send(conn->pair, &msg))) {
```
and it seems we really have an issue, e.g. in `__tfw_http_resp_fwd()`, which calls `tfw_http_resp_pair_free()` after `tfw_cli_conn_send()`; in my understanding there could be a double free.
I'd propose to fix this with either:

1. reference counting for `TfwMsg`: once `ss_send()` is called, i.e. the message is passed to another context, the refcounter should be increased;
2. never accessing the message after `ss_send()`.

It seems the issue was wrongly closed with the PR merge, so I reopen it.
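The first option could look roughly like the following userspace sketch: pin the message with an extra reference before `ss_send()` hands it to another context, and drop the reference once the caller is done reading message fields. All types and helpers (`ref_msg`, `msg_put`, etc.) are illustrative assumptions, not the real `TfwMsg` API:

```c
#include <stdlib.h>

/* Illustrative refcounted message. */
struct ref_msg {
	int refcnt;
	int ss_flags;
};

static struct ref_msg *
msg_alloc(int flags)
{
	struct ref_msg *m = calloc(1, sizeof(*m));

	m->refcnt = 1;
	m->ss_flags = flags;
	return m;
}

static void
msg_put(struct ref_msg *m)
{
	if (--m->refcnt == 0)
		free(m);
}

/* Models ss_send(): the other context eventually drops its reference. */
static int
fake_ss_send(struct ref_msg *m)
{
	msg_put(m);
	return 0;
}

/* Returns the flags read safely after the send, or -1 on error. */
static int
send_and_read_flags(struct ref_msg *m)
{
	int flags;

	m->refcnt++;			/* pin the message for this caller */
	if (fake_ss_send(m)) {
		msg_put(m);
		return -1;
	}
	flags = m->ss_flags;		/* still valid: we hold a reference */
	msg_put(m);
	return flags;
}
```

Option 2 needs no new machinery, only the discipline already applied in the `ss_flags` patch above: copy what you need before the send.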
@krizhanovsky

> and it seems we really have an issue, e.g. in __tfw_http_resp_fwd(), which calls tfw_http_resp_pair_free() after tfw_cli_conn_send() - in my understanding there could be double freeing.

No, it releases `msg` only when `tfw_cli_conn_send` fails, which is fine. As I said, the only bug in `msg` usage is referencing `msg` after `ss_send` succeeds (and the callers' behavior in handling `msg`).
For HTTP/1 we don't destroy the message in another context; we do it only for HTTP/2.
Let me clarify my opinion again, because our discussion seems to have become messy.

We have two sorts of bugs:

- `m`: `msg` use-after-free;
- `c`: if `msg` gets freed, it decreases the refcount of its associated connection too, so chances are that the connection already gets freed before we use it again; please see the conditions above: https://github.com/tempesta-tech/tempesta/issues/2169#issuecomment-2249686288

All in all, we have a graph here:

```
tfw_h2_resp_fwd (m) ----> tfw_cli_conn_send (c?) -> tfw_tls_conn_send (m/c)
__tfw_http_resp_fw (m) ---/
```

So I suggest all `msg` related bugs are fixed here, while `cli_conn` related bugs are fixed in https://github.com/tempesta-tech/tempesta/pull/2182.

- `c?`: https://github.com/tempesta-tech/tempesta/pull/2182#issuecomment-2255705530
@krizhanovsky yes, as the graph shows, there are still two suspicious bugs to fix, but since they're side issues (in other words, similar issues) beyond the scope of this one, I closed this issue via the PR. I'll make sure to fix them later if you think it's necessary.
```c
void
tfw_h2_resp_fwd(TfwHttpResp *resp)
{
	bool resp_in_xmit =
		(TFW_SKB_CB(resp->msg.skb_head)->opaque_data == resp);
	TfwHttpReq *req = resp->req;
	TfwConn *conn = req->conn;

	tfw_connection_get(conn);
	do_access_log(resp);

	if (tfw_cli_conn_send((TfwCliConn *)conn, (TfwMsg *)resp)) {
		T_DBG("%s: cannot send data to client via HTTP/2\n", __func__);
		TFW_INC_STAT_BH(serv.msgs_otherr);
		tfw_connection_close(conn, true);
		/* We can't send response, so we should free it here. */
		resp_in_xmit = false;
	} else {
		// <----------------------- refer to `resp` after succeed
		TFW_INC_STAT_BH(serv.msgs_forwarded);
		tfw_inc_global_hm_stats(resp->status);
	}
```
```c
static void
__tfw_http_resp_fwd(TfwCliConn *cli_conn, struct list_head *ret_queue)
{
	TfwHttpReq *req, *tmp;

	list_for_each_entry_safe(req, tmp, ret_queue, msg.seq_list) {
		BUG_ON(!req->resp);
		tfw_http_resp_init_ss_flags(req->resp);
		if (tfw_cli_conn_send(cli_conn, (TfwMsg *)req->resp)) {
			tfw_connection_close((TfwConn *)cli_conn, true);
			return;
		}
		// <----------------------- refer to `req` and `resp` after succeed
		TFW_INC_STAT_BH(serv.msgs_forwarded);
		tfw_inc_global_hm_stats(req->resp->status);
		list_del_init(&req->msg.seq_list);
		tfw_http_resp_pair_free(req);
	}
}
```
@kingluo OK, thank you for the explanation. Did I understand correctly that you propose to close the issue since the `msg` bug is fixed in #2180, and to leave the connection fixes for #2182? Also, since you approved #2182, the connection issue is gone in that PR as well? If everything is correct, then feel free to close the issue.
> @kingluo OK, thank you for the explanation. Did I understand correctly that you propose to close the issue since the msg bug is fixed in #2180 and leave connections fixes for #2182. Also since you approved #2182, the connections issue is also gone in the PR? If everything correct, then feel free to close the issue.

Yes, no `cli_conn` issues will be fixed here, but the above side issues of `msg` can be fixed here along the way, if necessary.
Basically, if we have some problem, then we must have an appropriate issue for it. #2180 is merged, and we can leave this issue open for the rest of the side bugs.
Run Tempesta FW in a 4-CPU VM with 8GB RAM at version #2162. Run

```
taskset --cpu-list 6,7,8,9,10,11,12,13 h2load -n 1000000 -c 1024 -t 8 -m 50 https://tempesta-tech.com
```

and on 60% of requests I got the KASAN report. The system hangs.

The kernel (5.10.35 as of 09f6e2a043d099627d0617dbe34c2d1521540927) config is in the attachment: linux-config.txt

`tfw_tls_conn_send+0x66` corresponds to

Our website LXC container is used as a backend and the Tempesta FW config is