Closed EvgeniiMekhanik closed 2 months ago
https://github.com/tempesta-tech/tempesta/tree/MekhanikEvgenii/2183-dbg this is a branch to reproduce crash with debug info. It seems that there are a lot of bugs which can be reproduces by t_frang.test_config.TestOverridingInheritance.test_default_not_override_http_resp_code_block_2 test.
[ 1646.358134] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 1646.359559] tfw_sock_srv_connect_drop: conn ffffa067fb2f9260 refcnt before put -1073741824 | 1 | is_client 0
[ 1646.360311] #PF: supervisor read access in kernel mode
[ 1646.423105] #PF: error_code(0x0000) - not-present page
[ 1646.423524] PGD 0 P4D 0
[ 1646.423738] Oops: 0000 [#1] SMP NOPTI
[ 1646.424040] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE 5.10.35+ #298
[ 1646.424682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 1646.425473] RIP: 0010:tfw_http_cli_error_resp_and_log.cold+0x1f5/0x288 [tempesta_fw]
[ 1646.426102] Code: 00 00 00 44 89 e2 0f b6 4d 87 8b 75 8c 4c 89 ff 83 f2 01 0f b6 d2 e8 c6 5e f7 ff 4d 8b a6 a0 00 00 00 65 44 8b 05 a7 e6 e9 3e <45> 8b 4c 24 08 41 81 e1 00 01 00 00 4d 85 e4 74 66 41 8b 4c 24 40
[ 1646.427617] RSP: 0018:ffffb5cfc0003950 EFLAGS: 00010246
[ 1646.428044] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1646.428622] RDX: 0000000000000000 RSI: ffffa068b7c18a40 RDI: ffffa068b7c18a40
[ 1646.429201] RBP: ffffb5cfc00039e0 R08: 0000000000000000 R09: 6e6f635f7672735f
[ 1646.429783] R10: 6f72645f7463656e R11: 206e6e6f63203a70 R12: 0000000000000000
[ 1646.430360] R13: 0000000000000193 R14: ffffa067fb2df020 R15: ffffa067fdd61c40
[ 1646.430959] FS: 0000000000000000(0000) GS:ffffa068b7c00000(0000) knlGS:0000000000000000
[ 1646.431615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1646.432082] CR2: 0000000000000008 CR3: 00000001aa21e000 CR4: 0000000000750ef0
[ 1646.432663] PKRU: 55555554
[ 1646.432890] Call Trace:
[ 1646.433096] <IRQ>
[ 1646.433275] tfw_http_resp_cache+0x155/0x1c0 [tempesta_fw]
[ 1646.433729] tfw_http_msg_process_generic+0x54d/0x6e0 [tempesta_fw]
[ 1646.434247] ? __coalesce_frag+0x99/0x150 [tempesta_fw]
[ 1646.434686] tfw_http_msg_process+0x3e/0x60 [tempesta_fw]
[ 1646.435130] tfw_connection_recv+0xbb/0x140 [tempesta_fw]
[ 1646.435582] ? ss_skb_unroll+0xce/0x280 [tempesta_fw]
[ 1646.436001] ss_tcp_process_data+0x20a/0x4b0 [tempesta_fw]
[ 1646.436469] ss_tcp_data_ready+0x57/0x140 [tempesta_fw]
[ 1646.436900] tcp_data_ready+0x2b/0xd0
[ 1646.437204] tcp_data_queue+0x805/0xe50
[ 1646.437521] tcp_rcv_established+0x254/0x910
[ 1646.437872] tcp_v4_do_rcv+0x140/0x200
[ 1646.438186] tcp_v4_rcv+0xcd0/0xe20
[ 1646.438488] ip_protocol_deliver_rcu+0x44/0x230
[ 1646.438859] ip_local_deliver_finish+0x48/0x60
[ 1646.439229] ip_local_deliver+0xf8/0x110
[ 1646.439553] ? ip_protocol_deliver_rcu+0x230/0x230
[ 1646.439945] ip_rcv_finish+0x87/0xa0
[ 1646.440240] ip_rcv+0xce/0xe0
[ 1646.440492] ? ip_rcv_finish_core.constprop.0+0x470/0x470
[ 1646.440935] __netif_receive_skb_one_core+0x86/0xa0
[ 1646.441334] __netif_receive_skb+0x18/0x60
[ 1646.441676] process_backlog+0x9e/0x170
[ 1646.441992] net_rx_action+0x13b/0x430
[ 1646.442310] ? try_to_wake_up+0x218/0x6c0
[ 1646.442662] __do_softirq+0xe3/0x340
[ 1646.442958] asm_call_irq_on_stack+0xf/0x20
[ 1646.443301] </IRQ>
[ 1646.443480] do_softirq_own_stack+0x3d/0x50
[ 1646.443824] irq_exit_rcu+0xa2/0xe0
[ 1646.444114] sysvec_call_function_single+0x3d/0x90
[ 1646.444510] asm_sysvec_call_function_single+0x12/0x20
[ 1646.444930] RIP: 0010:native_safe_halt+0xe/0x10
[ 1646.445302] Code: 39 ff ff ff 4c 89 ee 48 c7 c7 40 ba a5 9d e8 d9 6a 93 ff e9 01 ff ff ff cc cc cc cc e9 07 00 00 00 0f 00 2d 16 ac 49 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 06 ac 49 00 f4 c3 cc cc 0f 1f 44 00
[ 1646.446811] RSP: 0018:ffffffff9da03df0 EFLAGS: 00000206
[ 1646.447243] RAX: ffffffff9cb70bd0 RBX: 0000000000000000 RCX: ffffa068b7c2cdc0
[ 1646.447820] RDX: 000000000048f026 RSI: ffffffff9da03d90 RDI: 0000017f529f6d18
[ 1646.448397] RBP: ffffffff9da03df8 R08: 0000000000000001 R09: 0000017f4fb27df7
[ 1646.448993] R10: 0000000000000000 R11: 000000000002c040 R12: ffffffff9da1bb80
[ 1646.449570] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000092
[ 1646.450148] ? __sched_text_end+0x4/0x4
[ 1646.450476] ? default_idle+0xe/0x20
[ 1646.450772] arch_cpu_idle+0x15/0x20
[ 1646.451067] default_idle_call+0x3d/0xc0
[ 1646.451390] do_idle+0x215/0x2a0
[ 1646.451658] cpu_startup_entry+0x20/0x30
[ 1646.451981] rest_init+0xcc/0xee
[ 1646.452250] ? acpi_enable_subsystem+0x20b/0x213
[ 1646.452634] arch_call_rest_init+0xe/0x1b
[ 1646.452964] start_kernel+0x8a5/0x8cb
[ 1646.453266] x86_64_start_reservations+0x24/0x26
[ 1646.453643] x86_64_start_kernel+0xf0/0xf7
[ 1646.453981] secondary_startup_64_no_verify+0xc2/0xcb
[ 1646.454394] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) tls sha256_ssse3 sha512_ssse3 md4 cmac nls_utf8 cifs libarc4 fscache libdes veth nft_masq nvme_tcp nvme_fabrics nvme_core vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay binfmt_misc nls_iso8859_1 virtio_gpu virtio_dma_buf kvm_intel drm_kms_helper kvm cec rc_core fb_sys_fops syscopyarea crct10dif_pclmul ghash_clmulni_intel sysfillrect aesni_intel sysimgblt crypto_simd joydev cryptd glue_helper input_leds mac_hid sch_fq_codel serio_raw qemu_fw_cfg msr parport_pc ppdev lp drm parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c psmouse e1000 crc32_pclmul floppy pata_acpi
[ 1646.454446] i2c_piix4 [last unloaded: tempesta_lib]
[ 1646.461850] CR2: 0000000000000008
[ 1646.462127] ---[ end trace 63500d0fcf9291fc ]---
[ 1646.462524] RIP: 0010:tfw_http_cli_error_resp_and_log.cold+0x1f5/0x288 [tempesta_fw]
[ 1646.463152] Code: 00 00 00 44 89 e2 0f b6 4d 87 8b 75 8c 4c 89 ff 83 f2 01 0f b6 d2 e8 c6 5e f7 ff 4d 8b a6 a0 00 00 00 65 44 8b 05 a7 e6 e9 3e <45> 8b 4c 24 08 41 81 e1 00 01 00 00 4d 85 e4 74 66 41 8b 4c 24 40
[ 1646.464655] RSP: 0018:ffffb5cfc0003950 EFLAGS: 00010246
[ 1646.465083] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1646.465661] RDX: 0000000000000000 RSI: ffffa068b7c18a40 RDI: ffffa068b7c18a40
[ 1646.466239] RBP: ffffb5cfc00039e0 R08: 0000000000000000 R09: 6e6f635f7672735f
[ 1646.466844] R10: 6f72645f7463656e R11: 206e6e6f63203a70 R12: 0000000000000000
[ 1646.467427] R13: 0000000000000193 R14: ffffa067fb2df020 R15: ffffa067fdd61c40
[ 1646.468006] FS: 0000000000000000(0000) GS:ffffa068b7c00000(0000) knlGS:0000000000000000
[ 1646.468660] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1646.469128] CR2: 0000000000000008 CR3: 00000001aa21e000 CR4: 0000000000750ef0
[ 1646.469707] PKRU: 55555554
It seems we have a problem at least at three places.
Also I catch
[ 922.766862] BUG tfw_h2_conn_cache (Tainted: G B OE ): Objects remaining in tfw_h2_conn_cache on __kmem_cache_shutdown()
[ 922.769314] -----------------------------------------------------------------------------
[ 922.769314]
[ 922.770810] INFO: Slab 0x00000000224d306d objects=12 used=1 fp=0x0000000022150e17 flags=0x17ffffc0010200
[ 922.772057] CPU: 1 PID: 73625 Comm: rmmod Tainted: G B OE 5.10.35+ #298
[ 922.772902] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 922.773862] Call Trace:
[ 922.774111] dump_stack+0x70/0x8b
[ 922.774409] slab_err+0xb7/0xdc
[ 922.774667] ? slub_cpu_dead+0x120/0x120
[ 922.774979] ? slub_cpu_dead+0x120/0x120
[ 922.775308] ? on_each_cpu_cond_mask+0x48/0x90
[ 922.775675] __kmem_cache_shutdown+0x165/0x330
[ 922.776038] kmem_cache_destroy+0x59/0x110
[ 922.776380] tfw_sock_clnt_exit+0x21/0x40 [tempesta_fw]
[ 922.776818] tfw_exit+0xd7/0xea [tempesta_fw]
[ 922.777161] __do_sys_delete_module.constprop.0+0x181/0x2f0
[ 922.777605] ? exit_to_user_mode_prepare+0x35/0x150
[ 922.777985] __x64_sys_delete_module+0x12/0x20
[ 922.778350] do_syscall_64+0x38/0x90
[ 922.778632] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 922.779039] RIP: 0033:0x7fda1e0e8aeb
[ 922.779321] Code: 73 01 c3 48 8b 0d 45 33 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 15 33 0f 00 f7 d8 64 89 01 48
[ 922.780906] RSP: 002b:00007fff88004288 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 922.781555] RAX: ffffffffffffffda RBX: 000055cdf6506770 RCX: 00007fda1e0e8aeb
[ 922.782175] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055cdf65067d8
[ 922.782773] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 922.783374] R10: 00007fda1e180ac0 R11: 0000000000000206 R12: 00007fff880044e0
[ 922.783914] R13: 000055cdf65062a0 R14: 00007fff8800669e R15: 000055cdf6506770
[ 922.784475] INFO: Object 0x00000000733c6f5e @offset=17976
[ 922.784860] kmem_cache_destroy tfw_h2_conn_cache: Slab cache still has objects
[ 922.785652] CPU: 1 PID: 73625 Comm: rmmod Tainted: G B OE 5.10.35+ #298
[ 922.786327] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 922.787036] Call Trace:
[ 922.787280] dump_stack+0x70/0x8b
[ 922.787560] kmem_cache_destroy+0x107/0x110
[ 922.787946] tfw_sock_clnt_exit+0x21/0x40 [tempesta_fw]
[ 922.788427] tfw_exit+0xd7/0xea [tempesta_fw]
[ 922.788792] __do_sys_delete_module.constprop.0+0x181/0x2f0
[ 922.789286] ? exit_to_user_mode_prepare+0x35/0x150
[ 922.789696] __x64_sys_delete_module+0x12/0x20
[ 922.790051] do_syscall_64+0x38/0x90
[ 922.790359] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 922.790762] RIP: 0033:0x7fda1e0e8aeb
[ 922.791073] Code: 73 01 c3 48 8b 0d 45 33 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 15 33 0f 00 f7 d8 64 89 01 48
[ 922.792700] RSP: 002b:00007fff88004288 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 922.793361] RAX: ffffffffffffffda RBX: 000055cdf6506770 RCX: 00007fda1e0e8aeb
[ 922.793955] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055cdf65067d8
[ 922.794595] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 922.795225] R10: 00007fda1e180ac0 R11: 0000000000000206 R12: 00007fff880044e0
[ 922.795874] R13: 000055cdf65062a0 R14: 00007fff8800669e R15: 000055cdf6506770
[ 922.850320] [tdb] Shutdown Tempesta DB
During my work on https://github.com/tempesta-tech/tempesta/issues/2178 I found a new BUG.
kernel BUG at /home/evgeny/workdir/tempesta/fw/connection.h:487 [ 832.096108] invalid opcode: 0000 [#1] SMP NOPTI [ 832.096109] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G OE 5.10.35+ #298 [ 832.096110] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 832.096124] RIP: 0010:tfw_sock_srv_connect_drop+0x1a3/0x1e0 [tempesta_fw]
Step to reproduce: run test from https://github.com/tempesta-tech/tempesta-test/commit/1f7290ebab8fcfc7701583bad7c90706bb9345ee http_resp_code_block_2 a lot of time. May be it is necessary to add a lot of debug message to catch this bug.