oracle / linux-uek

Oracle Linux UEK: Unbreakable Enterprise Kernel
https://blogs.oracle.com/linuxkernel

WARNING at net/core/stream.c since 5.15.0-100.96.32.el9uek.x86_64 and 5.4.17-2136.318.7.1.el7uek.x86_64 #15

Closed okmikel closed 9 months ago

okmikel commented 1 year ago

I have two machines, one running OL7 and one running OL9. Both act as gateways to the Internet and perform IPv4 and IPv6 NAT.

On both machines the kernel has been logging WARNINGs every few seconds since 5.15.0-100.96.32.el9uek.x86_64 (on OL9) and 5.4.17-2136.318.7.1.el7uek.x86_64 (on OL7).

On other machines, which do not perform NAT, I haven't seen these WARNINGs. Reverting to kernels 5.15.0-8.91.4.1.el9uek.x86_64 on OL9 and 5.4.17-2136.317.5.3.el7uek.x86_64 on OL7 fixes the problem.

Here are the WARNINGs from OL7:

Apr 18 08:58:32 stahl2 kernel: [ 236.276791] ------------[ cut here ]------------
Apr 18 08:58:32 stahl2 kernel: [ 236.278671] WARNING: CPU: 5 PID: 0 at net/core/stream.c:212 sk_stream_kill_queues+0xc5/0xd2
Apr 18 08:58:32 stahl2 kernel: [ 236.280615] Modules linked in: xt_multiport sctp pppoe tun pppox ppp_synctty ppp_async ppp_generic slhc ip6table_nat nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6table_mangle ip6_tables xt_nat iptable_nat nf_nat nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 xt_pkttype xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter xt_TCPMSS iptable_mangle iptable_raw bochs_drm drm_vram_helper ttm drm_kms_helper drm nfsd syscopyarea pcspkr sysfillrect joydev sysimgblt qemu_fw_cfg virtio_balloon i6300esb i2c_piix4 fb_sys_fops auth_rpcgss nfs_acl lockd grace sunrpc nfs_ssc ip_tables ext4 mbcache jbd2 virtio_net net_failover virtio_blk failover ata_generic pata_acpi ata_piix libata virtio_pci serio_raw virtio_pci_legacy_dev virtio_pci_modern_dev
Apr 18 08:58:32 stahl2 kernel: [ 236.289602] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.4.17-2136.318.7.1.el7uek.x86_64 #2
Apr 18 08:58:32 stahl2 kernel: [ 236.290896] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
Apr 18 08:58:32 stahl2 kernel: [ 236.292308] RIP: 0010:sk_stream_kill_queues+0xc5/0xd2
Apr 18 08:58:32 stahl2 kernel: [ 236.293589] Code: 0e 48 89 df e8 9c f7 fe ff 8b b3 00 01 00 00 8b 83 48 01 00 00 85 c0 75 15 85 f6 75 0d 5b 41 5c 5d c3 cc cc cc cc 0f 0b eb bb <0f> 0b eb ef 0f 0b 0f 1f 44 00 00 eb e2 0f 1f 40 00 66 2e 0f 1f 84
Apr 18 08:58:32 stahl2 kernel: [ 236.296487] RSP: 0018:ffffb38b00180968 EFLAGS: 00010206
Apr 18 08:58:32 stahl2 kernel: [ 236.298069] RAX: 0000000000000000 RBX: ffff96dce98bc600 RCX: 0000000000000007
Apr 18 08:58:32 stahl2 kernel: [ 236.299451] RDX: 0000000000000001 RSI: 0000000000000d00 RDI: 0000000000000246
Apr 18 08:58:32 stahl2 kernel: [ 236.300986] RBP: ffffb38b00180978 R08: ffff96dcefd217a4 R09: 0000000000000004
Apr 18 08:58:32 stahl2 kernel: [ 236.302365] R10: 0000000000000000 R11: 00000000c7684925 R12: ffff96dce98bc6d0
Apr 18 08:58:32 stahl2 kernel: [ 236.303829] R13: ffff96dcdf0de476 R14: 0000000000000000 R15: ffff96dcdf0de476
Apr 18 08:58:32 stahl2 kernel: [ 236.305298] FS: 0000000000000000(0000) GS:ffff96dcf9d40000(0000) knlGS:0000000000000000
Apr 18 08:58:32 stahl2 kernel: [ 236.306850] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 18 08:58:32 stahl2 kernel: [ 236.308471] CR2: 00007f27f727c000 CR3: 00000001a5adc000 CR4: 00000000000006e0
Apr 18 08:58:32 stahl2 kernel: [ 236.309943] Call Trace:
Apr 18 08:58:32 stahl2 kernel: [ 236.311350]
Apr 18 08:58:32 stahl2 kernel: [ 236.312821] inet_csk_destroy_sock+0x59/0x13f
Apr 18 08:58:32 stahl2 kernel: [ 236.314321] tcp_done+0x8a/0x111
Apr 18 08:58:32 stahl2 kernel: [ 236.315816] tcp_rcv_state_process+0x93b/0xdcc
Apr 18 08:58:32 stahl2 kernel: [ 236.317407] ? skb_clone+0x2e/0x125
Apr 18 08:58:32 stahl2 kernel: [ 236.318919] tcp_v6_do_rcv+0x1b9/0x430
Apr 18 08:58:32 stahl2 kernel: [ 236.320375] tcp_v6_rcv+0x9ba/0xa25
Apr 18 08:58:32 stahl2 kernel: [ 236.321817] ? nf_confirm+0xb6/0x110 [nf_conntrack]
Apr 18 08:58:32 stahl2 kernel: [ 236.323301] ip6_protocol_deliver_rcu+0xd2/0x4c9
Apr 18 08:58:32 stahl2 kernel: [ 236.324716] ip6_input+0x41/0xb8
Apr 18 08:58:32 stahl2 kernel: [ 236.326090] ? ip6_protocol_deliver_rcu+0x4d0/0x4c9
Apr 18 08:58:32 stahl2 kernel: [ 236.327403] ip6_sublist_rcv_finish+0x59/0x70
Apr 18 08:58:32 stahl2 kernel: [ 236.328714] ip6_sublist_rcv+0x14a/0x1c9
Apr 18 08:58:32 stahl2 kernel: [ 236.330046] ? ip6_sublist_rcv+0x1d0/0x1c9
Apr 18 08:58:32 stahl2 kernel: [ 236.331355] ipv6_list_rcv+0x146/0x16d
Apr 18 08:58:32 stahl2 kernel: [ 236.332648] netif_receive_skb_list_core+0x1ad/0x2e5
Apr 18 08:58:32 stahl2 kernel: [ 236.333969] netif_receive_skb_list_internal+0x1ca/0x2dc
Apr 18 08:58:32 stahl2 kernel: [ 236.335280] gro_normal_list.part.135+0x1e/0x3f
Apr 18 08:58:32 stahl2 kernel: [ 236.336594] napi_complete_done+0xd1/0x117
Apr 18 08:58:32 stahl2 kernel: [ 236.338153] virtnet_poll+0x363/0x480 [virtio_net]
Apr 18 08:58:32 stahl2 kernel: [ 236.339545] net_rx_action+0x28d/0x3f7
Apr 18 08:58:32 stahl2 kernel: [ 236.341018] __do_softirq+0xe1/0x2bc
Apr 18 08:58:32 stahl2 kernel: [ 236.342467] irq_exit+0xf5/0xfa
Apr 18 08:58:32 stahl2 kernel: [ 236.344039] do_IRQ+0x5a/0xe8
Apr 18 08:58:32 stahl2 kernel: [ 236.345424] common_interrupt+0xf/0x1d2
Apr 18 08:58:32 stahl2 kernel: [ 236.346823]
Apr 18 08:58:32 stahl2 kernel: [ 236.348283] RIP: 0010:native_safe_halt+0x12/0x18
Apr 18 08:58:32 stahl2 kernel: [ 236.349654] Code: 80 48 02 20 48 8b 00 a8 08 0f 84 60 ff ff ff eb b6 cc cc cc cc cc cc cc 55 48 89 e5 e9 07 00 00 00 0f 00 2d 92 6c 5a 00 fb f4 <5d> c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 55 48 89 e5 e9 07 00 00
Apr 18 08:58:32 stahl2 kernel: [ 236.352557] RSP: 0018:ffffb38b0009be68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
Apr 18 08:58:32 stahl2 kernel: [ 236.354185] RAX: ffffffff93c65cd0 RBX: ffff96dc47e396c0 RCX: 0000000000000001
Apr 18 08:58:32 stahl2 kernel: [ 236.355483] RDX: ffff96dcf9d71140 RSI: ffffb38b0009be50 RDI: 0000000000000000
Apr 18 08:58:32 stahl2 kernel: [ 236.356736] RBP: ffffb38b0009be68 R08: 00000000de9bd37a R09: ffffb38b00b6be98
Apr 18 08:58:32 stahl2 kernel: [ 236.358063] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000005
Apr 18 08:58:32 stahl2 kernel: [ 236.359286] R13: 0000000000000000 R14: 0000000000000000 R15: ffff96dc47e396c0
Apr 18 08:58:32 stahl2 kernel: [ 236.360479] ? __sched_text_end+0x7/0x0
Apr 18 08:58:32 stahl2 kernel: [ 236.361709] default_idle+0x20/0x155
Apr 18 08:58:32 stahl2 kernel: [ 236.363034] arch_cpu_idle+0x15/0x1b
Apr 18 08:58:32 stahl2 kernel: [ 236.364197] default_idle_call+0x23/0x36
Apr 18 08:58:32 stahl2 kernel: [ 236.365412] do_idle+0x1a5/0x275
Apr 18 08:58:32 stahl2 kernel: [ 236.366447] cpu_startup_entry+0x1d/0x22
Apr 18 08:58:32 stahl2 kernel: [ 236.367477] start_secondary+0x176/0x1cc
Apr 18 08:58:32 stahl2 kernel: [ 236.368684] secondary_startup_64+0xb6/0xb6
Apr 18 08:58:32 stahl2 kernel: [ 236.369792] ---[ end trace 35af283ba7f2cd94 ]---

And here from OL9:

Apr 5 11:19:14 han kernel: ------------[ cut here ]------------
Apr 5 11:19:14 han kernel: WARNING: CPU: 2 PID: 0 at net/core/stream.c:212 sk_stream_kill_queues+0xd7/0xec
Apr 5 11:19:14 han kernel: Modules linked in: uas usb_storage tls tun lz4hc lz4hc_compress xt_multiport pppoe pppox ppp_generic bluetooth slhc ecdh_generic ecc 8021q garp mrp rfkill nft_chain_nat xt_MASQUERADE xt_nat ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_LOG nf_log_syslog ip_tables ip6_tables xt_pkttype xt_mark xt_connmark xt_conntrack xt_TCPMSS nft_counter xt_CT nft_compat nct6775 hwmon_vid snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec intel_rapl_msr intel_rapl_common snd_hda_core snd_hwdep edac_mce_amd snd_seq kvm_amd snd_seq_device snd_pcm kvm snd_timer snd irqbypass wmi_bmof pcspkr soundcore joydev i2c_piix4 k10temp gpio_amdpt gpio_generic acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sch_fq_codel sunrpc fuse ext4 mbcache jbd2 amdgpu crct10dif_pclmul crc32_pclmul sd_mod t10_pi drm_ttm_helper ttm sg gpu_sched i2c_algo_bit ghash_clmulni_intel drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec aesni_intel ahci libahci
Apr 5 11:19:14 han kernel: crypto_simd cryptd r8169 drm sp5100_tco ccp libata realtek wmi video pinctrl_amd dm_mod nf_nat_sip nf_nat nf_conntrack_sip nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
Apr 5 11:19:14 han kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W 5.15.0-100.96.32.el9uek.x86_64 #2
Apr 5 11:19:14 han kernel: Hardware name: To Be Filled By O.E.M. A320M-DVS R4.0/A320M-DVS R4.0, BIOS P7.20 05/10/2022
Apr 5 11:19:14 han kernel: RIP: 0010:sk_stream_kill_queues+0xd7/0xec
Apr 5 11:19:14 han kernel: Code: c0 89 c2 89 c6 89 c7 e9 97 05 67 00 48 89 df e8 5f e0 fe ff 8b 83 50 01 00 00 8b b3 00 01 00 00 85 c0 74 d5 0f 0b 85 f6 74 d3 <0f> 0b 5b 5d 31 c0 89 c2 89 c6 89 c7 e9 68 05 67 00 0f 0b eb 94 0f
Apr 5 11:19:14 han kernel: RSP: 0018:ffffb90280224c88 EFLAGS: 00010206
Apr 5 11:19:14 han kernel: RAX: 0000000000000000 RBX: ffff8dc1841eb340 RCX: 0000000000000000
Apr 5 11:19:14 han kernel: RDX: 0000000000000000 RSI: 0000000000000d00 RDI: 0000000000000000
Apr 5 11:19:14 han kernel: RBP: ffff8dc1841eb410 R08: 0000000000000000 R09: 0000000000000000
Apr 5 11:19:14 han kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff8dc1841eb340
Apr 5 11:19:14 han kernel: R13: ffff8dc34917ce7e R14: 0000000000000000 R15: ffff8dc34917ce7e
Apr 5 11:19:14 han kernel: FS: 0000000000000000(0000) GS:ffff8dc4aea80000(0000) knlGS:0000000000000000
Apr 5 11:19:14 han kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 5 11:19:14 han kernel: CR2: 00007feffdd37b88 CR3: 0000000134bdc000 CR4: 00000000003506e0
Apr 5 11:19:14 han kernel: Call Trace:
Apr 5 11:19:14 han kernel:
Apr 5 11:19:14 han kernel: inet_csk_destroy_sock+0x55/0x113
Apr 5 11:19:14 han kernel: tcp_fin+0x123/0x19b
Apr 5 11:19:14 han kernel: tcp_data_queue+0x478/0x598
Apr 5 11:19:14 han kernel: tcp_rcv_state_process+0x2b9/0x7bc
Apr 5 11:19:14 han kernel: tcp_v6_do_rcv+0x1a1/0x496
Apr 5 11:19:14 han kernel: tcp_v6_rcv+0xd7d/0xdfa
Apr 5 11:19:14 han kernel: ? nf_ct_deliver_cached_events+0x7f/0xb0 [nf_conntrack]
Apr 5 11:19:14 han kernel: ? nf_confirm+0xd3/0x110 [nf_conntrack]
Apr 5 11:19:14 han kernel: ip6_protocol_deliver_rcu+0xcc/0x515
Apr 5 11:19:14 han kernel: ip6_input+0xad/0xb6
Apr 5 11:19:14 han kernel: ? ip6_protocol_deliver_rcu+0x520/0x515
Apr 5 11:19:14 han kernel: netif_receive_skb_one_core+0x63/0xa1
Apr 5 11:19:14 han kernel: process_backlog+0x98/0x16c
Apr 5 11:19:14 han kernel: napi_poll+0x2a/0x1a6
Apr 5 11:19:14 han kernel: net_rx_action+0x25e/0x316
Apr 5 11:19:14 han kernel: ? enqueue_hrtimer+0x2f/0x6e
Apr 5 11:19:14 han kernel: do_softirq+0xd0/0x2a5
Apr 5 11:19:14 han kernel: irq_exit_rcu+0xc7/0xf1
Apr 5 11:19:14 han kernel: common_interrupt+0x80/0x98
Apr 5 11:19:14 han kernel:
Apr 5 11:19:14 han kernel:
Apr 5 11:19:14 han kernel: asm_common_interrupt+0x22/0x27
Apr 5 11:19:14 han kernel: RIP: 0010:native_safe_halt+0xb/0x10
Apr 5 11:19:14 han kernel: Code: 1b 91 19 02 48 03 04 d5 00 4b 8b ad 0f b6 57 09 8b 74 d0 04 8b 3c d0 e9 33 4f 3e ff cc cc cc eb 07 0f 00 2d 07 af 58 00 fb f4 e0 9d 36 00 eb 07 0f 00 2d f7 ae 58 00 f4 e9 d1 9d 36 00 cc 0f
Apr 5 11:19:14 han kernel: RSP: 0018:ffffb902800dbe70 EFLAGS: 00000246
Apr 5 11:19:14 han kernel: RAX: 0000000000004000 RBX: 0000000000000001 RCX: 0000000000000000
Apr 5 11:19:14 han kernel: RDX: ffff8dc4aea80000 RSI: ffff8dc30ce52000 RDI: ffff8dc30ce52064
Apr 5 11:19:14 han kernel: RBP: ffff8dc30ce52064 R08: ffffffffae4f2340 R09: 0000000000000000
Apr 5 11:19:14 han kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
Apr 5 11:19:14 han kernel: R13: ffffffffae4f23c0 R14: 0000000000000001 R15: 0000000000000000
Apr 5 11:19:14 han kernel: acpi_idle_do_entry+0x64/0x8d
Apr 5 11:19:14 han kernel: acpi_idle_enter+0x88/0xf7
Apr 5 11:19:14 han kernel: cpuidle_enter_state+0x89/0x35d
Apr 5 11:19:14 han kernel: cpuidle_enter+0x29/0x40
Apr 5 11:19:14 han kernel: cpuidle_idle_call+0x143/0x1de
Apr 5 11:19:14 han kernel: do_idle+0x81/0xd2
Apr 5 11:19:14 han kernel: cpu_startup_entry+0x19/0x1b
Apr 5 11:19:14 han kernel: secondary_startup_64_no_verify+0xc2/0x0
Apr 5 11:19:14 han kernel:
Apr 5 11:19:14 han kernel: ---[ end trace e2cdde453b50c9f5 ]---

emperortomato commented 1 year ago

Looks like it could be related to this Ubuntu issue - https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/2018960

tvierling commented 9 months ago

I'm sorry this report was overlooked. It appears that the corresponding Ubuntu issue was resolved with fixes from kernel.org stable trees, which UEK also pulls from.

Are you still experiencing these warnings on up-to-date UEK6 (5.4.17) and UEK7 (5.15.0) kernels? If so, please reopen this issue with new context from those updated kernels.
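
For anyone re-testing, a rough way to confirm whether a given kernel still emits the warning is to count the net/core/stream.c WARNING headers per kernel release in a saved syslog. The following Python sketch does that; the function name `count_stream_warnings` is hypothetical, and the regular expression assumes the log keeps the same layout as the traces posted in this issue (a `WARNING: ... at net/core/stream.c:NNN` line followed by a `CPU: ... Comm: ... <release>` line).

```python
import re
from collections import Counter

# Pairs each stream.c WARNING header with the kernel release printed on the
# following "CPU: ... Comm: ..." line. Assumes the layout of the traces quoted
# in this issue; other kernels or log formats may need a different pattern.
WARNING_WITH_RELEASE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (net/core/stream\.c:\d+) \S+"
    r".*?Comm: \S+ .*?(\d+\.\d+\.\d+-[\w.]+\.x86_64)",
    re.DOTALL,
)


def count_stream_warnings(log_text):
    """Return a Counter keyed by (source location, kernel release)."""
    counts = Counter()
    for match in WARNING_WITH_RELEASE.finditer(log_text):
        counts[(match.group(1), match.group(2))] += 1
    return counts
```

Passing the contents of a saved log to `count_stream_warnings` and comparing the counts for the old and the updated kernel release should show whether the warning is gone. An empty result only means the pattern did not match, so it is worth checking the log format by hand before treating it as confirmation.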