multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev
Other
284 stars 42 forks source link

several kernel warnings about mptcp #299

Closed wloot closed 1 year ago

wloot commented 2 years ago
# cat /proc/version 
Linux version 5.19.0-trunk-cloud-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PREEMPT_DYNAMIC Debian 5.19.6-1~exp1 (2022-08-02)

I'm configuring mptcp on v5.19.y, but after running for a while I noticed some warnings on dmesg, I'm not sure but here it is, I would appreciate it if you could take a look.

[ 7788.038778] ------------[ cut here ]------------
[ 7788.038828] WARNING: CPU: 0 PID: 0 at net/mptcp/protocol.c:699 __mptcp_move_skbs_from_subflow+0x319/0x8e0
[ 7788.039041] Modules linked in: nft_chain_nat(E) xt_nat(E) nf_nat(E) nft_compat(E) nf_tables(E) nfnetlink(E) intel_rapl_msr(E) intel_rapl_common(E) iosf_mbi(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) rapl(E) sg(E) evdev(E) virtio_balloon(E) serio_raw(E) qemu_fw_cfg(E) button(E) sch_fq(E) tcp_bbr(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) sr_mod(E) cdrom(E) virtio_blk(E) virtio_net(E) net_failover(E) failover(E) ata_generic(E) ata_piix(E) crc32c_intel(E) libata(E) virtio_pci(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) scsi_mod(E) virtio(E) virtio_ring(E) scsi_common(E)
[ 7788.039465] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E     5.19.0-trunk-cloud-amd64 #1  Debian 5.19.6-1~exp1
[ 7788.039476] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[ 7788.039495] RIP: 0010:__mptcp_move_skbs_from_subflow+0x319/0x8e0
[ 7788.039525] Code: 00 66 0f 45 7c 02 06 0f b7 c7 f0 41 01 85 b8 00 00 00 48 89 ef e8 f7 1f e2 ff 44 01 64 24 10 44 3b 64 24 14 0f 86 0e fe ff ff <0f> 0b e9 32 fe ff ff 41 89 47 4c 89 44 24 14 e9 9b fd ff ff 8b 54
[ 7788.039531] RSP: 0018:ffffbb3f40003a10 EFLAGS: 00010212
[ 7788.039536] RAX: 0000000000000001 RBX: ffff9d2cf3d3e900 RCX: 2acf95f35875d87e
[ 7788.039543] RDX: 0000000000000001 RSI: ffff9d2cfebbf080 RDI: ffff9d2d331d5f00
[ 7788.039546] RBP: ffff9d2d331d5f00 R08: 0000000000000000 R09: ffff9d2ccff18700
[ 7788.039549] R10: 0000000000000000 R11: ffff9d2cfebbf080 R12: 0000000000000533
[ 7788.039552] R13: ffff9d2cfebbf080 R14: ffff9d2cfebbf158 R15: ffff9d2ccff18700
[ 7788.039556] FS:  0000000000000000(0000) GS:ffff9d2d3dc00000(0000) knlGS:0000000000000000
[ 7788.039559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7788.039563] CR2: 00007fcce9f39108 CR3: 0000000003c20003 CR4: 0000000000370eb0
[ 7788.039571] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7788.039573] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7788.039582] Call Trace:
[ 7788.039640]  <IRQ>
[ 7788.039778]  mptcp_data_ready+0xb1/0x160
[ 7788.039794]  tcp_data_queue+0x8b9/0xef0
[ 7788.040051]  tcp_rcv_established+0x21c/0x7a0
[ 7788.040062]  tcp_v4_do_rcv+0x153/0x280
[ 7788.040258]  tcp_v4_rcv+0xea3/0xfa0
[ 7788.040269]  ? raw_local_deliver+0x136/0x240
[ 7788.040280]  ip_protocol_deliver_rcu+0x32/0x230
[ 7788.040301]  ip_local_deliver_finish+0x73/0xa0
[ 7788.040307]  ip_sublist_rcv_finish+0x80/0x90
[ 7788.040312]  ip_sublist_rcv+0x191/0x220
[ 7788.040317]  ? ip_sublist_rcv+0x220/0x220
[ 7788.040322]  ip_list_rcv+0x132/0x160
[ 7788.040332]  __netif_receive_skb_list_core+0x2b1/0x2e0
[ 7788.040351]  netif_receive_skb_list_internal+0x1d1/0x310
[ 7788.040356]  ? dev_gro_receive+0x15c/0x6a0
[ 7788.040365]  napi_complete_done+0x6f/0x1b0
[ 7788.040377]  virtnet_poll+0x2a0/0x45b [virtio_net]
[ 7788.040407]  __napi_poll+0x2c/0x160
[ 7788.040412]  net_rx_action+0x2a2/0x360
[ 7788.040416]  ? __napi_schedule+0x8b/0xb0
[ 7788.040425]  __do_softirq+0xeb/0x2eb
[ 7788.040444]  __irq_exit_rcu+0xc9/0x140
[ 7788.040465]  common_interrupt+0xb8/0xd0
[ 7788.040484]  </IRQ>
[ 7788.040486]  <TASK>
[ 7788.040488]  asm_common_interrupt+0x22/0x40
[ 7788.040501] RIP: 0010:native_safe_halt+0xb/0x10
[ 7788.040508] Code: c0 fb 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 bf e9 79 ff ff ff cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 69 7b 48 00 fb f4 <c3> cc cc cc cc 66 90 0f 00 2d 59 7b 48 00 f4 c3 cc cc cc cc cc 0f
[ 7788.040512] RSP: 0018:ffffffff83c03e90 EFLAGS: 000002c2
[ 7788.040517] RAX: ffffffff831814a0 RBX: ffffffff83c199c0 RCX: 0000000000000000
[ 7788.040519] RDX: ffffffff83c199c0 RSI: ffffffff8391b62a RDI: ffffffff838ef806
[ 7788.040522] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 7788.040524] R10: 0000000000000003 R11: 000000000000003b R12: 0000000000000000
[ 7788.040526] R13: 0000000000000000 R14: ffffffff83c19120 R15: 0000000000000000
[ 7788.040529]  ? mwait_idle+0x80/0x80
[ 7788.040546]  default_idle+0xa/0x10
[ 7788.040551]  default_idle_call+0x36/0xf0
[ 7788.040557]  do_idle+0x1f7/0x270
[ 7788.040575]  cpu_startup_entry+0x19/0x20
[ 7788.040579]  rest_init+0xcb/0xd0
[ 7788.040593]  arch_call_rest_init+0xa/0x14
[ 7788.040797]  start_kernel+0x6db/0x704
[ 7788.040806]  secondary_startup_64_no_verify+0xd2/0xdb
[ 7788.040825]  </TASK>
[ 7788.040832] ---[ end trace 0000000000000000 ]---
[ 7788.047059] ------------[ cut here ]------------
[ 7788.047065] WARNING: CPU: 0 PID: 0 at net/mptcp/protocol.c:702 __mptcp_move_skbs_from_subflow+0x418/0x8e0
[ 7788.047090] Modules linked in: nft_chain_nat(E) xt_nat(E) nf_nat(E) nft_compat(E) nf_tables(E) nfnetlink(E) intel_rapl_msr(E) intel_rapl_common(E) iosf_mbi(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) rapl(E) sg(E) evdev(E) virtio_balloon(E) serio_raw(E) qemu_fw_cfg(E) button(E) sch_fq(E) tcp_bbr(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) sr_mod(E) cdrom(E) virtio_blk(E) virtio_net(E) net_failover(E) failover(E) ata_generic(E) ata_piix(E) crc32c_intel(E) libata(E) virtio_pci(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) scsi_mod(E) virtio(E) virtio_ring(E) scsi_common(E)
[ 7788.047182] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W   E     5.19.0-trunk-cloud-amd64 #1  Debian 5.19.6-1~exp1
[ 7788.047188] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[ 7788.047190] RIP: 0010:__mptcp_move_skbs_from_subflow+0x418/0x8e0
[ 7788.047197] Code: 00 00 48 89 28 41 8b 85 e8 00 00 00 83 c0 01 41 89 85 e8 00 00 00 e9 00 ff ff ff 89 87 18 01 00 00 89 44 24 18 e9 30 fc ff ff <0f> 0b e9 cd fc ff ff 48 8b bd e0 00 00 00 44 89 44 24 20 e8 50 da
[ 7788.047202] RSP: 0018:ffffbb3f40003a10 EFLAGS: 00010286
[ 7788.047207] RAX: 00000000000004a2 RBX: ffff9d2cf3d3e900 RCX: 000000000000052c
[ 7788.047210] RDX: 000000003d49420c RSI: ffff9d2cf3d3e900 RDI: ffff9d2cfebbf080
[ 7788.047213] RBP: ffff9d2ceea7d400 R08: 00000000fffffacd R09: ffffffff845a7a40
[ 7788.047216] R10: 00000000001d6954 R11: 0000000000000001 R12: 000000003d49473f
[ 7788.047219] R13: ffff9d2cfebbf080 R14: ffff9d2cfebbf080 R15: ffff9d2ccff18700
[ 7788.047222] FS:  0000000000000000(0000) GS:ffff9d2d3dc00000(0000) knlGS:0000000000000000
[ 7788.047226] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7788.047229] CR2: 00007fcce9f39108 CR3: 0000000003c20003 CR4: 0000000000370eb0
[ 7788.047237] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7788.047239] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7788.047242] Call Trace:
[ 7788.047245]  <IRQ>
[ 7788.047252]  mptcp_data_ready+0xb1/0x160
[ 7788.047259]  tcp_data_queue+0x8b9/0xef0
[ 7788.047267]  tcp_rcv_established+0x21c/0x7a0
[ 7788.047274]  tcp_v4_do_rcv+0x153/0x280
[ 7788.047278]  tcp_v4_rcv+0xea3/0xfa0
[ 7788.047288]  ip_protocol_deliver_rcu+0x32/0x230
[ 7788.047295]  ip_local_deliver_finish+0x73/0xa0
[ 7788.047300]  ip_sublist_rcv_finish+0x80/0x90
[ 7788.047305]  ip_sublist_rcv+0x191/0x220
[ 7788.047311]  ? ip_sublist_rcv+0x220/0x220
[ 7788.047317]  ip_list_rcv+0x132/0x160
[ 7788.047323]  __netif_receive_skb_list_core+0x2b1/0x2e0
[ 7788.047329]  netif_receive_skb_list_internal+0x1d1/0x310
[ 7788.047335]  napi_complete_done+0x6f/0x1b0
[ 7788.047340]  virtnet_poll+0x2a0/0x45b [virtio_net]
[ 7788.047352]  __napi_poll+0x2c/0x160
[ 7788.047366]  net_rx_action+0x2a2/0x360
[ 7788.047370]  ? __napi_schedule+0x8b/0xb0
[ 7788.047378]  __do_softirq+0xeb/0x2eb
[ 7788.047384]  __irq_exit_rcu+0xc9/0x140
[ 7788.047390]  common_interrupt+0xb8/0xd0
[ 7788.047396]  </IRQ>
[ 7788.047397]  <TASK>
[ 7788.047400]  asm_common_interrupt+0x22/0x40
[ 7788.047407] RIP: 0010:native_safe_halt+0xb/0x10
[ 7788.047414] Code: c0 fb 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 bf e9 79 ff ff ff cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 69 7b 48 00 fb f4 <c3> cc cc cc cc 66 90 0f 00 2d 59 7b 48 00 f4 c3 cc cc cc cc cc 0f
[ 7788.047418] RSP: 0018:ffffffff83c03e90 EFLAGS: 000002c2
[ 7788.047422] RAX: ffffffff831814a0 RBX: ffffffff83c199c0 RCX: ffff9d2d3dc31740
[ 7788.047425] RDX: ffffffff83c199c0 RSI: ffffffff8391b62a RDI: ffffffff838ef806
[ 7788.047428] RBP: 0000000000000000 R08: 000007154bc21e84 R09: 0000000000000000
[ 7788.047430] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 7788.047432] R13: 0000000000000000 R14: ffffffff83c19120 R15: 0000000000000000
[ 7788.047436]  ? mwait_idle+0x80/0x80
[ 7788.047444]  default_idle+0xa/0x10
[ 7788.047449]  default_idle_call+0x36/0xf0
[ 7788.047455]  do_idle+0x1f7/0x270
[ 7788.047460]  cpu_startup_entry+0x19/0x20
[ 7788.047464]  rest_init+0xcb/0xd0
[ 7788.047477]  arch_call_rest_init+0xa/0x14
[ 7788.047485]  start_kernel+0x6db/0x704
[ 7788.047492]  secondary_startup_64_no_verify+0xd2/0xdb
[ 7788.047501]  </TASK>
[ 7788.047503] ---[ end trace 0000000000000000 ]---
[ 7788.057067] ------------[ cut here ]------------
[ 7788.057071] WARNING: CPU: 0 PID: 0 at net/mptcp/subflow.c:863 mptcp_subflow_data_available+0xe43/0xf10
[ 7788.057083] Modules linked in: nft_chain_nat(E) xt_nat(E) nf_nat(E) nft_compat(E) nf_tables(E) nfnetlink(E) intel_rapl_msr(E) intel_rapl_common(E) iosf_mbi(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) rapl(E) sg(E) evdev(E) virtio_balloon(E) serio_raw(E) qemu_fw_cfg(E) button(E) sch_fq(E) tcp_bbr(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) sr_mod(E) cdrom(E) virtio_blk(E) virtio_net(E) net_failover(E) failover(E) ata_generic(E) ata_piix(E) crc32c_intel(E) libata(E) virtio_pci(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) scsi_mod(E) virtio(E) virtio_ring(E) scsi_common(E)
[ 7788.057165] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W   E     5.19.0-trunk-cloud-amd64 #1  Debian 5.19.6-1~exp1
[ 7788.057170] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[ 7788.057172] RIP: 0010:mptcp_subflow_data_available+0xe43/0xf10
[ 7788.057179] Code: 48 c7 c6 a0 42 97 83 48 c7 c7 b0 08 e1 83 4c 89 5c 24 08 4c 89 0c 24 e8 2b 10 c6 ff 4c 8b 5c 24 08 4c 8b 0c 24 e9 60 fe ff ff <0f> 0b e9 21 f4 ff ff 0f 0b e9 64 fd ff ff 48 8b 35 10 e1 cf 00 49
[ 7788.057183] RSP: 0018:ffffbb3f40003a30 EFLAGS: 00010282
[ 7788.057188] RAX: 000000003d494be1 RBX: ffff9d2ccff18700 RCX: 000000003d49420c
[ 7788.057191] RDX: 00000000000004a2 RSI: 00000000fffff62b RDI: 2acf95f35875ddb1
[ 7788.057193] RBP: ffff9d2cf3d3e900 R08: ffff9d2cec21e548 R09: ffff9d2ccff18700
[ 7788.057196] R10: 0000000000000000 R11: ffff9d2cfebbf080 R12: ffff9d2ce2821900
[ 7788.057199] R13: 0000000000000000 R14: ffff9d2ce2821900 R15: ffff9d2cf3d3e9d8
[ 7788.057202] FS:  0000000000000000(0000) GS:ffff9d2d3dc00000(0000) knlGS:0000000000000000
[ 7788.057205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7788.057208] CR2: 00007fcce9f39108 CR3: 0000000003c20003 CR4: 0000000000370eb0
[ 7788.057215] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7788.057217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7788.057220] Call Trace:
[ 7788.057229]  <IRQ>
[ 7788.057234]  subflow_data_ready+0x74/0x100
[ 7788.057244]  tcp_data_queue+0x8b9/0xef0
[ 7788.057252]  tcp_rcv_established+0x21c/0x7a0
[ 7788.057258]  tcp_v4_do_rcv+0x153/0x280
[ 7788.057265]  tcp_v4_rcv+0xea3/0xfa0
[ 7788.057269]  ? raw_local_deliver+0x136/0x240
[ 7788.057276]  ip_protocol_deliver_rcu+0x32/0x230
[ 7788.057281]  ip_local_deliver_finish+0x73/0xa0
[ 7788.057286]  ip_sublist_rcv_finish+0x80/0x90
[ 7788.057291]  ip_sublist_rcv+0x191/0x220
[ 7788.057297]  ? ip_sublist_rcv+0x220/0x220
[ 7788.057302]  ip_list_rcv+0x132/0x160
[ 7788.057307]  __netif_receive_skb_list_core+0x2b1/0x2e0
[ 7788.057313]  netif_receive_skb_list_internal+0x1d1/0x310
[ 7788.057318]  napi_complete_done+0x6f/0x1b0
[ 7788.057322]  virtnet_poll+0x2a0/0x45b [virtio_net]
[ 7788.057333]  __napi_poll+0x2c/0x160
[ 7788.057337]  net_rx_action+0x2a2/0x360
[ 7788.057342]  __do_softirq+0xeb/0x2eb
[ 7788.057347]  __irq_exit_rcu+0xc9/0x140
[ 7788.057352]  sysvec_apic_timer_interrupt+0xa2/0xd0
[ 7788.057358]  </IRQ>
[ 7788.057359]  <TASK>
[ 7788.057361]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 7788.057369] RIP: 0010:native_safe_halt+0xb/0x10
[ 7788.057374] Code: c0 fb 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 bf e9 79 ff ff ff cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 69 7b 48 00 fb f4 <c3> cc cc cc cc 66 90 0f 00 2d 59 7b 48 00 f4 c3 cc cc cc cc cc 0f
[ 7788.057378] RSP: 0018:ffffffff83c03e90 EFLAGS: 000002c2
[ 7788.057382] RAX: ffffffff831814a0 RBX: ffffffff83c199c0 RCX: ffff9d2d3dc31740
[ 7788.057384] RDX: ffffffff83c199c0 RSI: ffffffff8391b62a RDI: ffffffff838ef806
[ 7788.057387] RBP: 0000000000000000 R08: 000007154c5e469c R09: 0000000000000000
[ 7788.057389] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 7788.057391] R13: 0000000000000000 R14: ffffffff83c19120 R15: 0000000000000000
[ 7788.057395]  ? mwait_idle+0x80/0x80
[ 7788.057402]  default_idle+0xa/0x10
[ 7788.057407]  default_idle_call+0x36/0xf0
[ 7788.057412]  do_idle+0x1f7/0x270
[ 7788.057417]  cpu_startup_entry+0x19/0x20
[ 7788.057421]  rest_init+0xcb/0xd0
[ 7788.057426]  arch_call_rest_init+0xa/0x14
[ 7788.057433]  start_kernel+0x6db/0x704
[ 7788.057439]  secondary_startup_64_no_verify+0xd2/0xdb
[ 7788.057447]  </TASK>
[ 7788.057448] ---[ end trace 0000000000000000 ]---

[13368.816791] ------------[ cut here ]------------
[13368.816826] refcount_t: addition on 0; use-after-free.
[13368.817130] WARNING: CPU: 0 PID: 0 at lib/refcount.c:25 refcount_warn_saturate+0x74/0x110
[13368.817171] Modules linked in: nft_chain_nat(E) xt_nat(E) nf_nat(E) nft_compat(E) nf_tables(E) nfnetlink(E) intel_rapl_msr(E) intel_rapl_common(E) iosf_mbi(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) rapl(E) sg(E) evdev(E) virtio_balloon(E) serio_raw(E) qemu_fw_cfg(E) button(E) sch_fq(E) tcp_bbr(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) sr_mod(E) cdrom(E) virtio_blk(E) virtio_net(E) net_failover(E) failover(E) ata_generic(E) ata_piix(E) crc32c_intel(E) libata(E) virtio_pci(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) scsi_mod(E) virtio(E) virtio_ring(E) scsi_common(E)
[13368.817430] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W   E     5.19.0-trunk-cloud-amd64 #1  Debian 5.19.6-1~exp1
[13368.817452] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[13368.817459] RIP: 0010:refcount_warn_saturate+0x74/0x110
[13368.817478] Code: 01 01 e8 39 49 3e 00 0f 0b c3 cc cc cc cc 80 3d 5e 89 09 01 00 75 cb 48 c7 c7 00 f5 92 83 c6 05 4e 89 09 01 01 e8 16 49 3e 00 <0f> 0b c3 cc cc cc cc 80 3d 3d 89 09 01 00 75 a8 48 c7 c7 d8 f4 92
[13368.817484] RSP: 0018:ffffbb3f400039b0 EFLAGS: 00010282
[13368.817489] RAX: 0000000000000000 RBX: ffff9d2d14f75100 RCX: 0000000000000000
[13368.817496] RDX: 0000000000000103 RSI: ffffffff8391b62a RDI: 00000000ffffffff
[13368.817499] RBP: ffff9d2cd37da300 R08: 0000000000000000 R09: 00000000ffffefff
[13368.817502] R10: ffffbb3f40003850 R11: ffffffff83ccfe48 R12: 0000000000000001
[13368.817505] R13: ffff9d2cd7572800 R14: ffff9d2cc471b480 R15: 4f4be6fe73acef61
[13368.817508] FS:  0000000000000000(0000) GS:ffff9d2d3dc00000(0000) knlGS:0000000000000000
[13368.817512] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13368.817516] CR2: 00007f7c5bd82a60 CR3: 0000000003c20004 CR4: 0000000000370eb0
[13368.817523] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[13368.817526] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[13368.817534] Call Trace:
[13368.817569]  <IRQ>
[13368.817580]  mptcp_incoming_options+0x98c/0x9a0
[13368.817672]  tcp_data_queue+0x62a/0xef0
[13368.817694]  tcp_rcv_state_process+0x5d5/0x1020
[13368.817700]  ? sk_filter_trim_cap+0xc1/0x240
[13368.817712]  tcp_v4_do_rcv+0xc6/0x280
[13368.817725]  tcp_v4_rcv+0xea3/0xfa0
[13368.817734]  ? raw_local_deliver+0x136/0x240
[13368.817741]  ip_protocol_deliver_rcu+0x32/0x230
[13368.817760]  ip_local_deliver_finish+0x73/0xa0
[13368.817766]  ip_sublist_rcv_finish+0x80/0x90
[13368.817771]  ip_sublist_rcv+0x191/0x220
[13368.817777]  ? ip_sublist_rcv+0x220/0x220
[13368.817782]  ip_list_rcv+0x132/0x160
[13368.817792]  __netif_receive_skb_list_core+0x2b1/0x2e0
[13368.817811]  netif_receive_skb_list_internal+0x1d1/0x310
[13368.817817]  napi_complete_done+0x6f/0x1b0
[13368.817897]  virtnet_poll+0x2a0/0x45b [virtio_net]
[13368.817930]  __napi_poll+0x2c/0x160
[13368.817935]  net_rx_action+0x2a2/0x360
[13368.817940]  ? __napi_schedule+0x8b/0xb0
[13368.817948]  __do_softirq+0xeb/0x2eb
[13368.817967]  __irq_exit_rcu+0xc9/0x140
[13368.817987]  common_interrupt+0xb8/0xd0
[13368.818006]  </IRQ>
[13368.818008]  <TASK>
[13368.818011]  asm_common_interrupt+0x22/0x40
[13368.818024] RIP: 0010:native_safe_halt+0xb/0x10
[13368.818031] Code: c0 fb 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 bf e9 79 ff ff ff cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 69 7b 48 00 fb f4 <c3> cc cc cc cc 66 90 0f 00 2d 59 7b 48 00 f4 c3 cc cc cc cc cc 0f
[13368.818035] RSP: 0018:ffffffff83c03e90 EFLAGS: 000002c2
[13368.818039] RAX: ffffffff831814a0 RBX: ffffffff83c199c0 RCX: ffff9d2d3dc31740
[13368.818042] RDX: ffffffff83c199c0 RSI: ffffffff8391b62a RDI: ffffffff838ef806
[13368.818044] RBP: 0000000000000000 R08: 00000c28ab788cd4 R09: 0000000000000000
[13368.818047] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[13368.818049] R13: 0000000000000000 R14: ffffffff83c19120 R15: 0000000000000000
[13368.818053]  ? mwait_idle+0x80/0x80
[13368.818075]  default_idle+0xa/0x10
[13368.818081]  default_idle_call+0x36/0xf0
[13368.818087]  do_idle+0x1f7/0x270
[13368.818140]  ? do_idle+0x7/0x270
[13368.818147]  cpu_startup_entry+0x19/0x20
[13368.818153]  rest_init+0xcb/0xd0
[13368.818176]  arch_call_rest_init+0xa/0x14
[13368.818248]  start_kernel+0x6db/0x704
[13368.818260]  secondary_startup_64_no_verify+0xd2/0xdb
[13368.818326]  </TASK>
[13368.818330] ---[ end trace 0000000000000000 ]---
[13368.990412] ------------[ cut here ]------------
[13368.990419] refcount_t: underflow; use-after-free.
[13368.990450] WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
[13368.990465] Modules linked in: nft_chain_nat(E) xt_nat(E) nf_nat(E) nft_compat(E) nf_tables(E) nfnetlink(E) intel_rapl_msr(E) intel_rapl_common(E) iosf_mbi(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) rapl(E) sg(E) evdev(E) virtio_balloon(E) serio_raw(E) qemu_fw_cfg(E) button(E) sch_fq(E) tcp_bbr(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) sr_mod(E) cdrom(E) virtio_blk(E) virtio_net(E) net_failover(E) failover(E) ata_generic(E) ata_piix(E) crc32c_intel(E) libata(E) virtio_pci(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) scsi_mod(E) virtio(E) virtio_ring(E) scsi_common(E)
[13368.990547] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W   E     5.19.0-trunk-cloud-amd64 #1  Debian 5.19.6-1~exp1
[13368.990554] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[13368.990556] RIP: 0010:refcount_warn_saturate+0xba/0x110
[13368.990563] Code: 01 01 e8 f3 48 3e 00 0f 0b c3 cc cc cc cc 80 3d 17 89 09 01 00 75 85 48 c7 c7 30 f5 92 83 c6 05 07 89 09 01 01 e8 d0 48 3e 00 <0f> 0b c3 cc cc cc cc 80 3d f2 88 09 01 00 0f 85 5e ff ff ff 48 c7
[13368.990568] RSP: 0018:ffffbb3f40003a88 EFLAGS: 00010282
[13368.990573] RAX: 0000000000000000 RBX: ffff9d2cd37da300 RCX: 0000000000000000
[13368.990576] RDX: 0000000000000103 RSI: ffffffff8391b62a RDI: 00000000ffffffff
[13368.990579] RBP: ffff9d2d14f75100 R08: 0000000000000000 R09: 00000000ffffefff
[13368.990582] R10: ffffbb3f40003928 R11: ffffffff83ccfe48 R12: 0000000000000001
[13368.990584] R13: ffff9d2ce8f7c262 R14: ffff9d2cd37da398 R15: ffff9d2ce8f7c262
[13368.990588] FS:  0000000000000000(0000) GS:ffff9d2d3dc00000(0000) knlGS:0000000000000000
[13368.990592] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13368.990595] CR2: 00007f7c5bb76260 CR3: 0000000004da8003 CR4: 0000000000370eb0
[13368.990603] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[13368.990605] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[13368.990608] Call Trace:
[13368.990613]  <IRQ>
[13368.990616]  subflow_ulp_release+0xcb/0xd0
[13368.990649]  tcp_cleanup_ulp+0x23/0x50
[13368.990661]  tcp_v4_destroy_sock+0x55/0x210
[13368.992284]  inet_csk_destroy_sock+0x4d/0x110
[13368.992308]  tcp_rcv_state_process+0x7e5/0x1020
[13368.992315]  ? sk_filter_trim_cap+0xc1/0x240
[13368.992323]  tcp_v4_do_rcv+0xc6/0x280
[13368.992328]  tcp_v4_rcv+0xea3/0xfa0
[13368.992332]  ? raw_local_deliver+0x136/0x240
[13368.992339]  ip_protocol_deliver_rcu+0x32/0x230
[13368.992345]  ip_local_deliver_finish+0x73/0xa0
[13368.992350]  ip_sublist_rcv_finish+0x80/0x90
[13368.992355]  ip_sublist_rcv+0x191/0x220
[13368.992361]  ? ip_sublist_rcv+0x220/0x220
[13368.992366]  ip_list_rcv+0x132/0x160
[13368.992373]  __netif_receive_skb_list_core+0x2b1/0x2e0
[13368.992380]  netif_receive_skb_list_internal+0x1d1/0x310
[13368.992384]  ? dev_gro_receive+0x15c/0x6a0
[13368.992392]  napi_complete_done+0x6f/0x1b0
[13368.992398]  virtnet_poll+0x2a0/0x45b [virtio_net]
[13368.992410]  __napi_poll+0x2c/0x160
[13368.992415]  net_rx_action+0x2a2/0x360
[13368.992419]  ? __napi_schedule+0x8b/0xb0
[13368.992426]  __do_softirq+0xeb/0x2eb
[13368.992434]  __irq_exit_rcu+0xc9/0x140
[13368.992440]  common_interrupt+0xb8/0xd0
[13368.992447]  </IRQ>
[13368.992449]  <TASK>
[13368.992451]  asm_common_interrupt+0x22/0x40
[13368.992460] RIP: 0010:native_safe_halt+0xb/0x10
[13368.992467] Code: c0 fb 01 00 f0 80 48 02 20 48 8b 00 a8 08 75 bf e9 79 ff ff ff cc cc cc cc cc cc cc cc cc cc 66 90 0f 00 2d 69 7b 48 00 fb f4 <c3> cc cc cc cc 66 90 0f 00 2d 59 7b 48 00 f4 c3 cc cc cc cc cc 0f
[13368.992478] RSP: 0018:ffffffff83c03e90 EFLAGS: 000002c2
[13368.992483] RAX: ffffffff831814a0 RBX: ffffffff83c199c0 RCX: 0000000000000000
[13368.992486] RDX: ffffffff83c199c0 RSI: ffffffff8391b62a RDI: ffffffff838ef806
[13368.992488] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[13368.992491] R10: 0000000000000003 R11: 000000000000001b R12: 0000000000000000
[13368.992493] R13: 0000000000000000 R14: ffffffff83c19120 R15: 0000000000000000
[13368.992497]  ? mwait_idle+0x80/0x80
[13368.992505]  default_idle+0xa/0x10
[13368.992511]  default_idle_call+0x36/0xf0
[13368.992516]  do_idle+0x1f7/0x270
[13368.992521]  ? do_idle+0x7/0x270
[13368.992525]  cpu_startup_entry+0x19/0x20
[13368.992529]  rest_init+0xcb/0xd0
[13368.992536]  arch_call_rest_init+0xa/0x14
[13368.992545]  start_kernel+0x6db/0x704
[13368.992552]  secondary_startup_64_no_verify+0xd2/0xdb
[13368.992562]  </TASK>
[13368.992564] ---[ end trace 0000000000000000 ]---
matttbe commented 2 years ago

Hi @wloot,

Thank you for having reported these warnings!

Would you mind sharing a few more details about your setup please?

Also, by chance, do you have the kernel debug symbols? If yes, could you run ./scripts/decode_stacktrace.sh script to decode the stacktraces you shared here?

https://github.com/multipath-tcp/mptcp_net-next/blob/export/scripts/decode_stacktrace.sh https://www.desmondcheong.com/blog/2021/06/02/decoding-stack-traces-in-the-linux-kernel/

matttbe commented 2 years ago

I found those warnings from a MPTCP client which running Debian 11 with latest stable kernel built by debian's kernel package scripts.

Great, I guess it means the debug symbols have been packaged somewhere, can be installed/extracted and decode_stacktrace.sh can be used? (if you don't mind :) )

So you have the issue with one client having 2 IPs (on one interface but that's the same from an MPTCP point of view), communicating with a server running Linux v5.18, right? That doesn't seem that uncommon.

Also were you previously using an older kernel version on your client? Or could you check if you have these warnings on v5.18?

Last question: did you notice issues from the userspace? (Connections have been stalled/stopped?) Or only noticed the warnings when monitoring the kernel messages?

wloot commented 2 years ago

Sorry I have the kernel with DEBUG_INFO disabled, but I will rebuild and do it later. And yes, in fact the reason I built kernel myself is exactly the warnings. In v5.18 server side there are also some dmesg warning, I will post them too if you want a look. By the way I also noticed sometimes the memory usage is higer than usual after I enabled MPTCP on client, userspace side usage and /proc/net/sockstat reports are normal. I didn't notice proxy connection issues from userspace other than normal, perhaps there is sort of auto reconnect in userspace?

wloot commented 2 years ago

Hi @matttbe, I found out that the dmesg warnings on server side are very similar. I did the script on a MPTCP server with debian packaged kernel(remind not latest one, kernel version is 5.18.2), hope these able to help:

[519142.831986] ------------[ cut here ]------------
[519142.832051] WARNING: CPU: 0 PID: 0 at net/mptcp/protocol.c:693 __mptcp_move_skbs_from_subflow (net/mptcp/protocol.c:693 (discriminator 1)) 
[519142.832104] Modules linked in: wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel curve25519_x86_64 libcurve25519_generic libchacha intel_rapl_msr intel_rapl_common iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipt_REJECT nf_reject_ipv4 aesni_intel xt_hl xt_tcpudp crypto_simd cryptd xt_state xt_conntrack evdev nft_compat serio_raw sg nf_tables qemu_fw_cfg virtio_balloon button nfnetlink sch_fq tcp_bbr nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c fuse configfs ip_tables x_tables autofs4 virtio_net sr_mod cdrom net_failover failover virtio_blk ata_generic ata_piix libata crc32c_intel virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring scsi_mod scsi_common
[519142.832408] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.18.0-0.bpo.1-cloud-amd64 #1  Debian 5.18.2-1~bpo11+1
[519142.832433] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[519142.832438] RIP: 0010:__mptcp_move_skbs_from_subflow (net/mptcp/protocol.c:693 (discriminator 1)) 
./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory
[519142.832450] RSP: 0018:ffffa8c8c0003a28 EFLAGS: 00010216
[519142.832454] RAX: 000037820105dd08 RBX: ffff914683270000 RCX: 0000000080100002
[519142.832456] RDX: 0000000080100003 RSI: fffff890401489c0 RDI: 0000000040000000
[519142.832458] RBP: ffff914685227700 R08: ffff914685227700 R09: 0000000080100002
[519142.832460] R10: 0000000000000557 R11: ffff914681e72200 R12: 0000000000000557
[519142.832462] R13: ffff9146b39d2000 R14: f4d25dc7361ba9a8 R15: ffff9146b6051b00
[519142.832465] FS:  0000000000000000(0000) GS:ffff9146bec00000(0000) knlGS:0000000000000000
[519142.832467] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[519142.832469] CR2: 00007fc0b06f06f8 CR3: 00000000022ae005 CR4: 00000000001706b0
[519142.832478] Call Trace:
[519142.832550]  <IRQ>
[519142.832574] mptcp_data_ready (net/mptcp/protocol.c:767 net/mptcp/protocol.c:811) 
[519142.832580] tcp_data_queue (net/ipv4/tcp_input.c:5062) 
[519142.832604] tcp_rcv_established (include/linux/skbuff.h:2027 include/net/tcp.h:1853 include/net/tcp.h:1926 net/ipv4/tcp_input.c:5482 net/ipv4/tcp_input.c:5981) 
[519142.832608] tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1706) 
[519142.832617] tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2074) 
[519142.832622] ? nf_hook_slow (include/linux/netfilter.h:142 net/netfilter/core.c:620) 
[519142.832634] ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) 
[519142.832647] ip_local_deliver_finish (include/linux/rcupdate.h:726 net/ipv4/ip_input.c:234) 
[519142.832654] ip_sublist_rcv_finish (include/net/dst.h:461 net/ipv4/ip_input.c:567) 
[519142.832658] ip_sublist_rcv (net/ipv4/ip_input.c:626) 
[519142.832662] ? ip_sublist_rcv (net/ipv4/ip_input.c:424) 
[519142.832677] ip_list_rcv (net/ipv4/ip_input.c:661) 
[519142.832683] __netif_receive_skb_list_core (net/core/dev.c:5448 net/core/dev.c:5496) 
[519142.832722] netif_receive_skb_list_internal (net/core/dev.c:5550 net/core/dev.c:5639) 
[519142.832730] napi_complete_done (include/linux/list.h:37 include/net/gro.h:431 include/net/gro.h:426 net/core/dev.c:5980) 
[519142.832738] virtnet_poll (drivers/net/virtio_net.c:366 drivers/net/virtio_net.c:1631) virtio_net
[519142.832758] __napi_poll (net/core/dev.c:6413) 
[519142.832762] net_rx_action (net/core/dev.c:6482 net/core/dev.c:6567) 
[519142.832769] ? __napi_schedule (arch/x86/include/asm/irqflags.h:137 net/core/dev.c:5890) 
[519142.832801] __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:559) 
[519142.832841] __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:637) 
[519142.832871] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
[519142.832896]  </IRQ>
[519142.832898]  <TASK>
[519142.832900] asm_common_interrupt (/build/linux-23EX6j/linux-5.18.2/arch/x86/include/asm/idtentry.h:636) 
[519142.832907] RIP: 0010:native_safe_halt (arch/x86/include/asm/irqflags.h:52) 
./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory
[519142.832916] RSP: 0018:ffffffffb3e03e90 EFLAGS: 00000282
[519142.832920] RAX: ffffffffb313a360 RBX: ffffffffb3e19940 RCX: 7fffffffffffffff
[519142.832922] RDX: ffffffffb3e19940 RSI: ffffffffb390d946 RDI: ffffffffb38e1840
[519142.832924] RBP: 0000000000000000 R08: 000000cd42e4dffb R09: 0000000000000000
[519142.832926] R10: 0000000000000000 R11: 0000000000000058 R12: 0000000000000000
[519142.832928] R13: 0000000000000000 R14: ffffffffb3e19120 R15: 0000000000000000
[519142.832930] ? mwait_idle (arch/x86/kernel/process.c:732) 
[519142.832944] default_idle (arch/x86/include/asm/paravirt.h:173 arch/x86/kernel/process.c:733) 
[519142.832948] default_idle_call (arch/x86/include/asm/paravirt.h:699 kernel/sched/idle.c:117) 
[519142.832952] do_idle (kernel/sched/idle.c:192 kernel/sched/idle.c:303) 
[519142.832974] cpu_startup_entry (kernel/sched/idle.c:399 (discriminator 1)) 
[519142.832978] rest_init (??:?) 
[519142.832989] arch_call_rest_init+0xa/0x10 
[519142.833032] start_kernel (init/main.c:1139) 
[519142.833036] secondary_startup_64_no_verify (/build/linux-23EX6j/linux-5.18.2/arch/x86/kernel/head_64.S:303) 
[519142.833061]  </TASK>
[519142.833066] ---[ end trace 0000000000000000 ]---
[519142.866383] ------------[ cut here ]------------
[519142.866389] WARNING: CPU: 0 PID: 0 at net/mptcp/protocol.c:696 __mptcp_move_skbs_from_subflow (net/mptcp/protocol.c:696 (discriminator 1)) 
[519142.866400] Modules linked in: wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel curve25519_x86_64 libcurve25519_generic libchacha intel_rapl_msr intel_rapl_common iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipt_REJECT nf_reject_ipv4 aesni_intel xt_hl xt_tcpudp crypto_simd cryptd xt_state xt_conntrack evdev nft_compat serio_raw sg nf_tables qemu_fw_cfg virtio_balloon button nfnetlink sch_fq tcp_bbr nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c fuse configfs ip_tables x_tables autofs4 virtio_net sr_mod cdrom net_failover failover virtio_blk ata_generic ata_piix libata crc32c_intel virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring scsi_mod scsi_common
[519142.866457] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.18.0-0.bpo.1-cloud-amd64 #1  Debian 5.18.2-1~bpo11+1
[519142.866461] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[519142.866463] RIP: 0010:__mptcp_move_skbs_from_subflow (net/mptcp/protocol.c:696 (discriminator 1)) 
./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory
[519142.866469] RSP: 0018:ffffa8c8c0003a28 EFLAGS: 00010282
[519142.866471] RAX: 0000000000000557 RBX: ffff914683270000 RCX: 00000000000003d9
[519142.866473] RDX: 00000000d1fb83dd RSI: ffff914683270000 RDI: 0000000000289718
[519142.866474] RBP: ffff914683e86000 R08: 00000000fffffaa9 R09: ffff914683e86000
[519142.866475] R10: 0000000000144b8c R11: 0000000000000000 R12: 00000000d1fb8934
[519142.866477] R13: ffff9146b39d2000 R14: ffff9146b39d2000 R15: ffff9146b6051b00
[519142.866479] FS:  0000000000000000(0000) GS:ffff9146bec00000(0000) knlGS:0000000000000000
[519142.866480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[519142.866482] CR2: 00007f7d86986b80 CR3: 0000000032370002 CR4: 00000000001706b0
[519142.866486] Call Trace:
[519142.866490]  <IRQ>
[519142.866495] mptcp_data_ready (net/mptcp/protocol.c:767 net/mptcp/protocol.c:811) 
[519142.866499] tcp_data_queue (net/ipv4/tcp_input.c:5062) 
[519142.866506] tcp_rcv_established (include/linux/skbuff.h:2027 include/net/tcp.h:1853 include/net/tcp.h:1926 net/ipv4/tcp_input.c:5482 net/ipv4/tcp_input.c:5981) 
[519142.866509] tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1706) 
[519142.866532] tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2074) 
[519142.866535] ? nf_hook_slow (include/linux/netfilter.h:142 net/netfilter/core.c:620) 
[519142.866550] ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) 
[519142.866555] ip_local_deliver_finish (include/linux/rcupdate.h:726 net/ipv4/ip_input.c:234) 
[519142.866558] ip_sublist_rcv_finish (include/net/dst.h:461 net/ipv4/ip_input.c:567) 
[519142.866562] ip_sublist_rcv (net/ipv4/ip_input.c:626) 
[519142.866565] ? ip_sublist_rcv (net/ipv4/ip_input.c:424) 
[519142.866568] ip_list_rcv (net/ipv4/ip_input.c:661) 
[519142.866571] __netif_receive_skb_list_core (net/core/dev.c:5448 net/core/dev.c:5496) 
[519142.866577] netif_receive_skb_list_internal (net/core/dev.c:5550 net/core/dev.c:5639) 
[519142.866581] napi_complete_done (include/linux/list.h:37 include/net/gro.h:431 include/net/gro.h:426 net/core/dev.c:5980) 
[519142.866584] virtnet_poll (drivers/net/virtio_net.c:366 drivers/net/virtio_net.c:1631) virtio_net
[519142.866591] __napi_poll (net/core/dev.c:6413) 
[519142.866594] net_rx_action (net/core/dev.c:6482 net/core/dev.c:6567) 
[519142.866597] ? __napi_schedule (arch/x86/include/asm/irqflags.h:137 net/core/dev.c:5890) 
[519142.866600] __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:559) 
[519142.866606] __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:637) 
[519142.866610] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
[519142.866616]  </IRQ>
[519142.866616]  <TASK>
[519142.866617] asm_common_interrupt (/build/linux-23EX6j/linux-5.18.2/arch/x86/include/asm/idtentry.h:636) 
[519142.866620] RIP: 0010:native_safe_halt (arch/x86/include/asm/irqflags.h:52) 
./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory
[519142.866626] RSP: 0018:ffffffffb3e03e90 EFLAGS: 00000282
[519142.866628] RAX: ffffffffb313a360 RBX: ffffffffb3e19940 RCX: ffff9146bec31780
[519142.866629] RDX: ffffffffb3e19940 RSI: ffffffffb390d946 RDI: ffffffffb38e1840
[519142.866630] RBP: 0000000000000000 R08: 0001d8285e0a773c R09: 0000000000000000
[519142.866631] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[519142.866632] R13: 0000000000000000 R14: ffffffffb3e19120 R15: 0000000000000000
[519142.866634] ? mwait_idle (arch/x86/kernel/process.c:732) 
[519142.866638] default_idle (arch/x86/include/asm/paravirt.h:173 arch/x86/kernel/process.c:733) 
[519142.866641] default_idle_call (arch/x86/include/asm/paravirt.h:699 kernel/sched/idle.c:117) 
[519142.866644] do_idle (kernel/sched/idle.c:192 kernel/sched/idle.c:303) 
[519142.866650] cpu_startup_entry (kernel/sched/idle.c:399 (discriminator 1)) 
[519142.866653] rest_init (??:?) 
[519142.866657] arch_call_rest_init+0xa/0x10 
[519142.866662] start_kernel (init/main.c:1139) 
[519142.866665] secondary_startup_64_no_verify (/build/linux-23EX6j/linux-5.18.2/arch/x86/kernel/head_64.S:303) 
[519142.866671]  </TASK>
[519142.866671] ---[ end trace 0000000000000000 ]---
[519143.949321] ------------[ cut here ]------------
[519143.949346] WARNING: CPU: 0 PID: 0 at net/mptcp/subflow.c:858 mptcp_subflow_data_available (net/mptcp/subflow.c:858 net/mptcp/subflow.c:1069 net/mptcp/subflow.c:1176 net/mptcp/subflow.c:1274) 
[519143.949374] Modules linked in: wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel curve25519_x86_64 libcurve25519_generic libchacha intel_rapl_msr intel_rapl_common iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipt_REJECT nf_reject_ipv4 aesni_intel xt_hl xt_tcpudp crypto_simd cryptd xt_state xt_conntrack evdev nft_compat serio_raw sg nf_tables qemu_fw_cfg virtio_balloon button nfnetlink sch_fq tcp_bbr nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c fuse configfs ip_tables x_tables autofs4 virtio_net sr_mod cdrom net_failover failover virtio_blk ata_generic ata_piix libata crc32c_intel virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring scsi_mod scsi_common
[519143.949533] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.18.0-0.bpo.1-cloud-amd64 #1  Debian 5.18.2-1~bpo11+1
[519143.949539] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[519143.949543] RIP: 0010:mptcp_subflow_data_available (net/mptcp/subflow.c:858 net/mptcp/subflow.c:1069 net/mptcp/subflow.c:1176 net/mptcp/subflow.c:1274) 
./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory
[519143.949577] RSP: 0018:ffffa8c8c0003a48 EFLAGS: 00010286
[519143.949582] RAX: 00000000d1fc745e RBX: ffff9146b6051b00 RCX: 00000000d1fb83dd
[519143.949586] RDX: 0000000000000557 RSI: 00000000ffff0f7f RDI: f4d25dc7361a5f18
[519143.949589] RBP: ffff914683270000 R08: ffff914685b53788 R09: ffff9146b6051b00
[519143.949592] R10: 0000000000000000 R11: ffff9146b39d2000 R12: ffff914683e86400
[519143.949595] R13: 0000000000000000 R14: ffff914683e86400 R15: ffff9146832700d8
[519143.949599] FS:  0000000000000000(0000) GS:ffff9146bec00000(0000) knlGS:0000000000000000
[519143.949602] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[519143.949605] CR2: 00007f7d8697db60 CR3: 00000000022ae005 CR4: 00000000001706b0
[519143.949613] Call Trace:
[519143.949636]  <IRQ>
[519143.949644] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178) 
[519143.949666] subflow_data_ready (net/mptcp/subflow.c:1357 (discriminator 11)) 
[519143.949672] tcp_data_queue (net/ipv4/tcp_input.c:5062) 
[519143.949681] tcp_rcv_established (include/linux/skbuff.h:2027 include/net/tcp.h:1853 include/net/tcp.h:1926 net/ipv4/tcp_input.c:5482 net/ipv4/tcp_input.c:5981) 
[519143.949687] tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1706) 
[519143.949694] tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2074) 
[519143.949700] ? nf_hook_slow (include/linux/netfilter.h:142 net/netfilter/core.c:620) 
[519143.949707] ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1)) 
[519143.949714] ip_local_deliver_finish (include/linux/rcupdate.h:726 net/ipv4/ip_input.c:234) 
[519143.949720] ip_sublist_rcv_finish (include/net/dst.h:461 net/ipv4/ip_input.c:567) 
[519143.949725] ip_sublist_rcv (net/ipv4/ip_input.c:626) 
[519143.949731] ? ip_sublist_rcv (net/ipv4/ip_input.c:424) 
[519143.949737] ip_list_rcv (net/ipv4/ip_input.c:661) 
[519143.949743] __netif_receive_skb_list_core (net/core/dev.c:5448 net/core/dev.c:5496) 
[519143.949752] netif_receive_skb_list_internal (net/core/dev.c:5550 net/core/dev.c:5639) 
[519143.949758] napi_complete_done (include/linux/list.h:37 include/net/gro.h:431 include/net/gro.h:426 net/core/dev.c:5980) 
[519143.949765] virtnet_poll (drivers/net/virtio_net.c:366 drivers/net/virtio_net.c:1631) virtio_net
[519143.949801] __napi_poll (net/core/dev.c:6413) 
[519143.949808] net_rx_action (net/core/dev.c:6482 net/core/dev.c:6567) 
[519143.949813] ? __napi_schedule (arch/x86/include/asm/irqflags.h:137 net/core/dev.c:5890) 
[519143.949819] __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:559) 
[519143.949826] __irq_exit_rcu (kernel/softirq.c:432 kernel/softirq.c:637) 
[519143.949833] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
[519143.949841]  </IRQ>
[519143.949844]  <TASK>
[519143.949847] asm_common_interrupt (/build/linux-23EX6j/linux-5.18.2/arch/x86/include/asm/idtentry.h:636) 
[519143.949852] RIP: 0010:native_safe_halt (arch/x86/include/asm/irqflags.h:52) 
./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory
[519143.949864] RSP: 0018:ffffffffb3e03e90 EFLAGS: 00000282
[519143.949869] RAX: ffffffffb313a360 RBX: ffffffffb3e19940 RCX: 7fffffffffffffff
[519143.949872] RDX: ffffffffb3e19940 RSI: ffffffffb390d946 RDI: ffffffffb38e1840
[519143.949875] RBP: 0000000000000000 R08: 000000cd42e4dffb R09: 0000000000000000
[519143.949886] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[519143.949908] R13: 0000000000000000 R14: ffffffffb3e19120 R15: 0000000000000000
[519143.949914] ? mwait_idle (arch/x86/kernel/process.c:732) 
[519143.949936] default_idle (arch/x86/include/asm/paravirt.h:173 arch/x86/kernel/process.c:733) 
[519143.949942] default_idle_call (arch/x86/include/asm/paravirt.h:699 kernel/sched/idle.c:117) 
[519143.949948] do_idle (kernel/sched/idle.c:192 kernel/sched/idle.c:303) 
[519143.949958] cpu_startup_entry (kernel/sched/idle.c:399 (discriminator 1)) 
[519143.949965] rest_init (??:?) 
[519143.949972] arch_call_rest_init+0xa/0x10 
[519143.949981] start_kernel (init/main.c:1139) 
[519143.949987] secondary_startup_64_no_verify (/build/linux-23EX6j/linux-5.18.2/arch/x86/kernel/head_64.S:303) 
[519143.949996]  </TASK>
[519143.950000] ---[ end trace 0000000000000000 ]---
matttbe commented 2 years ago

Hi @wloot

Sorry for the delay.

It is not very clear what is the issue.

By chance, if you already re-build the packages, do you mind testing on top of our export branch (or netdev) to use the last dev version?

./decode_stacktrace.sh: line 211: ./decodecode: No such file or directory

Apparently, you also need to download decodecode from the kernel source code.

pabeni commented 2 years ago

Hello,

I found out that the dmesg warnings on server side are very similar. I did the script on a MPTCP server with debian packaged kernel(remind not latest one, kernel version is 5.18.2), hope these able to help:

[519142.831986] ------------[ cut here ]------------
[519142.832051] WARNING: CPU: 0 PID: 0 at net/mptcp/protocol.c:693 __mptcp_move_skbs_from_subflow (net/mptcp/protocol

The above warning, and the others reported in the same commet are caused by a DSS mapping not covering the whole, paired skb. That should never happen and in fact we haven't observed this kind of bug since a lot of time (while is was present in the very early MPTCP implementation).

After a quick check I don't see any relevant change between 5.18 and the current devel branch, so I would guess the issue should be observable even in the new kernel, but only the testing would tell for sure.

I'm wondering if TCP fallback and/or preemption are involved.

@wloot: could you please additionally share:

matttbe commented 1 year ago

Hello @wloot

Do you have any updates on the previous questions asked above?

If it is hard to reproduce, it is maybe best to close the ticket and re-open it if we get any new input.

matttbe commented 1 year ago

I suggest to close this ticket.

Feel free to re-open it with the requested info (see above): it would be great to fix it if it has not been fixed in between.