free5gc / gtp5g

GTP-U Linux Kernel Module
GNU General Public License v2.0

Kernel panic when running 10x ran.sh in parallel #81

Open linouxis9 opened 1 year ago

linouxis9 commented 1 year ago

Hi all,

I'm getting a kernel panic when running 10 instances of this script in parallel (with 10 UEs, 10 different gNB IPs, and 10 different GTP-U interfaces): https://github.com/free5gc/libgtp5gnl/blob/master/script/ran.sh. Instead of the libgtp5gnl binaries, I'm using the Go binaries from https://github.com/free5gc/go-gtp5gnl, with a tweak here: https://github.com/free5gc/go-gtp5gnl/blob/4f36b49ab7f7f90632b0981aa832121438e5a243/cmd/gogtp5g-link/main.go#L72 so that each interface binds only to a single specified IP address instead of binding to all IP addresses.

Adding a short sleep of 20 ms before launching the 2nd, 3rd, ... script works around the issue.
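
A minimal sketch of that workaround, for anyone who wants to reproduce it. This is a hypothetical launcher, not part of gtp5g or go-gtp5gnl: the function name `launch_staggered` and its parameters are made up for illustration, and 20 ms is only the delay that happened to avoid the panic in this setup.

```python
import subprocess
import time


def launch_staggered(commands, delay_s=0.020, spawn=subprocess.Popen, sleep=time.sleep):
    """Start every command without waiting for it to finish,
    pausing delay_s seconds between consecutive starts.

    spawn and sleep are injectable so the helper can be exercised
    without actually running anything (hypothetical design choice).
    """
    procs = []
    for index, cmd in enumerate(commands):
        if index > 0:
            # No delay before the first instance, delay_s before each later one.
            sleep(delay_s)
        procs.append(spawn(list(cmd)))
    return procs
```

Calling it with ten command lines (for example ten `ran.sh` invocations, each with its own UE, gNB IP, and interface arguments, which are not shown here) and then calling `wait()` on each returned process gives the staggered run. Passing `delay_s=0` gives the back-to-back launch that triggers the panic.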

According to the kernel panic, the issue seems to lie inside gtp5g_genl_add_pdr; you'll find the kernel panic logs at the end of this post. If I can be of any help, don't hesitate to ask. Thank you! Also, a quick question: is it possible to have multiple gtp5g interfaces on the same IP address/port? Thanks!

[   57.441597] BUG: kernel NULL pointer dereference, address: 0000000000000080
[   57.442989] #PF: supervisor write access in kernel mode
[   57.443983] #PF: error_code(0x0002) - not-present page
[   57.444965] PGD 802150067 P4D 802150067 PUD 819cfd067 PMD 0
[   57.446041] Oops: 0002 [#1] SMP NOPTI
[   57.446749] CPU: 20 PID: 2009 Comm: app Tainted: G           OE     5.4.0-152-generic #169-Ubuntu
[   57.448424] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[   57.450438] RIP: 0010:gtp5g_genl_add_pdr+0x185/0x270 [gtp5g]
[   57.451513] Code: 01 00 00 be 20 0b 00 00 e8 88 0b f9 e3 49 89 c5 48 85 c0 0f 84 e2 00 00 00 49 8b 54 24 10 b8 01 00 00 00 48 8d ba 80 00 00 00 <f0> 0f c1 82 80 00 00 00 85 c0 74 69 78 5b 83 c0 01 78 56 49 8b 44
[   57.455015] RSP: 0018:ffffae83c797ba10 EFLAGS: 00010286
[   57.456013] RAX: 0000000000000001 RBX: ffffae83c797baa8 RCX: 0000000000000000
[   57.457366] RDX: 0000000000000000 RSI: ffffffffc090cc58 RDI: 0000000000000080
[   57.458716] RBP: ffffae83c797ba50 R08: ffff9d421fa35140 R09: ffff9d3a1f406d80
[   57.460064] R10: ffff9d421ab92e00 R11: 0000000000000011 R12: ffff9d3a0360d8c0
[   57.461414] R13: ffff9d421ab92e00 R14: 0000000000000000 R15: ffff9d421ab90c00
[   57.462763] FS:  00007f9d5cff9700(0000) GS:ffff9d421fa00000(0000) knlGS:0000000000000000
[   57.464290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.465381] CR2: 0000000000000080 CR3: 0000000802254005 CR4: 00000000007606e0
[   57.466760] PKRU: 55555554
[   57.467292] Call Trace:
[   57.467779]  genl_family_rcv_msg+0x1b9/0x470
[   57.468601]  genl_rcv_msg+0x4c/0xa0
[   57.469278]  ? _cond_resched+0x19/0x30
[   57.470002]  ? genl_family_rcv_msg+0x470/0x470
[   57.470853]  netlink_rcv_skb+0x50/0x120
[   57.471588]  genl_rcv+0x29/0x40
[   57.472196]  netlink_unicast+0x1a8/0x250
[   57.472949]  netlink_sendmsg+0x240/0x480
[   57.473706]  ? __check_object_size+0x4d/0x150
[   57.474541]  sock_sendmsg+0x65/0x70
[   57.475215]  ____sys_sendmsg+0x212/0x280
[   57.476660]  ___sys_sendmsg+0x88/0xd0
[   57.478060]  ? iput+0x148/0x210
[   57.479356]  ? _cond_resched+0x19/0x30
[   57.480738]  ? get_max_files+0x20/0x20
[   57.482095]  __sys_sendmsg+0x5c/0xa0
[   57.483413]  __x64_sys_sendmsg+0x1f/0x30
[   57.484786]  do_syscall_64+0x57/0x190
[   57.486107]  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[   57.487677] RIP: 0033:0x40436e
[   57.488858] Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[   57.494126] RSP: 002b:000000c000199740 EFLAGS: 00000206 ORIG_RAX: 000000000000002e
[   57.496163] RAX: ffffffffffffffda RBX: 0000000000000078 RCX: 000000000040436e
[   57.498091] RDX: 0000000000000000 RSI: 000000c000199870 RDI: 0000000000000078
[   57.500011] RBP: 000000c000199780 R08: 0000000000000000 R09: 0000000000000000
[   57.501921] R10: 0000000000000000 R11: 0000000000000206 R12: 000000c000199938
[   57.503831] R13: 0000000000000000 R14: 000000c0005816c0 R15: 000000c000067800
[   57.505731] Modules linked in: vrf sctp 8021q garp mrp stp llc vmw_vsock_vmci_transport vsock dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua binfmt_misc intel_rapl_msr vmw_balloon intel_rapl_common isst_if_mbox_msr isst_if_common joydev input_leds nfit rapl serio_raw vmw_vmci mac_hid sch_fq_codel gtp5g(OE) udp_tunnel msr ramoops reed_solomon efi_pstore ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid vmwgfx ttm crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops crypto_simd mptspi cryptd mptscsih glue_helper bnxt_en psmouse drm ahci mptbase vmxnet3 i2c_piix4 libahci scsi_transport_spi pata_acpi
[   57.524746] CR2: 0000000000000080
[   57.526104] ---[ end trace d4fa568a26f72f9a ]---
[   57.527702] RIP: 0010:gtp5g_genl_add_pdr+0x185/0x270 [gtp5g]
[   57.529497] Code: 01 00 00 be 20 0b 00 00 e8 88 0b f9 e3 49 89 c5 48 85 c0 0f 84 e2 00 00 00 49 8b 54 24 10 b8 01 00 00 00 48 8d ba 80 00 00 00 <f0> 0f c1 82 80 00 00 00 85 c0 74 69 78 5b 83 c0 01 78 56 49 8b 44
[   57.535267] RSP: 0018:ffffae83c797ba10 EFLAGS: 00010286
[   57.537051] RAX: 0000000000000001 RBX: ffffae83c797baa8 RCX: 0000000000000000
[   57.539202] RDX: 0000000000000000 RSI: ffffffffc090cc58 RDI: 0000000000000080
[   57.541344] RBP: ffffae83c797ba50 R08: ffff9d421fa35140 R09: ffff9d3a1f406d80
[   57.543531] R10: ffff9d421ab92e00 R11: 0000000000000011 R12: ffff9d3a0360d8c0
[   57.545695] R13: ffff9d421ab92e00 R14: 0000000000000000 R15: ffff9d421ab90c00
[   57.547854] FS:  00007f9d5cff9700(0000) GS:ffff9d421fa00000(0000) knlGS:0000000000000000
[   57.550205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.552138] CR2: 0000000000000080 CR3: 0000000802254005 CR4: 00000000007606e0
[   57.554361] PKRU: 55555554
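
A note on reading the trace above, offered as a sketch rather than a diagnosis. The bytes at the fault marker in the Code line (`<f0> 0f c1 82 80 00 00 00`) decode to `lock xadd %eax,0x80(%rdx)`, and RDX is 0, so the write lands at address 0x80, which is exactly what CR2 reports. In other words, some pointer was NULL and the code tried to atomically add to a field at offset 0x80 of it. That pattern would be consistent with a reference-count increment on a pointer that has not been set yet, but which pointer that is cannot be told from the log alone. The helper below is hypothetical: it just pulls the relevant fields out of a few lines copied from the log and checks that arithmetic.

```python
import re

# A few lines copied from the oops above (timestamps removed).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000080
RIP: 0010:gtp5g_genl_add_pdr+0x185/0x270 [gtp5g]
RAX: 0000000000000001 RBX: ffffae83c797baa8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffffc090cc58 RDI: 0000000000000080
CR2: 0000000000000080 CR3: 0000000802254005 CR4: 00000000007606e0
"""

RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+) \[(\w+)\]")
REG_RE = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def parse_oops(text):
    """Extract the faulting symbol and the register values from oops text."""
    rip = RIP_RE.search(text)
    if rip is None:
        raise ValueError("no kernel RIP line found")
    return {
        "function": rip.group(1),
        "offset": int(rip.group(2), 16),   # offset of the fault inside the function
        "size": int(rip.group(3), 16),     # total size of the function
        "module": rip.group(4),
        "regs": {name: int(value, 16) for name, value in REG_RE.findall(text)},
    }


def fault_matches(regs, base="RDX", displacement=0x80):
    """True if CR2 equals base register + displacement.

    The defaults come from decoding the faulting instruction by hand
    (lock xadd %eax,0x80(%rdx)); they are specific to this one trace.
    """
    return regs["CR2"] == regs[base] + displacement
```

Here `parse_oops(OOPS)` reports `gtp5g_genl_add_pdr` at offset 0x185 of 0x270 in module `gtp5g`, and `fault_matches` returns `True` for the parsed registers because CR2 (0x80) equals RDX (0) plus 0x80.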

ianchen0119 commented 1 year ago

Hi @linouxis9

Could you please provide more information? For example:

linouxis9 commented 1 year ago

Hi @ianchen0119,

Thank you for your message! I was indeed using go-gtp5gnl to trigger the kernel panic, not libgtp5gnl. I used the script https://github.com/free5gc/libgtp5gnl/blob/master/script/ran.sh from libgtp5gnl, but replaced the gtp5g-link/gtp5g-tunnel binaries it calls with the gtp5g-link/gtp5g-tunnel Go binaries from go-gtp5gnl: https://github.com/free5gc/go-gtp5gnl/tree/main/cmd.

I ran my tests on gtp5g commit 3f425930aa6e972f3f4c5f78b7bdaf0518574101.

Thanks a lot!