Xilinx-CNS / onload

OpenOnload high performance user-level network stack

trader_onload_ds_efvi ERROR #120

Open augustjjlin opened 1 year ago

augustjjlin commented 1 year ago

I ran the exchange program on server 1 with the command: onload -p latency-best ./exchange "enp1f1"

I then ran the trader_onload_ds_efvi program on another server with the command: onload -p latency-best ./trader_onload_ds_efvi "ens1f0" "10.122.116.100"

Server 1 (exchange) shows the following messages:

oo:exchange[439119]: Using Onload 8.0.0.34 [7]
oo:exchange[439119]: Copyright 2019-2022 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks
Waiting for client to connect
Accepted client connection
Starting event loop
Client disconnected

Server 2 (trader_onload_ds_efvi) shows the following messages:

oo:trader_onload_ds[792275]: Using Onload 8.0.0.34 [1]
oo:trader_onload_ds[792275]: Copyright 2019-2022 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks
Killed

This means trader_onload_ds_efvi was killed immediately after connecting to exchange.

I checked the kernel log with dmesg, which shows:

[Thu Dec  8 16:40:25 2022] BUG: kernel NULL pointer dereference, address: 0000000000000060
[Thu Dec  8 16:40:25 2022] #PF: supervisor read access in kernel mode
[Thu Dec  8 16:40:25 2022] #PF: error_code(0x0000) - not-present page
[Thu Dec  8 16:40:25 2022] PGD e30e64067 P4D e30e64067 PUD 0
[Thu Dec  8 16:40:25 2022] Oops: 0000 [#8] SMP NOPTI
[Thu Dec  8 16:40:25 2022] CPU: 0 PID: 782472 Comm: trader_onload_d Tainted: G      D W  OE     5.4.209 #1
[Thu Dec  8 16:40:25 2022] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 05/24/2021
[Thu Dec  8 16:40:25 2022] RIP: 0010:efrm_client_get_by_dev+0x13a/0x170 [sfc_resource]
[Thu Dec  8 16:40:25 2022] Code: 41 5d 41 5e 41 5f 5d c3 48 8b 3d 89 cb 00 00 48 81 c7 00 02 00 00 e8 75 78 a8 f1 4c 89 e7 e8 8d 39 23 f1 49 8b 85 58 05 00 00 <48> 81 78 60 00 bc d5 b2 74 10 5b b8 ed ff ff ff 41 5c 41 5d 41 5e
[Thu Dec  8 16:40:25 2022] RSP: 0018:ffffbf53c0dffcc0 EFLAGS: 00010202
[Thu Dec  8 16:40:25 2022] RAX: 0000000000000000 RBX: ffffbf53c0dffd28 RCX: 000000000177c044
[Thu Dec  8 16:40:25 2022] RDX: 000000000177c043 RSI: 602eb35664355ae7 RDI: 00000000000350c0
[Thu Dec  8 16:40:25 2022] RBP: ffffbf53c0dffce8 R08: ffff9eae3fc350c0 R09: ffff9eae39807480
[Thu Dec  8 16:40:25 2022] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9ea954b1da80
[Thu Dec  8 16:40:25 2022] R13: ffff9eae27c7a000 R14: 0000000000000000 R15: ffffffffc06788e0
[Thu Dec  8 16:40:25 2022] FS:  00007fa10cac1340(0000) GS:ffff9eae3fc00000(0000) knlGS:0000000000000000
[Thu Dec  8 16:40:25 2022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Dec  8 16:40:25 2022] CR2: 0000000000000060 CR3: 0000000ec1f7a002 CR4: 00000000007606f0
[Thu Dec  8 16:40:25 2022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Thu Dec  8 16:40:25 2022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Thu Dec  8 16:40:25 2022] PKRU: 55555554
[Thu Dec  8 16:40:25 2022] Call Trace:
[Thu Dec  8 16:40:25 2022]  efrm_client_get+0x55/0x80 [sfc_resource]
[Thu Dec  8 16:40:25 2022]  pd_rm_alloc+0x3f/0x1e0 [sfc_char]
[Thu Dec  8 16:40:25 2022]  efch_resource_alloc+0x126/0x2b0 [sfc_char]
[Thu Dec  8 16:40:25 2022]  ioctl_resource_alloc+0x45/0x90 [sfc_char]
[Thu Dec  8 16:40:25 2022]  ci_char_fop_ioctl+0x3f/0xb0 [sfc_char]
[Thu Dec  8 16:40:25 2022]  do_vfs_ioctl+0x407/0x670
[Thu Dec  8 16:40:25 2022]  ? __do_sys_newstat+0x51/0x80
[Thu Dec  8 16:40:25 2022]  ksys_ioctl+0x67/0x90
[Thu Dec  8 16:40:25 2022]  __x64_sys_ioctl+0x1a/0x20
[Thu Dec  8 16:40:25 2022]  do_syscall_64+0x64/0x210
[Thu Dec  8 16:40:25 2022]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Thu Dec  8 16:40:25 2022] RIP: 0033:0x7fa10cbd93ab
[Thu Dec  8 16:40:25 2022] Code: 0f 1e fa 48 8b 05 e5 7a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b5 7a 0d 00 f7 d8 64 89 01 48
[Thu Dec  8 16:40:25 2022] RSP: 002b:00007ffc27311468 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Thu Dec  8 16:40:25 2022] RAX: ffffffffffffffda RBX: 00007fa10cac0740 RCX: 00007fa10cbd93ab
[Thu Dec  8 16:40:25 2022] RDX: 00007ffc27311520 RSI: 0000000000000052 RDI: 0000000000000008
[Thu Dec  8 16:40:25 2022] RBP: 0000000000000000 R08: 0000000000000000 R09: 0034332e302e302e
[Thu Dec  8 16:40:25 2022] R10: 00007fa10cce2561 R11: 0000000000000246 R12: 0000000000000008
[Thu Dec  8 16:40:25 2022] R13: 00007fa10cac06c0 R14: 00007ffc27311520 R15: 0000000000000052
[Thu Dec  8 16:40:25 2022] Modules linked in: binfmt_misc onload(OE) sfc_char(OE) sfc_resource(OE) cmdlinepart sfc(OE) virtual_bus(OE) sfc_driverlink(OE) mdio mtd ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs 8021q garp mrp stp llc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common isst_if_common nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm ipmi_ssif rapl intel_cstate hpilo ipmi_si acpi_tad mei_me mei ioatdma acpi_power_meter mac_hid sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 crct10dif_pclmul drm_vram_helper crc32_pclmul ttm ghash_clmulni_intel drm_kms_helper uas aesni_intel syscopyarea igb sysfillrect sysimgblt fb_sys_fops crypto_simd cryptd glue_helper dca i2c_algo_bit usb_storage drm ahci lpc_ich libahci wmi [last unloaded: virtual_bus]
[Thu Dec  8 16:40:25 2022] CR2: 0000000000000060
[Thu Dec  8 16:40:25 2022] ---[ end trace 582e5f2da40475b4 ]---

Did I do something wrong, or is this a bug in the program? Thanks!

abower-amd commented 1 year ago

Hello @augustjjlin,

I see you are using an official release of OpenOnload. If you are also using a Xilinx NIC you are welcome to raise a query with the support team at support-nic@xilinx.com.

If you are using AF_XDP then I suggest building the latest Onload code in this repository if you would like to get help from users of this repository.

Thanks!
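
When forwarding a report like this one, it can help to boil the dmesg output down to the fields that identify the crash: the faulting address, the kernel RIP symbol and module, and the call trace. The following is a minimal sketch of that in Python, using a few lines copied from the log above. The names `summarize_oops`, `STAMP`, `FRAME` and `REG` are hypothetical helpers written for this illustration and are not part of Onload or the kernel; the regular expressions assume the `dmesg -T` line format shown in this issue.

```python
import re

# Hypothetical helper patterns, assuming the "[timestamp] text" dmesg -T format.
STAMP = re.compile(r"^\[[^\]]*\]\s*")
FRAME = re.compile(r"^([\w.]+)\+0x([0-9a-f]+)/0x[0-9a-f]+(?: \[(\w+)\])?$")
REG = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")

# A few lines copied from the dmesg output in this issue.
OOPS = """\
[Thu Dec  8 16:40:25 2022] BUG: kernel NULL pointer dereference, address: 0000000000000060
[Thu Dec  8 16:40:25 2022] RIP: 0010:efrm_client_get_by_dev+0x13a/0x170 [sfc_resource]
[Thu Dec  8 16:40:25 2022] RAX: 0000000000000000 RBX: ffffbf53c0dffd28 RCX: 000000000177c044
[Thu Dec  8 16:40:25 2022] Call Trace:
[Thu Dec  8 16:40:25 2022]  efrm_client_get+0x55/0x80 [sfc_resource]
[Thu Dec  8 16:40:25 2022]  pd_rm_alloc+0x3f/0x1e0 [sfc_char]
[Thu Dec  8 16:40:25 2022]  do_vfs_ioctl+0x407/0x670
[Thu Dec  8 16:40:25 2022]  ? __do_sys_newstat+0x51/0x80
[Thu Dec  8 16:40:25 2022] RIP: 0033:0x7fa10cbd93ab
"""


def summarize_oops(text):
    """Collect the fault address, kernel RIP, first register values and call trace."""
    summary = {"fault_address": None, "rip": None, "registers": {}, "call_trace": []}
    in_trace = False
    for raw in text.splitlines():
        line = STAMP.sub("", raw).strip()
        if line == "Call Trace:":
            in_trace = True
            continue
        if in_trace:
            m = FRAME.match(line)
            if m:
                summary["call_trace"].append((m.group(1), m.group(3)))
                continue
            if line.startswith("? "):
                continue  # frames marked "?" are unreliable guesses; skip them
            in_trace = False  # first non-frame line ends the trace
        if summary["fault_address"] is None:
            m = re.search(r"address: ([0-9a-f]+)$", line)
            if m:
                summary["fault_address"] = int(m.group(1), 16)
        if line.startswith("RIP: ") and summary["rip"] is None:
            # Kernel RIP looks like "RIP: 0010:symbol+0xoff/0xsize [module]".
            m = FRAME.match(line.split(":", 2)[2])
            if m:
                summary["rip"] = (m.group(1), int(m.group(2), 16), m.group(3))
        for name, value in REG.findall(line):
            # Keep the first (kernel-mode) value; later dumps are user-space state.
            summary["registers"].setdefault(name, int(value, 16))
    return summary


summary = summarize_oops(OOPS)
print(hex(summary["fault_address"]), summary["rip"], summary["call_trace"])
```

For this log, the summary points at `efrm_client_get_by_dev` in `sfc_resource`, reached through the `sfc_char` ioctl path. A fault address of 0x60 with RAX equal to zero is consistent with a read at offset 0x60 through a NULL pointer, although confirming which pointer would need the driver source or a disassembly of the module.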