Mellanox / libvma

Linux user space library for network socket acceleration based on RDMA compatible network adaptors
https://www.mellanox.com/products/software/accelerator-software/vma?mtag=vma

verify_qp_creation() QP creation failed on ConnectX-6 IPoIB interface #990

Closed Honggang-LI closed 2 years ago

Honggang-LI commented 2 years ago

Issue type

Configuration:

5a3aead0adeb13f891bbe134d51c6698654bd819

    $ ibstat
    CA 'mlx5_0'
        CA type: MT4124
        Number of ports: 1
        Firmware version: 20.32.1010
        Hardware version: 0
        Node GUID: 0x112233447766770d
        System image GUID: 0x0c42a10300ebe25a
        Port 1:
            State: Active
            Physical state: LinkUp
            Rate: 200
            Base lid: 15
            LMC: 0
            SM lid: 1
            Capability mask: 0x2659ec48
            Port GUID: 0x112233447766770e
            Link layer: InfiniBand


## Actual behavior:

    Pid: 81333 Tid: 81333 VMA DEBUG: utils:776:get_local_ll_addr() ifname=ib0 un-aliased-ifname=3 l2_addr_path=ib0 l2-addr=/sys/class/net/ib0/address (addr-bytes_len=20)
    Pid: 81333 Tid: 81333 VMA DEBUG: utils:785:get_local_ll_addr() found IB UC address 0000:0909:FE80:0000:0000:0000:1122:3344:7766:770E for interface ib0
    Pid: 81333 Tid: 81333 VMA DEBUG: L2_addr[0x7ffe587c1950]:101:extract_qpn() qpn = 0x909
    Pid: 81333 Tid: 81333 VMA FINE: ENTER: ioctl(fd=7, request=-1072162047)
    Pid: 81333 Tid: 81333 VMA FINE: EXIT: ioctl() returned with 0
    Pid: 81333 Tid: 81333 VMA FINE: ENTER: ioctl(fd=7, request=-1072162047)
    Pid: 81333 Tid: 81333 VMA FINE: EXIT: ioctl() returned with 0
    Pid: 81333 Tid: 81333 VMA DEBUG: ndv[0xe5f510]:1737:verify_qp_creation() QP creation failed on interface ib0 (errno=22 Invalid argument), Traffic will not be offloaded
    Pid: 81333 Tid: 81333 VMA DEBUG: utils:1143:validate_user_has_cap_net_raw_privliges() successfully got cap_net_raw permissions. Effective=FFFFFFFF Permitted=FFFFFFFF
    Pid: 81333 Tid: 81333 VMA WARNING:
    Pid: 81333 Tid: 81333 VMA WARNING: Interface ib0 will not be offloaded.
    Pid: 81333 Tid: 81333 VMA WARNING: VMA was not able to create QP for this device (errno = 22).
    Pid: 81333 Tid: 81333 VMA WARNING:
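
The `qpn = 0x909` value in the log can be reproduced by hand. Below is a minimal sketch, assuming the standard 20-byte IPoIB link-layer address layout (one flags byte, a 24-bit QPN, then the 16-byte port GID) and the byte-wise colon-separated form found in `/sys/class/net/ib0/address`. The helper names `parse_ipoib_hw_addr` and `ipoib_extract_qpn` are hypothetical and are not libvma's own functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPOIB_HW_ADDR_LEN 20 /* 1 flags byte + 3 QPN bytes + 16 GID bytes */

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse "xx:xx:...:xx" (20 bytes, optional trailing newline) into raw bytes.
 * Returns 0 on success and -1 on a malformed or short string. */
static int parse_ipoib_hw_addr(const char *text, uint8_t out[IPOIB_HW_ADDR_LEN])
{
    assert(text != NULL && out != NULL);
    for (int i = 0; i < IPOIB_HW_ADDR_LEN; i++) {
        int hi = hex_val(text[0]);
        if (hi < 0) return -1;
        int lo = hex_val(text[1]);
        if (lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
        text += 2;
        if (i < IPOIB_HW_ADDR_LEN - 1) {
            if (*text != ':') return -1;
            text++;
        }
    }
    return (*text == '\0' || *text == '\n') ? 0 : -1;
}

/* The QPN occupies bytes 1..3 of the address, in network byte order. */
static uint32_t ipoib_extract_qpn(const uint8_t addr[IPOIB_HW_ADDR_LEN])
{
    return ((uint32_t)addr[1] << 16) | ((uint32_t)addr[2] << 8) | addr[3];
}
```

The VMA log prints the same 20 bytes in two-byte groups, so `0000:0909:FE80:...` corresponds to the bytes `00 00 09 09 fe 80 ...`.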


## Expected behavior:
libvma works over ConnectX-6 IPoIB, and traffic on ib0 is offloaded.

## Steps to reproduce:
```shell
$ ./configure --enable-opt-log=none --prefix=/opt/upstream/libvma
$ make
$ VMA_TRACELEVEL=6 LD_PRELOAD=/root/upstream/libvma/src/vma/.libs/libvma.so /tmp/sockperf sr
```

igor-ivanov commented 2 years ago

Hello, @Honggang-LI. IPoIB is not supported in libvma now, so it looks like a valid warning.

Honggang-LI commented 2 years ago

> Hello, @Honggang-LI. IPoIB is not supported in libvma now, so it looks like a valid warning.

Do you mean that ConnectX-6 RoCE will work?

QP creation failed because ibv_create_qp_ex was called with source_qpn. How do I call ibv_create_qp_ex with source_qpn?

Thanks

Honggang-LI commented 2 years ago

> Hello, @Honggang-LI. IPoIB is not supported in libvma now.

Why was IPoIB disabled?

igor-ivanov commented 2 years ago

> QP creation failed because ibv_create_qp_ex was called with source_qpn. How do I call ibv_create_qp_ex with source_qpn?
>
> Thanks

https://github.com/Mellanox/libvma/blob/48ec52f1b70ea5754b2719c58d75aa2d259d04b4/src/vma/ib/base/verbs_extra.h#L122-L134

https://docs.nvidia.com/networking/display/VMAv952/Changes+and+New+Features

IPoIB is temporarily unavailable when working with MLNX_OFED v5.1 and above.

Honggang-LI commented 2 years ago

> https://docs.nvidia.com/networking/display/VMAv952/Changes+and+New+Features
>
> IPoIB is temporarily unavailable when working with MLNX_OFED v5.1 and above.

It sounds like the ConnectX-6 kernel-space driver does not support source_qpn. I tried both the RHEL 8.2 inbox RDMA stack and RHEL 8.2 with MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.2-x86_64.iso. The ConnectX-6 device works as expected when tested with libibverbs-utils, but libvma over ConnectX-6 IPoIB has never worked.

DanielLibenson commented 2 years ago

Hi Honggang,

IPoIB has not worked since VMA v9.1.1 (OFED v5.1). The legacy IPoIB implementation was based on the mlnx libraries (experimental verbs), while the current implementation is based on rdma-core. Due to a driver issue, IPoIB does not work with rdma-core (user space only).

Regarding the VMA legacy implementation with the mlnx libraries:

- MLNX_OFED-4.9 and below: supported by default.
- MLNX_OFED-5.0: supported, but the --mlnx-libs option must be passed to the installation script (rdma-core is the default).
- MLNX_OFED-5.1 and above: not supported.

Daniel
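
The support matrix in this comment can be written down as a small lookup. This is a sketch only, and the enum and function names are hypothetical. It encodes the three MLNX_OFED version ranges listed above and nothing else.

```c
#include <assert.h>

/* Availability of the legacy (mlnx libraries) IPoIB path, per the comment above. */
enum legacy_ipoib_support {
    LEGACY_IPOIB_DEFAULT,         /* MLNX_OFED 4.9 and below */
    LEGACY_IPOIB_NEEDS_MLNX_LIBS, /* MLNX_OFED 5.0: install with --mlnx-libs */
    LEGACY_IPOIB_UNSUPPORTED      /* MLNX_OFED 5.1 and above */
};

/* Classify an MLNX_OFED release given its major and minor version numbers. */
static enum legacy_ipoib_support legacy_ipoib_support_for(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    if (major < 5)
        return LEGACY_IPOIB_DEFAULT;
    if (major == 5 && minor == 0)
        return LEGACY_IPOIB_NEEDS_MLNX_LIBS;
    return LEGACY_IPOIB_UNSUPPORTED;
}
```

For the MLNX_OFED_LINUX-5.0-2.1.8.0 installation mentioned earlier in the thread, this returns `LEGACY_IPOIB_NEEDS_MLNX_LIBS`.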

igor-ivanov commented 2 years ago

@Honggang-LI can it be closed?

Honggang-LI commented 2 years ago

> @Honggang-LI can it be closed?

Thanks for the help. Closing it.