Mellanox / nv_peer_memory


rpmbuild --rebuild /tmp/nvidia_peer_memory-1.2-0.src.rpm fails on CentOS7 #94

Closed by yug0slav 2 years ago

yug0slav commented 2 years ago

Building source rpm for nvidia_peer_memory...

Built: /tmp/nvidia_peer_memory-1.2-0.src.rpm

To install run on RPM based OS:

rpmbuild --rebuild /tmp/nvidia_peer_memory-1.2-0.src.rpm

# rpm -ivh <path to generated binary rpm file>

rpmbuild --rebuild /tmp/nvidia_peer_memory-1.2-0.src.rpm
Installing /tmp/nvidia_peer_memory-1.2-0.src.rpm
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.kaBNqq

RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.3dxLME (%build)

drossetti commented 2 years ago

Same as #95. Please note the new requirement:

Please note that to build correctly, a MLNX_OFED carrying the Peer-direct fix for the bug "Peer-direct patch may cause deadlock due to lock inversion" (tracked by the Internal Ref. #2696789) is required, for example MLNX_OFED 5.3-1.0.0.1.43.

yug0slav commented 2 years ago

I am not following... was the bug fixed in 5.3-1.0.0.1.43? I am on 5.4-1.0.3.0 attempting to build/install nvidia_peer_memory-1.2.

yug0slav commented 2 years ago

resolved in MLNX_OFED_LINUX-5.4-3.0.3.0
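
For anyone comparing their installed release against the ones mentioned in this thread, here is a minimal sketch that splits an MLNX_OFED version string into numeric fields. The helper name `parse_ofed_version` is hypothetical and not part of any Mellanox tool, and the input format is assumed from the strings quoted above. A higher version number does not by itself prove that the Peer-direct fix is present: 5.4-1.0.3.0 sorts after the 5.3 release named in the requirement, yet the build failed on it here.

```python
import re


def parse_ofed_version(text):
    """Return the numeric fields of an MLNX_OFED version string as a tuple of ints.

    Hypothetical helper: accepts strings such as "5.4-1.0.3.0" or
    "MLNX_OFED_LINUX-5.4-3.0.3.0" (format assumed from this thread).
    """
    # Grab the first run of digits joined by '.' or '-'.
    match = re.search(r"\d+(?:[.-]\d+)+", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


# The two releases reported in this thread.
failing = parse_ofed_version("5.4-1.0.3.0")
working = parse_ofed_version("MLNX_OFED_LINUX-5.4-3.0.3.0")

# Tuples compare field by field, so this only orders the releases;
# it says nothing about which fixes a given release carries.
print(failing, working, failing < working)
```

The tuple comparison is only a convenience for ordering releases; whether a particular release carries the fix still has to be checked against its release notes.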

erwincoumans commented 1 year ago

So nv_peer_memory can't be used with ConnectX-3 cards (even though the hardware supports it)?

Note: MLNX_OFED 4.9-x LTS should be used by customers who would like to utilize one of the following:
NVIDIA ConnectX-3 Pro
NVIDIA ConnectX-3
NVIDIA Connect-IB
RDMA experimental verbs library (mlnx_lib)
OSs based on kernel version lower than 3.10
Note: All of the above are not available on MLNX_OFED 5.x branch.

Note: MLNX_OFED 5.4-x LTS should be used by customers who would like to utilize NVIDIA ConnectX-4 onwards adapter cards and keep using stable 5.4-x deployment and get:
Critical bug fixes
Support for new major OSs