Mellanox / nv_peer_memory


nvidia-peer-memory-dkms_1.2-0_all.deb : Could not insert 'nv_peer_mem': Invalid argument #113

Open nelsonsilva94 opened 1 year ago

nelsonsilva94 commented 1 year ago

I am facing this problem when trying to install nvidia-peer-memory-dkms:

bisect@bitxo-Super-Server /tmp> sudo dpkg -i nvidia-peer-memory_1.2-0_all.deb nvidia-peer-memory-dkms_1.2-0_all.deb 
(Reading database ... 249611 files and directories currently installed.)
Preparing to unpack nvidia-peer-memory-dkms_1.2-0_all.deb ...
Module nv_peer_mem-1.3 for kernel 5.19.0-42-generic (x86_64).
Before uninstall, this module version was ACTIVE on this kernel.

nv_peer_mem.ko:
 - Uninstallation
   - Deleting from: /lib/modules/5.19.0-42-generic/updates/dkms/
 - Original module
   - No original module was found for this module on this kernel.
   - Use the dkms install command to reinstall any previous module version.

depmod...
Deleting module nv_peer_mem-1.3 completely from the DKMS tree.
Unpacking nvidia-peer-memory-dkms (1.2-0) over (1.2-0) ...
Preparing to unpack nvidia-peer-memory_1.2-0_all.deb ...
Unpacking nvidia-peer-memory (1.2-0) over (1.2-0) ...
Setting up nvidia-peer-memory-dkms (1.2-0) ...
Loading new nv_peer_mem-1.3 DKMS files...
Building for 5.19.0-42-generic
Building initial module for 5.19.0-42-generic
Secure Boot not enabled on this system.
Done.

nv_peer_mem.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.19.0-42-generic/updates/dkms/

depmod...
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument
dpkg: error processing package nvidia-peer-memory-dkms (--install):
 installed nvidia-peer-memory-dkms package post-installation script subprocess returned error exit status 1
Setting up nvidia-peer-memory (1.2-0) ...
Errors were encountered while processing:
 nvidia-peer-memory-dkms

Any help is welcome.

leavelet commented 12 months ago

You no longer need to install nv_peer_memory since CUDA 11.4 and driver 470. Use nvidia-peermem instead. It is installed automatically with the NVIDIA driver, provided you install MLNX_OFED before the GPU driver.
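
As a rough illustration of the rule above, here is a minimal shell sketch that picks the module name from a driver version string. The function name `pick_peer_module` is hypothetical (it is not part of any NVIDIA or Mellanox tooling), and the 470 threshold is taken directly from the statement above rather than from official documentation.

```shell
#!/bin/sh
# Hypothetical helper: choose the GPUDirect RDMA peer-memory module
# based on the NVIDIA driver's major version.
# Assumption: driver >= 470 ships nvidia-peermem, per the comment above.
pick_peer_module() {
    version="$1"
    major="${version%%.*}"   # keep everything before the first dot
    if [ "$major" -ge 470 ]; then
        echo "nvidia-peermem"
    else
        echo "nv_peer_mem"
    fi
}

# Possible usage on a machine with the NVIDIA driver installed:
#   ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   sudo modprobe "$(pick_peer_module "$ver")"
#   lsmod | grep -i peermem
```

If `modprobe` still fails with "Invalid argument", the kernel log (`sudo dmesg | tail`) usually shows why the module was rejected.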