Mellanox / nv_peer_memory

Error! Bad return status for module build on kernel: 4.15.0-137-generic (x86_64) #85

Open thuzhf opened 3 years ago

thuzhf commented 3 years ago

When I try to install nvidia-peer-memory-dkms_1.1-0_all.deb, it shows:

Selecting previously unselected package nvidia-peer-memory-dkms.
(Reading database ... 130743 files and directories currently installed.)
Preparing to unpack nvidia-peer-memory-dkms_1.1-0_all.deb ...
Unpacking nvidia-peer-memory-dkms (1.1-0) ...
Setting up nvidia-peer-memory-dkms (1.1-0) ...
Loading new nv_peer_mem-1.1.0 DKMS files...
It is likely that 5.4.0-47-generic belongs to a chroot's host
Building for 4.15.0-137-generic and 5.4.0-47-generic
Building initial module for 4.15.0-137-generic
Error! Bad return status for module build on kernel: 4.15.0-137-generic (x86_64)
Consult /var/lib/dkms/nv_peer_mem/1.1.0/build/make.log for more information.

The content of /var/lib/dkms/nv_peer_mem/1.1.0/build/make.log is as follows:

DKMS make.log for nv_peer_mem-1.1.0 for kernel 4.15.0-137-generic (x86_64)
Tue Mar  9 07:07:34 UTC 2021
INFO: Building with MLNX_OFED from: /usr/src/ofa_kernel/default
awk: cannot open nvidia_peer_memory.spec (No such file or directory)
/var/lib/dkms/nv_peer_mem/1.1.0/build/create_nv.symvers.sh 4.15.0-137-generic
-E- Cannot locate nvidia modules!
CUDA driver must be installed before installing this package!
Makefile:109: recipe for target 'gen_nv_symvers' failed
make: *** [gen_nv_symvers] Error 1

I followed all the steps in your README, so how can I install this deb successfully? I'm also curious why it says 'It is likely that 5.4.0-47-generic belongs to a chroot's host' and 'Building for 4.15.0-137-generic', since my kernel version is not 4.15.0-137-generic.

My OS info (from inside the Docker container):

uname -a:
Linux 6314861f2686 5.4.0-47-generic #51~18.04.1-Ubuntu SMP Sat Sep 5 14:35:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a:
LSB Version:    core-9.20170808ubuntu1-noarch:printing-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.5 LTS
Release:        18.04
Codename:       bionic

BaldStrong commented 3 years ago

I also encountered this problem: none of the InfiniBand drivers could be installed on the 4.15.0-137 kernel, while everything worked fine on the 4.15.0-136 kernel.

The error disappears after downgrading the kernel.
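
For anyone comparing their own setup against the logs in this thread, here is a minimal diagnostic sketch in Python. It rests on two assumptions that the thread does not confirm: that DKMS picks its build targets from the kernel directories present under /lib/modules (which would explain why a container whose uname reports the host's 5.4.0-47-generic still builds for 4.15.0-137-generic), and that the "Cannot locate nvidia modules!" check looks for an NVIDIA kernel module file under the target kernel's module tree. The function names (installed_kernels, has_nvidia_module, report) are hypothetical helpers, not part of DKMS or nv_peer_memory.

```python
import os
from pathlib import Path


def installed_kernels(modules_root="/lib/modules"):
    """Return the kernel versions that have a directory under modules_root."""
    root = Path(modules_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def has_nvidia_module(kernel, modules_root="/lib/modules"):
    """True if any nvidia*.ko* file exists anywhere under this kernel's tree.

    Assumption: this approximates what the package's build script checks;
    the real script may look in more specific locations.
    """
    tree = Path(modules_root) / kernel
    return any(tree.rglob("nvidia*.ko*"))


def report(running=None, modules_root="/lib/modules"):
    """Describe how the running kernel relates to the installed module trees."""
    running = running or os.uname().release
    kernels = installed_kernels(modules_root)
    lines = []
    if running not in kernels:
        # Typical inside a container: uname reports the host's kernel.
        lines.append(
            f"running kernel {running} has no directory under {modules_root}"
        )
    for kernel in kernels:
        state = "found" if has_nvidia_module(kernel, modules_root) else "missing"
        lines.append(f"{kernel}: nvidia module {state}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

If the report shows that the running kernel has no directory under /lib/modules and that the only installed kernel tree has no NVIDIA module, the environment matches the situation in the original post, and the build is probably better attempted on the host (or against the host's kernel tree) than inside the container.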