Closed: davidjkcho closed this issue 6 years ago
Love this. I personally unpacked the xz in place, but great job! Tested on CentOS 7.5 (3.10.0-862.3.2.el7.x86_64).
Also seen on CentOS 7.4 with the latest Mellanox nvidia-peer-memory_1.0-7.tar.gz. This seems to be the right workaround; without it, the build runs into the error below:
# /home/pak/nvidia-peer-memory-1.0.7
# make all
/home/pak/nvidia-peer-memory-1.0.7/create_nv.symvers.sh 3.10.0-693.21.1.el7.x86_64
nm: /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/nvidia.ko.xz: File format not recognized
-W- Could not get list of nvidia symbols.
...
If it's the right fix, can someone please commit the workaround?
No, this isn't the right workaround: it changes files on the system (it decompresses the modules). We'll change the script to also accept .ko.xz modules.
I missed the part where you copied the module to a local folder. This actually looks like a good approach.
fixed by #44
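The approach discussed above (decompressing a copy of the module in a scratch directory so that nothing under /lib/modules is modified) can be sketched roughly as follows. This is only an illustration, not the change merged in #44: the function name plain_module_copy and its two arguments are hypothetical, and it assumes the xz command-line tool is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of create_nv.symvers.sh): print the path of an
# uncompressed copy of a kernel module. If the module is not xz-compressed,
# the original path is printed unchanged.
plain_module_copy() {
    mod="$1"        # module path, e.g. one ending in nvidia.ko.xz
    workdir="$2"    # scratch directory owned by the build, not a system path
    case "$mod" in
        *.ko.xz)
            out="$workdir/$(basename "$mod" .xz)"
            # -d decompresses, -c writes to stdout, so the source file is
            # only read and never replaced or removed.
            xz -dc "$mod" > "$out" || return 1
            printf '%s\n' "$out"
            ;;
        *)
            printf '%s\n' "$mod"
            ;;
    esac
}

# Possible usage inside a script that already sets $nvidia_mod:
#   workdir=$(mktemp -d)
#   nvidia_mod=$(plain_module_copy "$nvidia_mod" "$workdir") || exit 1
#   nm -o "$nvidia_mod"
```

Writing the decompressed copy to a scratch directory addresses the concern raised above about changing files on the system, and nm then sees an ordinary .ko file.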
Hi, a similar issue happens on Power9. Log:
[user@localhost nvidia-peer-memory-1.0]$ sudo yum localinstall ~/rpmbuild/RPMS/ppc64le/nvidia_peer_memory-1.0-7.ppc64le.rpm
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is registered with an entitlement server, but is not receiving updates. You can use subscription-manager to assign subscriptions.
Examining /home/user/rpmbuild/RPMS/ppc64le/nvidia_peer_memory-1.0-7.ppc64le.rpm: nvidia_peer_memory-1.0-7.ppc64le
Marking /home/user/rpmbuild/RPMS/ppc64le/nvidia_peer_memory-1.0-7.ppc64le.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package nvidia_peer_memory.ppc64le 0:1.0-7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
Installing:
 nvidia_peer_memory ppc64le 1.0-7 /nvidia_peer_memory-1.0-7.ppc64le 310 k
Install 1 Package
Total size: 310 k
Installed size: 310 k
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
 Installing : nvidia_peer_memory-1.0-7.ppc64le 1/1
modprobe: ERROR: could not insert 'nv_peer_mem': Unknown symbol in module, or unknown parameter (see dmesg)
 Verifying : nvidia_peer_memory-1.0-7.ppc64le 1/1
Installed:
 nvidia_peer_memory.ppc64le 0:1.0-7
Complete!
[user@localhost nvidia-peer-memory-1.0]$ ls
The "nm -o $nvidia_mod" call in create_nv.symvers.sh expects an uncompressed .ko file, but kernel modules on CentOS 7 are xz-compressed and end with .ko.xz. nm cannot read the compressed file, so the script fails to get the symbol names.
Below is the change I made as a workaround.
--- create_nv.symvers.sh.new 2018-05-09 10:38:40.033345119 -0700
+++ create_nv.symvers.sh.old 2018-05-09 10:38:08.114218425 -0700
@@ -77,9 +77,6 @@
 if [ ! -e "$nvidia_mod" ]; then
 continue
 fi