Mellanox / nv_peer_memory


Failed to build nv_peer_mem on ubuntu 20.04 #70

Status: Closed (adrianchiris closed this 3 years ago)

adrianchiris commented 4 years ago

Project cloned from master (a5cbf195745c7a6f9a8e2713a274f55bc6b5f223). Compilation fails at the modpost stage:

root@13481535799f:/var/lib/dkms/nvidia-peer-memory/1.0/build# make
/var/lib/dkms/nvidia-peer-memory/1.0/build/create_nv.symvers.sh 5.4.0-29-generic
Getting symbol versions from /lib/modules/5.4.0-29-generic/updates/dkms/nvidia.ko ...
Created: /var/lib/dkms/nvidia-peer-memory/1.0/build/nv.symvers
Found /usr/src/nvidia-440.64/nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-440.64/nvidia/nv-p2p.h /var/lib/dkms/nvidia-peer-memory/1.0/build/nv-p2p.h
cp -rf /usr/src/ofa_kernel/5.4.0-29-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/5.4.0-29-generic/build  M=/var/lib/dkms/nvidia-peer-memory/1.0/build modules
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-29-generic'
  CC [M]  /var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.o
/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
   80 | #pragma message("Enable nvidia_p2p_dma_map_pages support")
      |         ^~~~~~~
  Building modules, stage 2.
  MODPOST 1 modules
FATAL: parse error in symbol dump file
make[2]: *** [scripts/Makefile.modpost:94: __modpost] Error 1
make[1]: *** [Makefile:1632: modules] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-29-generic'
make: *** [Makefile:60: all] Error 2
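
Since the error points at the symbol dump file, one quick way to look for odd lines is to count the tab-separated fields per line in the combined `Module.symvers`. Below is a minimal sketch, not part of the repo: the function name `symvers_field_counts` and the demo data are made up, and it only reports field counts; it does not know which layout a given kernel's modpost accepts.

```shell
#!/bin/sh
# Hypothetical helper: read a Module.symvers-style file on stdin and report
# how many lines have each number of tab-separated fields.
symvers_field_counts() {
    awk -F'\t' '
        { n[NF]++ }
        END { for (k in n) printf "%d fields: %d line(s)\n", k, n[k] }
    ' | sort -n
}

# Demo with made-up data: one 3-field line and one 4-field line.
{
    printf '0x1\tsym_a\tmod_a\n'
    printf '0x2\tsym_b\tmod_b\tEXPORT_SYMBOL\n'
} | symvers_field_counts
# prints:
# 3 fields: 1 line(s)
# 4 fields: 1 line(s)
```

On a real build tree this would be run as `symvers_field_counts < Module.symvers`; lines whose count differs from the rest are the ones worth inspecting.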

The following change fixed the issue; however, I'm not sure how it would affect other distros:

diff --git a/create_nv.symvers.sh b/create_nv.symvers.sh
index 453aa64..109f24d 100755
--- a/create_nv.symvers.sh
+++ b/create_nv.symvers.sh
@@ -118,7 +118,7 @@ do
                file=$(echo $line | cut -f1 -d: | sed -r -e 's@\./@@' -e 's@.ko(\S)*@@' -e "s@$PWD/@@")
                crc=$(echo $line | cut -f2 -d: | cut -f1 -d" ")
                sym=$(echo $line | cut -f2 -d: | cut -f3 -d" " | sed -e 's/__crc_//g')
-               echo -e "0x$crc\t$sym\t$file" >> $MOD_SYMVERS
+               echo -e "0x$crc\t$sym\t$file\tEXPORT_SYMBOL\t" >> $MOD_SYMVERS
        done < <(nm -o $nvidia_mod | grep -E "$modules_pat")

        echo "Created: ${MOD_SYMVERS}"
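
To make the difference between the two `echo` lines in the diff easier to see, here is a minimal sketch of the line written before and after the patch. The helper names `symvers_line_old` and `symvers_line_new` are hypothetical (they do not exist in `create_nv.symvers.sh`), `printf` stands in for the script's `echo -e`, and the sample CRC and symbol are taken from the CentOS trace in this thread.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the removed (-) and added (+) lines in the diff.
# Arguments: crc (without the 0x prefix), symbol name, module file.
symvers_line_old() {
    printf '0x%s\t%s\t%s\n' "$1" "$2" "$3"
}
symvers_line_new() {
    # Same three fields, plus "EXPORT_SYMBOL" and a trailing tab.
    printf '0x%s\t%s\t%s\tEXPORT_SYMBOL\t\n' "$1" "$2" "$3"
}

# Show the tabs as '|' so the field boundaries are visible.
symvers_line_old 00000000683ef646 nvidia_p2p_dma_map_pages nvidia | tr '\t' '|'
symvers_line_new 00000000683ef646 nvidia_p2p_dma_map_pages nvidia | tr '\t' '|'
# prints:
# 0x00000000683ef646|nvidia_p2p_dma_map_pages|nvidia
# 0x00000000683ef646|nvidia_p2p_dma_map_pages|nvidia|EXPORT_SYMBOL|
```

`printf` is used here only because its handling of `\t` does not depend on the shell, unlike `echo -e`.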

adrianchiris commented 4 years ago

@ferasd

jamieNguyenNVIDIA commented 4 years ago

This patch does appear to work on both Ubuntu 16.04 and Ubuntu 18.04 as well.

Ubuntu 16.04 snippet (the first line shows the changed code):

+ echo -e '0x000000007e399228\tnvidia_p2p_put_pages\t/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384\tEXPORT_SYMBOL\t'
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:00000000000092e0 T nvidia_p2p_destroy_mapping'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:00000000000096f0 T nvidia_p2p_dma_map_pages'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:0000000000009460 T nvidia_p2p_dma_unmap_pages'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:0000000000008eb0 T nvidia_p2p_free_dma_mapping'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:0000000000008e50 T nvidia_p2p_free_page_table'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:0000000000008ee0 T nvidia_p2p_get_pages'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:00000000000095c0 T nvidia_p2p_init_mapping'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:0000000000009380 T nvidia_p2p_put_pages'
+ grep -q __crc_nvidia_p2p_
+ '[' 1 '!=' 0 ']'
+ continue
+ read -r line
+ echo 'Created: /var/lib/dkms/nvidia-peer-memory/1.0/build/nv.symvers'
Created: /var/lib/dkms/nvidia-peer-memory/1.0/build/nv.symvers
+ exit 0
Found /usr/src/nvidia-384-384.183/nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-384-384.183/nvidia/nv-p2p.h /var/lib/dkms/nvidia-peer-memory/1.0/build/nv-p2p.h
cp -rf /usr/src/ofa_kernel/4.4.0-157-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/4.4.0-157-generic/build  M=/var/lib/dkms/nvidia-peer-memory/1.0/build modules
make[1]: Entering directory '/usr/src/linux-headers-4.4.0-157-generic'
  CC [M]  /var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.o
/var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
 #pragma message("Enable nvidia_p2p_dma_map_pages support")
         ^
  Building modules, stage 2.
  MODPOST 1 modules
  LD [M]  /var/lib/dkms/nvidia-peer-memory/1.0/build/nv_peer_mem.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-157-generic'
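
The repeated `grep -q` / `continue` steps in the trace above are the script skipping every `nm` line that does not carry a `__crc_nvidia_p2p_` symbol. A minimal sketch of that filter follows; the function name `filter_crc_lines` is hypothetical, and the loop is a simplified reading of the trace rather than a copy of the script. The two sample lines are taken from traces in this thread.

```shell
#!/bin/sh
# Hypothetical helper: pass through only the nm lines that name a
# __crc_nvidia_p2p_* symbol; everything else is skipped, as in the trace.
filter_crc_lines() {
    while IFS= read -r line; do
        if ! printf '%s\n' "$line" | grep -q '__crc_nvidia_p2p_'; then
            continue    # e.g. the "T nvidia_p2p_..." lines
        fi
        printf '%s\n' "$line"
    done
}

printf '%s\n' \
    '/lib/modules/4.4.0-157-generic/updates/dkms/nvidia_384.ko:0000000000009380 T nvidia_p2p_put_pages' \
    'nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages' |
    filter_crc_lines
# prints only the second line:
# nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages
```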

Ubuntu 18.04 snippet (the first line shows the changed code):

+ echo -e '0x000000000000b4b0\tnvidia_p2p_unregister_rsync_driver\t/lib/modules/5.4.0-40-generic/updates/dkms/nvidia\tEXPORT_SYMBOL\t'
+ read -r line
+ echo 'Created: /usr/src/nvidia-peer-memory-1.0/nv.symvers'
Created: /usr/src/nvidia-peer-memory-1.0/nv.symvers
+ exit 0
Found /usr/src/nvidia-450.51.05//nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-450.51.05//nvidia/nv-p2p.h /usr/src/nvidia-peer-memory-1.0/nv-p2p.h
cp -rf /usr/src/ofa_kernel/5.4.0-40-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/5.4.0-40-generic/build  M=/usr/src/nvidia-peer-memory-1.0 modules
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-40-generic'
  CC [M]  /usr/src/nvidia-peer-memory-1.0/nv_peer_mem.o
/usr/src/nvidia-peer-memory-1.0/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
   80 | #pragma message("Enable nvidia_p2p_dma_map_pages support")
      |         ^~~~~~~
  Building modules, stage 2.
  MODPOST 1 modules
  LD [M]  /usr/src/nvidia-peer-memory-1.0/nv_peer_mem.ko
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-40-generic'

adrianchiris commented 4 years ago

So I guess this needs to be validated on CentOS, and then it can be pushed as a fix. (Unfortunately I don't have a setup, or else I would have proposed a PR.)

jamieNguyenNVIDIA commented 4 years ago

This looks good on CentOS 7(.8) as well. I cut out some of it because of the verbosity, but here's the output when running rpmbuild. The first line again shows the changed code in create_nv.symvers.sh.

+ echo -e '0x000000004c9ba34e\tnvidia_p2p_destroy_mapping\tnvidia\tEXPORT_SYMBOL\t'
+ read -r line
+ echo 'nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages'
+ grep -q __crc_nvidia_p2p_
+ crc_found=1
++ echo nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages
++ cut -f1 -d:
++ sed -r -e 's@\./@@' -e 's@.ko(\S)*@@' -e s@/home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/@@
+ file=nvidia
++ echo nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages
++ cut -f2 -d:
++ cut -f1 '-d '
+ crc=00000000683ef646
++ echo nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages
++ cut -f2 -d:
++ cut -f3 '-d '

<SNIP>

+ read -r line
+ echo 'Created: /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv.symvers'
Created: /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv.symvers
+ exit 0
Found /usr/src/nvidia-450.51.05//nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-450.51.05//nvidia/nv-p2p.h /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv-p2p.h
cp -rf /usr/src/ofa_kernel/default/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/3.10.0-1127.13.1.el7.x86_64/build  M=/home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0 modules
make[1]: Entering directory `/usr/src/kernels/3.10.0-1127.13.1.el7.x86_64'
  CC [M]  /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.o
/home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
 #pragma message("Enable nvidia_p2p_dma_map_pages support")
         ^
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.mod.o
  LD [M]  /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0/nv_peer_mem.ko
make[1]: Leaving directory `/usr/src/kernels/3.10.0-1127.13.1.el7.x86_64'
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.ZmOPwV
+ umask 022
+ cd /home/lab/rpmbuild/BUILD
+ '[' /home/lab/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-9.x86_64 '!=' / ']'
+ rm -rf /home/lab/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-9.x86_64
++ dirname /home/lab/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-9.x86_64

<SNIP>

+ chmod -R o+w /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0
+ rm -rf /home/lab/rpmbuild/BUILD/nvidia_peer_memory-1.0
+ test x/home/lab/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-9.x86_64 '!=' x
+ rm -rf /home/lab/rpmbuild/BUILDROOT/nvidia_peer_memory-1.0-9.x86_64
+ exit 0
Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.5qJInr
+ umask 022
+ cd /home/lab/rpmbuild/BUILD
+ rm -rf nvidia_peer_memory-1.0
+ exit 0
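
For reference, the `cut`/`sed` steps at the top of this trace split each `nm -o` line into a module file, a CRC, and a symbol name. The sketch below does the same split with shell parameter expansion instead of `cut` and `sed`. It is a simplified illustration, not the script's code: the function name `parse_nm_line` is hypothetical, and it omits the script's `$PWD` stripping.

```shell
#!/bin/sh
# Hypothetical helper: split one `nm -o` line of the form
#   <path>.ko:<value> <type> <name>
# into file, crc and sym. The body runs in a subshell, so `set -f`
# and the variables do not leak into the caller.
parse_nm_line() (
    line=$1
    path=${line%%:*}      # everything before the first ':'
    rest=${line#*:}       # "<value> <type> <name>"
    set -f                # no globbing while word-splitting $rest
    set -- $rest          # $1=value, $2=type, $3=name
    file=${path%.ko*}     # drop the ".ko" suffix
    file=${file#./}       # drop a leading "./", if any
    printf 'file=%s crc=%s sym=%s\n' "$file" "$1" "${3#__crc_}"
)

parse_nm_line 'nvidia.ko:00000000683ef646 A __crc_nvidia_p2p_dma_map_pages'
# prints: file=nvidia crc=00000000683ef646 sym=nvidia_p2p_dma_map_pages
```

The `file` and `crc` values match the `file=nvidia` and `crc=00000000683ef646` assignments shown in the trace.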

adrianchiris commented 4 years ago

Great, @jamieNguyenNVIDIA, thanks for checking this out. I'll push a PR for it then.

alaahl commented 3 years ago

Thanks guys!