Mellanox / nv_peer_memory


Ubuntu 18.04 failure #42

Closed drossetti closed 6 years ago

drossetti commented 6 years ago

Building the latest version of the repo on Ubuntu 18.04 fails in a subtle way:

<...>/nv_peer_memory/create_nv.symvers.sh 4.15.0-20-generic
-W- Could not get list of nvidia symbols.
Found /usr/src/nvidia-410.09//nvidia/nv-p2p.h
/bin/cp -f /usr/src/nvidia-410.09//nvidia/nv-p2p.h /home/lab/IB/nv_peer_memory/nv-p2p.h
cp -rf /usr/src/ofa_kernel/4.15.0-20-generic/Module.symvers .
cat nv.symvers >> Module.symvers
make -C /lib/modules/4.15.0-20-generic/build  M=/home/lab/IB/nv_peer_memory modules
make[1]: Entering directory '/usr/src/linux-headers-4.15.0-20-generic'
  CC [M]  /home/lab/IB/nv_peer_memory/nv_peer_mem.o
/home/lab/IB/nv_peer_memory/nv_peer_mem.c:80:9: note: #pragma message: Enable nvidia_p2p_dma_map_pages support
 #pragma message("Enable nvidia_p2p_dma_map_pages support")
         ^~~~~~~
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: "nvidia_p2p_dma_map_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_dma_unmap_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_free_page_table" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_free_dma_mapping" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_get_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
WARNING: "nvidia_p2p_put_pages" [/home/lab/IB/nv_peer_memory/nv_peer_mem.ko] undefined!
  LD [M]  /home/lab/IB/nv_peer_memory/nv_peer_mem.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.15.0-20-generic'

It seems to be related to the kernel not being built with modversions enabled. For example, /boot/config-4.15.0-20-generic contains:

# CONFIG_MODVERSIONS is not set
CONFIG_MODULE_SRCVERSION_ALL=y
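
A quick way to confirm this on a given machine is to grep the kernel config directly. The sketch below is only illustrative: `modversions_state` is a hypothetical helper name, not part of nv_peer_memory, and it assumes the config lives in a plain-text file such as /boot/config-$(uname -r).

```shell
#!/bin/sh
# Hypothetical helper (not part of nv_peer_memory): report whether
# CONFIG_MODVERSIONS is enabled in a plain-text kernel config file.
# Prints "enabled", "disabled", or "unknown" (file not readable).
modversions_state() {
    cfg="$1"
    if [ ! -r "$cfg" ]; then
        echo "unknown"
        return 2
    fi
    # An enabled option appears as "CONFIG_MODVERSIONS=y"; a disabled one
    # appears as the comment "# CONFIG_MODVERSIONS is not set" or is absent.
    if grep -q '^CONFIG_MODVERSIONS=y$' "$cfg"; then
        echo "enabled"
    else
        echo "disabled"
        return 1
    fi
}

# Example call for the running kernel (commented out so that sourcing
# this file has no side effects):
# modversions_state "/boot/config-$(uname -r)"
```

With the two config lines quoted above, the helper prints "disabled".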

It is not clear whether this is fatal or not, though we are observing run-time errors when trying to send GPU memory.

alaahl commented 6 years ago

"-W- Could not get list of nvidia symbols." is expected when CONFIG_MODVERSIONS is not set. In that case the kernel neither generates symbol versions nor checks them on module load.
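
Since symbol versions are not checked in this configuration, what matters at load time is whether the nvidia_p2p_* symbols named in the MODPOST warnings are provided by a loaded module. A rough sketch for checking that, assuming the symbol table is in /proc/kallsyms format (address, type, name, optional module); `missing_p2p_symbols` is a hypothetical helper name, not part of nv_peer_memory:

```shell
#!/bin/sh
# Hypothetical helper: print each nvidia_p2p_* symbol from the MODPOST
# warnings that is NOT present in a kallsyms-style symbol table.
# Defaults to /proc/kallsyms; a different file can be passed for testing.
missing_p2p_symbols() {
    table="${1:-/proc/kallsyms}"
    for sym in nvidia_p2p_dma_map_pages nvidia_p2p_dma_unmap_pages \
               nvidia_p2p_free_page_table nvidia_p2p_free_dma_mapping \
               nvidia_p2p_get_pages nvidia_p2p_put_pages; do
        # The symbol name is the third whitespace-separated field.
        if ! awk -v s="$sym" '$3 == s { found = 1 } END { exit !found }' "$table"; then
            echo "$sym"
        fi
    done
}

# Example call against the running kernel (commented out so that sourcing
# this file has no side effects):
# missing_p2p_symbols
```

No output means all six symbols are present in the table. Any names printed are symbols that nv_peer_mem.ko would presumably fail to resolve when loaded, for instance because the nvidia module is not loaded yet.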