NVIDIA / gdrcopy

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
MIT License

Install on jetson agx #198

Open daynauth opened 3 years ago

daynauth commented 3 years ago

Is it possible to install this on the Jetson AGX?

I tried installing the .deb package, but I keep getting the following error:

sudo dpkg -i gdrdrv-dkms_2.2-1_arm64.deb
Selecting previously unselected package gdrdrv-dkms:arm64.
(Reading database ... 202120 files and directories currently installed.)
Preparing to unpack gdrdrv-dkms_2.2-1_arm64.deb ...
Unpacking gdrdrv-dkms:arm64 (2.2-1) ...
Setting up gdrdrv-dkms:arm64 (2.2-1) ...
Loading new gdrdrv-2.2 DKMS files...
It is likely that 4.9.140-tegra belongs to a chroot's host
Building for 4.15.0-143-generic and 4.9.140-tegra
Building for architecture arm64
Building initial module for 4.15.0-143-generic
ERROR (dkms apport): unable to determine source package for gdrdrv-dkms
Error! Bad return status for module build on kernel: 4.15.0-143-generic (arm64)
Consult /var/lib/dkms/gdrdrv/2.2/build/make.log for more information.
Job for gdrdrv.service failed because the control process exited with error code.
See "systemctl status gdrdrv.service" and "journalctl -xe" for details.
invoke-rc.d: initscript gdrdrv, action "start" failed.
● gdrdrv.service - LSB: Startup/shutdown script for GDRcopy kernel-mode driver
   Loaded: loaded (/etc/init.d/gdrdrv; generated)
   Active: failed (Result: exit-code) since Sun 2021-05-16 14:04:22 EDT; 17ms ago
     Docs: man:systemd-sysv-generator(8)
  Process: 13022 ExecStart=/etc/init.d/gdrdrv start (code=exited, status=1/FAILURE)

May 16 14:04:22 jetson-agx systemd[1]: Starting LSB: Startup/shutdown script for GDRcopy kernel-mode driver...
May 16 14:04:22 jetson-agx gdrdrv[13022]: Checking required modules: module nvidia is not loaded
May 16 14:04:22 jetson-agx gdrdrv[13022]:  *
May 16 14:04:22 jetson-agx gdrdrv[13022]: modinfo: ERROR: Module gdrdrv not found.
May 16 14:04:22 jetson-agx gdrdrv[13022]: Module gdrdrv does not exist
May 16 14:04:22 jetson-agx systemd[1]: gdrdrv.service: Control process exited, code=exited status=1
May 16 14:04:22 jetson-agx systemd[1]: gdrdrv.service: Failed with result 'exit-code'.
May 16 14:04:22 jetson-agx systemd[1]: Failed to start LSB: Startup/shutdown script for GDRcopy kernel-mode driver.
dpkg: error processing package gdrdrv-dkms:arm64 (--install):
 installed gdrdrv-dkms:arm64 package post-installation script subprocess returned error exit status 1
Processing triggers for systemd (237-3ubuntu10.47) ...
Errors were encountered while processing:
 gdrdrv-dkms:arm64

Also, I'm using the release version, not the git clone (the git clone gives the same error).

pakmarkthub commented 3 years ago

Hi @daynauth,

GDRCopy supports Quadro- and Tesla-class GPUs only. In other words, it does not work with Jetson AGX today.

drossetti commented 3 years ago

"ERROR (dkms apport): unable to determine source package for gdrdrv-dkms"

This particular error could be related to something else.

But eventually, after solving this problem, you would run into the fact that the GPUDirect RDMA APIs for iGPU in L4T are slightly different from those available on discrete GPUs.

Besides, GPUDirect RDMA support on iGPU only works on memory allocated via cudaHostAlloc.
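
For reference, here is a minimal sketch of what allocating memory via cudaHostAlloc looks like. It only shows the allocation side that the comment above refers to; it does not use GDRCopy and does not perform any RDMA registration. The buffer size and the choice of the cudaHostAllocDefault flag are assumptions made for illustration, so check the L4T documentation for the flags your setup actually needs.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    // Buffer size is arbitrary (1 MiB), chosen only for this sketch.
    const size_t size = 1 << 20;
    void *host_ptr = nullptr;

    // Allocate page-locked host memory. The flag is an assumption;
    // other flags (e.g. cudaHostAllocMapped) may be required instead.
    CHECK_CUDA(cudaHostAlloc(&host_ptr, size, cudaHostAllocDefault));

    // The buffer is ordinary CPU-addressable memory.
    memset(host_ptr, 0, size);
    printf("Allocated %zu bytes of pinned host memory at %p\n",
           size, host_ptr);

    // Handing this buffer to a third-party device driver for
    // GPUDirect RDMA is platform specific and not shown here.

    CHECK_CUDA(cudaFreeHost(host_ptr));
    return 0;
}
```

Assuming the CUDA toolkit is installed, this should build with something like `nvcc alloc_sketch.cu -o alloc_sketch` (the file name is hypothetical).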