enfiskutensykkel / ssd-gpu-dma

Build userspace NVMe drivers and storage applications with CUDA support
BSD 2-Clause "Simplified" License

Cmake output saying 'Configuring kernel module without CUDA' #28

Open babouFomb opened 4 years ago

babouFomb commented 4 years ago

Hi,

I have a Jetson Xavier AGX kit board with an NVMe SSD plugged into the M.2 Key M slot. I'm trying to install libnvm on the Xavier, and I see the following message in the CMake output:

-- Found CUDA: /usr/local/cuda-10.0 (found suitable version "10.0", minimum required is "8.0")
-- Using NVIDIA driver found in
-- Configuring kernel module without CUDA
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ganapathi/Downloads/ssd-gpu-dma-master/build

How can I force CMake to build with CUDA?

Thanks

enfiskutensykkel commented 4 years ago

Hi,

It's strange that it reports finding the Nvidia driver while the path string appears empty. Regardless, it is possible to override the location using -DNVIDIA=<path to driver source>; usually it can be found in /usr/src/nvidia-<driver version>. Make sure to run make in the Nvidia driver directory first.
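A minimal sketch of how the override could be located, assuming the driver source follows the /usr/src/nvidia-<driver version> naming mentioned above. The function name list_nvidia_driver_dirs is hypothetical and not part of the project.

```sh
#!/bin/sh
# List candidate NVIDIA driver source directories under a root (default /usr/src).
# Assumption: directories follow the nvidia-<driver version> naming pattern.
list_nvidia_driver_dirs() {
    root="${1:-/usr/src}"
    for dir in "$root"/nvidia-*; do
        # The glob stays literal when nothing matches, so test for a real directory.
        [ -d "$dir" ] && printf '%s\n' "$dir"
    done
    return 0
}

# Example (not run here): build the driver, then point CMake at it.
#   dir=$(list_nvidia_driver_dirs | head -n 1)
#   (cd "$dir" && make)
#   cmake .. -DNVIDIA="$dir"
```

If the function prints nothing, there is no such directory and the path has to be found some other way.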

babouFomb commented 4 years ago

But I can't find the path to the driver source. In my /usr/src/ directory there is no nvidia-<driver version> directory.

ganapathi@ganapathi-desktop:/$ ls /usr/src/
cudnn_samples_v7 linux-headers-4.9.140-tegra-linux_x86_64 linux-headers-4.9.140-tegra-ubuntu18.04_aarch64 nvidia tensorrt
ganapathi@ganapathi-desktop:/$ ls /usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/
kernel-4.9 nvgpu nvidia
ganapathi@ganapathi-desktop:/$ ls /usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/kernel-4.9/
arch block certs crypto drivers firmware fs include init ipc Kbuild Kconfig kernel lib Makefile mm Module.symvers net samples scripts security sound tools usr virt

I tried to find it with find -iname nvidia-driver, but I just got:

ganapathi@ganapathi-desktop:/$ sudo find -iname nvidia-driver
find: ‘./run/user/1000/gvfs’: Permission denied
find: ‘./run/user/120/gvfs’: Permission denied

enfiskutensykkel commented 4 years ago

I'm not really familiar with the Xavier, but you can try locating the directory manually by running find /usr/src -name "nv-p2p.h". However, it's possible that the Nvidia driver symbols are built into the kernel source.
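The search suggested above can be wrapped so that it prints the directory containing each nv-p2p.h, which is a candidate value for -DNVIDIA. This is only a sketch: the function name find_p2p_header_dirs is made up for illustration.

```sh
#!/bin/sh
# Print the directory of every nv-p2p.h found under a root (default /usr/src).
find_p2p_header_dirs() {
    root="${1:-/usr/src}"
    find "$root" -name "nv-p2p.h" 2>/dev/null | while IFS= read -r header; do
        dirname "$header"
    done
}

# Usage: find_p2p_header_dirs /usr/src
```

An empty result would suggest the header is not installed under that root.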

enfiskutensykkel commented 4 years ago

You can try using either -DNVIDIA=/usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/nvgpu/include/linux/ or -DNVIDIA=/usr/src/linux-headers-4.9.140-tegra-linux_x86_64/nvgpu/include/linux/

Hope it works!

babouFomb commented 4 years ago

Hi, I used the -DNVIDIA=/usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/nvgpu/include/linux/ flag. Now CMake finds the path to the driver, but it is still configuring the kernel module without CUDA:

-- Found CUDA: /usr/local/cuda-10.0 (found suitable version "10.0", minimum required is "8.0")
-- Using NVIDIA driver found in /usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/nvgpu/include
-- Configuring kernel module without CUDA
-- Configuring done
-- Generating done
-- Build files have been written to: /home/ganapathi/Downloads/ssd-gpu-dma-master/build

enfiskutensykkel commented 4 years ago

Hi,

Yes, it's looking for Module.symvers, but on the Tegra the Nvidia driver seems to be compiled into the kernel.

You can try modifying this line to remove the check for Module.symvers: https://github.com/enfiskutensykkel/ssd-gpu-dma/blob/master/CMakeLists.txt#L144
if (CUDA_FOUND AND NOT no_cuda)

If that doesn't work, you could try modifying this line to add -D_CUDA: https://github.com/enfiskutensykkel/ssd-gpu-dma/blob/master/CMakeLists.txt#L149
set (module_ccflags "-D_CUDA -I${libnvm_root}")

The third option is perhaps modifying the generated Makefile for the kernel module after running CMake.

enfiskutensykkel commented 4 years ago

It may actually be easier to just add -D_CUDA to this line in Makefile.in: https://github.com/enfiskutensykkel/ssd-gpu-dma/blob/master/module/Makefile.in#L6

You may need to include the header file location, so the line becomes:
ccflags-y += @module_ccflags@ -D_CUDA -I/usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/nvgpu/include/linux
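The same edit can be scripted instead of made by hand. This is a rough sketch that assumes the template has a single line starting with "ccflags-y +="; the helper name add_ccflags is hypothetical, and it uses sed to append the extra flags to that line.

```sh
#!/bin/sh
# Append extra flags to the "ccflags-y +=" line of a kernel Makefile template.
# Reads the template on stdin and writes the patched text to stdout.
# Assumption: the flags contain no "|" or "&", which sed would treat specially here.
add_ccflags() {
    extra="$1"
    sed "/^ccflags-y[[:space:]]*+=/ s|\$| $extra|"
}

# Example (not run here):
#   add_ccflags "-D_CUDA -I<path to nv-p2p.h directory>" < Makefile.in > Makefile.in.new
```

Writing to a new file rather than editing in place makes it easy to diff the result before replacing Makefile.in.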

babouFomb commented 4 years ago

I think the problem comes from finding the Nvidia driver symbols (lines 66 - 68). I tried to locate Module.symvers with find /usr/src -name "Module.symvers" and got this output:

/usr/src/linux-headers-4.9.140-tegra-linux_x86_64/kernel-4.9/Module.symvers
/usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/kernel-4.9/Module.symvers

So I edited CMakeLists.txt: I removed the third condition from line 144, if (CUDA_FOUND AND NOT no_cuda AND EXISTS "${driver_dir}/Module.symvers"), and replaced line 146 with set (module_symbols "/usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/kernel-4.9/Module.symvers"). I also replaced line 6 in Makefile.in with ccflags-y += @module_ccflags@ -D_CUDA -I/usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/nvgpu/include/linux. Now the CMake output is:

-- Found CUDA: /usr/local/cuda-10.0 (found suitable version "10.0", minimum required is "8.0")
-- Using NVIDIA driver found in /usr/src/linux-headers-4.9.140-tegra-ubuntu18.04_aarch64/nvgpu/include
-- Configuring kernel module with CUDA
-- Configuring done
-- Generating done

The commands make libnvm and make examples ran successfully. Since I will not use SISCI SmartIO, I went into the module directory (inside my build directory) and ran make. I got several compilation errors (see the attached file): log.txt

enfiskutensykkel commented 4 years ago

It appears that the symbols are wildly different on the Xavier/Tegra kernel than for x86. https://docs.nvidia.com/cuda/gpudirect-rdma/index.html#kernel-api I need to get my hands on a Xavier in order to make those fixes.

Regardless, my understanding is that the main system memory is shared by the on-board GPU on all of the Tegra SoCs, so it might be the case that those calls aren't really necessary to begin with. I need to do some more investigations into that.

babouFomb commented 4 years ago

I made some modifications to the map.c file in order to port it to Xavier:

Adding struct device* dev; to struct map in map.h.
Removing the arguments map->pdev, gd->pages from nvidia_p2p_dma_unmap_pages (line 257 in map.c).
Removing the arguments 0, 0, map->vaddr from nvidia_p2p_put_pages (line 262 in map.c).
Adding enum dma_data_direction direction to the formal arguments of map_gpu_memory (line 276 in map.c).
Removing the arguments map->vaddr, GPU_PAGE_SIZE * map->n_addrs from nvidia_p2p_get_pages (line 296 in map.c).
Replacing the 1st argument (struct pci_dev) of nvidia_p2p_dma_map_pages with dev of type struct device* and adding a new argument of type enum dma_data_direction direction (line 303 in map.c).
In line 346 in map.c, I added 0, since I previously added the supplementary enum dma_data_direction direction argument to map_gpu_memory. In the file linux/dma-mapping.h, the value 0 corresponds to DMA_BIDIRECTIONAL (I am not sure choosing 0 is a good idea).

After these modifications, the module compiled successfully without any errors. However, when I tried the identify example, I got the following output:

ganapathi@ganapathi-desktop:~/Downloads/ssd-gpu-dma-master/build/bin$ lspci
0000:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1)
0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13)

ganapathi@ganapathi-desktop:~/Downloads/ssd-gpu-dma-master/build$ sudo ./bin/nvm-identify-userspace --ctrl=01:00.0
[sudo] password for ganapathi:
Resetting controller and setting up admin queues...
Failed to get number of queues
Goodbye!

enfiskutensykkel commented 4 years ago

After these modifications, the module has been successfully compiled without any error.

Great!

However, when I tried the identify example, I get the following output :

If you have loaded the module, then you should invoke the ./bin/nvm-identify example, not the userspace variant. The ctrl argument should be the character device in /dev (I believe it's likely to be /dev/libnvm0, you can also check the output in system log by running dmesg).

Apologies for the documentation, it isn't really up to date.

That being said, it seems to me that the DMA is not working properly. I'm not sure if it is possible to disable the IOMMU on Xavier, but it might be the case that it is always on. I don't think the module sets up DMA mappings properly when the IOMMU is enabled. You can look for DMA faults in the system log (dmesg).
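A small sketch of how the system log could be narrowed down to lines relevant to DMA faults. The keyword list is an assumption about what such messages contain, and filter_dma_faults is a hypothetical helper name.

```sh
#!/bin/sh
# Filter kernel log text on stdin for lines mentioning the IOMMU/SMMU or a fault.
# Assumption: these keywords are only a guess at how a DMA fault is reported.
filter_dma_faults() {
    # "|| true" keeps the function from failing when nothing matches.
    grep -iE 'iommu|smmu|fault' || true
}

# Usage: dmesg | filter_dma_faults
```

An empty /proc/cmdline match does not prove the IOMMU is off, so checking the log itself is the more direct test.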

Additionally, I believe that some of the Tegras are not cache coherent. If the Xavier isn't either, then you might need to add code to flush the cache in the queue functions.

babouFomb commented 4 years ago

I loaded the module with sudo insmod libnvm.ko and then verified that it was loaded properly:

ganapathi@ganapathi-desktop:~/Downloads/ssd-gpu-dma-master/build/module$ lsmod
Module Size Used by
libnvm 12856 0

I ran the identify example with sudo ./nvm-identify --ctrl=/dev/nvme0n1, but I got no output:

ganapathi@ganapathi-desktop:~/Downloads/ssd-gpu-dma-master/build/bin$ sudo ./nvm-identify --ctrl=/dev/nvme0n1
ganapathi@ganapathi-desktop:~/Downloads/ssd-gpu-dma-master/build/bin$ sudo ./nvm-identify --ctrl=/dev/nvme0n1

I also checked whether the IOMMU is enabled with cat /proc/cmdline | grep iommu; the output is empty, so I suppose the IOMMU is disabled.

I looked for DMA faults with dmesg and saw this:

ganapathi@ganapathi-desktop:~/Downloads/ssd-gpu-dma-master/build/bin$ dmesg | grep dma
[ 1.023726] iommu: Adding device 2600000.dma to group 56
[ 1.302411] tegra-carveouts tegra-carveouts: vpr :dma coherent mem declare 0x00000000f0000000,134217728
[ 1.304883] tegra-gpcdma 2600000.dma: GPC DMA driver register 31 channels
[ 1.965973] tegradccommon 15200000.dc_common: dma mapping done
[ 1.978965] tegra-adma 2930000.adma: Tegra210 ADMA driver registered 16 channels
[ 5.752450] misc nvmap: cvsram :dma coherent mem declare 0x0000000050000000,4194304

enfiskutensykkel commented 4 years ago

--ctrl=/dev/nvme0n1

This is not the character device created by the libnvm module; it is the block device created by the built-in Linux NVMe driver. You need to unbind the NVMe driver from the device and then reload the libnvm driver.
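One way to see which driver currently owns the device is to read the driver symlink in sysfs. This is a sketch under the assumption that sysfs is mounted at /sys; bound_driver is a made-up helper, and its second argument exists only so it can be pointed at a test directory.

```sh
#!/bin/sh
# Print the name of the kernel driver bound to a PCI device, or "none".
# The second argument overrides the sysfs root and exists only to ease testing.
bound_driver() {
    bdf="$1"
    sysfs="${2:-/sys}"
    link="$sysfs/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last component is the name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Usage: bound_driver 0000:01:00.0
```

While the built-in driver is still attached this would be expected to print nvme.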

babouFomb commented 4 years ago

Sorry, I don't know how to unbind the driver for the NVMe and then reload the libnvm driver.

enfiskutensykkel commented 4 years ago

No problem,

$ echo -n "0000:01:00.0" > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind

Reloading libnvm can be done by going into the directory where you built the libnvm module and running make reload. Alternatively you can run rmmod libnvm and insmod libnvm.ko num_ctrls=64 again.

babouFomb commented 4 years ago

Thanks

I ran the echo command with sudo, but I get a permission denied error.

ganapathi@ganapathi-desktop:~$ sudo echo -n "0000:01:00.0" > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind
bash: /sys/bus/pci/devices/0000:01:00.0/driver/unbind: Permission denied

When I plugged my SSD into the Xavier M.2 Key M slot, I formatted it as ext4, mounted it on a directory (located in the root), and added the line UUID=268248a5-8602-4c83-8594-454d9d0ed011 /xavier_ssd ext4 defaults 0 2 to my fstab file so that the SSD is mounted automatically at each boot of the Xavier. I don't know if this explains why I got the permission denied message when trying to unbind the driver for the NVMe.

enfiskutensykkel commented 4 years ago

Yeah, it's most likely mounted. Try unmounting it (using umount). Be aware that if you have any data on the disk, the example programs might destroy it, so be sure to back up what you have beforehand.

babouFomb commented 4 years ago

My apologies for the late response, I was not working yesterday. I have unmounted the SSD, commented out the line in fstab, and rebooted the Xavier. Next, I tried to unbind the driver for the NVMe with

sudo echo -n "0000:01:00.0" > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind

but I still get a permission denied message:

bash: /sys/bus/pci/devices/0000:01:00.0/driver/unbind: Permission denied

enfiskutensykkel commented 4 years ago

I suspect the problem is that the output redirection (>) is performed by your own shell, not by sudo, so that only echo is run with elevated privileges. You could start a root shell (using sudo -s or sudo -i), or run sudo sh -c 'echo -n "0000:01:00.0" > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind'

After unbinding the NVMe driver and loading the libnvm module, please confirm running lspci -s 01:00.0 -vv. The last lines should indicate which driver is currently using the device.
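As an alternative to the sh -c form, the write can go through sudo tee, a common idiom that is not taken from this thread. The sketch below uses two hypothetical helper names, unbind_path and unbind_driver.

```sh
#!/bin/sh
# Print the sysfs unbind file for a PCI device given as domain:bus:device.function.
unbind_path() {
    printf '/sys/bus/pci/devices/%s/driver/unbind\n' "$1"
}

# Unbind the current driver. tee runs under sudo, so the write itself is privileged;
# with "sudo echo ... > file" the redirection is done by the unprivileged shell.
unbind_driver() {
    printf '%s' "$1" | sudo tee "$(unbind_path "$1")" > /dev/null
}

# Usage (detaches the kernel NVMe driver from the device):
#   unbind_driver 0000:01:00.0
```

Either form works for the same reason: the process that opens the sysfs file is the one running as root.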

babouFomb commented 4 years ago

With sudo sh -c 'echo -n "0000:01:00.0" > /sys/bus/pci/devices/0000\:01\:00.0/driver/unbind' the NVMe driver was unbound. After loading the libnvm module, the output of lspci -s 01:00.0 -vv confirms that the libnvm helper driver is in use:

0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981 (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd Device a801
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 32
Region 0: Memory at 1b40000000 (64-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: libnvm helper

However, running the ./bin/nvm-identify --ctrl=/dev/libnvm0 example causes the system to reboot. I tried to run this example twice and got the same thing: the system reboots after 2-3 minutes, and neither the mouse nor the keyboard can be used during this time.

enfiskutensykkel commented 4 years ago

It sounds to me like the system is crashing. Is it possible to run dmesg -w in a second terminal before running the example? Hopefully, you should get some sort of info from the system log before everything freezes up.

babouFomb commented 4 years ago

I ran dmesg -w in a second terminal before running the example, and the log file is attached below. I see some messages about nvgpu. However, the example was able to get the information from the SSD before the system crashed.

capture dmesg-before-running.txt

Now I want to run the CUDA example, so I ran make in the /ssd-gpu-dma-master/build/benchmarks/cuda directory, but I don't see any executable and make does not show any output.

enfiskutensykkel commented 4 years ago

You need to run make benchmarks in the build folder, and the programs should show up as nvm-latency-bench and nvm-cuda-bench. Both use CUDA, the former can be used to allocate memory on the GPU but the CPU is in control of the NVMe. The latter launches a CUDA kernel, where the GPU is in control of the NVMe. For the sake of debugging, I would recommend sticking to nvm-latency-bench for now.

Look at the output from running with the --help argument, IIRC the --gpu=0 argument should indicate using the GPU. I will have a look at the dmesg log once I arrive at the office.

enfiskutensykkel commented 4 years ago
[ 1394.317399] libnvm: loading out-of-tree module taints kernel.
[ 1394.319203] Adding controller device: 01:00.0
[ 1394.319994] Character device /dev/libnvm0 created (505.0)
[ 1394.320197] libnvm helper loaded
[ 1615.073433] nvgpu: 17000000.gv11b     gk20a_channel_timeout_handler:1556 [ERR]  Job on channel 507 timed out

It seems that this happens quite a while after loading the module, but I am unsure what causes it. I initially suspected an issue with memory corruption, but that doesn't seem to be the case, since it completed with what appears to be correct data in your screenshot above. I have access to a Xavier at work, but I doubt I will have time to look into this until after New Year's.

Any input or experience you have testing this out is very valuable to me, so I appreciate it.

babouFomb commented 4 years ago

I tried to reproduce the output seen in the screenshot above, but the system continues to crash. I saved the output of cat /var/log/syslog | grep nvgpu and cat /var/log/syslog | grep libnvm from several system crashes in the attached file syslog.txt. I also tried to run nvm-latency-bench with the arguments --ctrl=/dev/libnvm0 --gpu=0 --blocks=256 (and --blocks=1000), but the Xavier crashed.

I am not familiar with kernel logs and drivers, so I will read some documentation as well as the NVMe specification and try to understand the cause of this problem.

syslog.txt

enfiskutensykkel commented 4 years ago

Out of interest, could you try unloading the libnvm module/rebooting and running the nvm-identify-userspace version of the identify program with --ctrl=01:00.0 as argument? This version uses sysfs to interface with the SSD, and should rule out any programming errors I have made in the kernel module.

babouFomb commented 4 years ago

I rebooted the system, reloaded libnvm, and launched the nvm-identify-userspace example with --ctrl=01:00.0 as argument. Unfortunately, the system crashed again after 3 minutes; before it crashed, the message Resetting the controller .... appeared and disappeared. The output of cat /var/log/syslog | grep nvgpu is the same as before.

enfiskutensykkel commented 4 years ago

Sorry, I meant not loading the libnvm module at all. Just unbinding the kernel NVMe driver, so that lspci -s 01:00.0 -vv shows no driver using it, and then run the nvm-identify-userspace program.

babouFomb commented 4 years ago

Hi, I rebooted the system and unbound the kernel NVMe driver. lspci -s 01:00.0 -vv showed that no driver is in use. Then I ran nvm-identify-userspace --ctrl=01:00.0. I didn't see any output. As in the previous test, the system froze, and neither the keyboard nor the mouse worked. After 30 minutes I disconnected the power cable and reconnected it to restart the system.

20191220_102924

enfiskutensykkel commented 4 years ago

Thank you for testing this for me. I just realized that you have a SATA controller on the other PCI domain, so one last thing to test, in case the BDF parsing is wrong, is to do the same as above but with nvm-identify-userspace --ctrl=0000:01:00.0, while running dmesg -w in a second terminal.
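The lspci listing earlier in the thread shows a device at 01:00.0 in both domain 0000 and domain 0001, so the short form is ambiguous. Below is a sketch of expanding a short address to the full domain:bus:device.function form. The helper name normalize_bdf is hypothetical, and defaulting to domain 0000 is an assumption, not necessarily what the example program's parser does.

```sh
#!/bin/sh
# Expand a PCI address to the full domain:bus:device.function form.
# "01:00.0" becomes "0000:01:00.0"; an address that already has a domain is unchanged.
normalize_bdf() {
    case "$1" in
        *:*:*) printf '%s\n' "$1" ;;
        *:*)   printf '0000:%s\n' "$1" ;;
        *)     echo "invalid PCI address: $1" >&2; return 1 ;;
    esac
}

# Usage: nvm-identify-userspace --ctrl="$(normalize_bdf 01:00.0)"
```

Passing the full form removes any doubt about which of the two 01:00.0 devices is meant.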

You don't need to wait 30 minutes, I think if it freezes and does not immediately return, it's safe to assume that it has stalled.

I'm really not sure what is going wrong. It seems really strange that it stalls the system like this; I will have to take a look at it over New Year's. We could try adding print statements between the individual steps in the identify userspace example (examples/identify/userspace.c and examples/identify/common.c).

Again, thank you so much for testing it.

babouFomb commented 4 years ago

I did the same as above with nvm-identify-userspace --ctrl=0000:01:00.0 while running dmesg -w in a second terminal. As before, the system crashed a few seconds after printing the Resetting controller and setting ........ message. So I think the problem is in the step after resetting the controller. I will add print statements and try to locate the statement where it stalls.

dmesg-log-before-running.txt second-test-dmesg.txt

babouFomb commented 4 years ago

I added some print statements in examples/identify/common.c and examples/identify/userspace.c and opened two terminals side by side in order to see the output of dmesg -w in the terminal instead of saving it to a file (as I previously did).

In the last line of the attached file dmesg.jpg there is a context fault message, due to the IOMMU (I think); this message appeared and disappeared within a few seconds. Maybe the problem comes from the IOMMU (it seems to be enabled). However, the command cat /proc/cmdline | grep iommu does not print anything. So I ran dmesg | grep iommu, and the output is attached to this message.

The third file, identify.txt, shows the output of the nvm-identify-userspace example with some added print messages.

dmesg identify iommu.txt

enfiskutensykkel commented 4 years ago

So if the IOMMU is enabled, that explains why it hangs in the identify_ctrl function as it's waiting for DMA that never completes (due to IOMMU fault). I also saw in the attached logs that the IOMMU is enabled.

However, previously the identify operation did succeed (when you used the kernel module), but I also see in those logs that the IOMMU most likely was on (which is strange).

babouFomb commented 4 years ago

I will try to disable the SMMU for PCIe controller 0 (to which my SSD is connected, according to the screenshot showing the IOMMU context fault error message). Maybe that will solve the problem!

babouFomb commented 4 years ago

Hi,

To disable the SMMU for PCIe controller 0, I modified the device tree following the instructions in comment #4 of https://devtalk.nvidia.com/default/topic/1043746/jetson-agx-xavier/pcie-smmu-issue/ and then reflashed my Xavier board with the new device tree binary.

I verified that the SMMU is disabled by extracting the current device tree on my Xavier, and I found that there is no entry for the SMMU.

Next, I tried to run nvm-identify-userspace, but the board still freezes as in the previous tests, and I saw in the terminal that the example hangs in the identify_ctrl function. I took a screenshot of the dmesg -w output, and it seems that the SMMU is not completely disabled: the context fault now occurs in smmu1 instead of smmu0 (as was the case in last week's tests). I also now get other errors from the memory controller.

xavier_nvme

babouFomb commented 4 years ago

Hi,

Did you have time to look at what is going wrong on the Xavier platform? I posted the problem on the Nvidia forum ( https://devtalk.nvidia.com/default/topic/1069024/jetson-agx-xavier/pcie-smmu-issues-on-pcie-c0-with-an-nvme-ssd-connected-to-the-m-2-key-m-slot/ ). It seems that the Xavier AGX does not support the PCIe P2P protocol, which could explain the behavior shown.

enfiskutensykkel commented 4 years ago

Hi, sorry for the late reply.

Yes, I have discussed this with one of my colleagues that is more familiar with Tegras/Xavier than me, and yes, I don't think it is possible to disable the SMMU/IOMMU, which is going to disrupt peer-to-peer DMA. Some time in the future, I will look into using the IOMMU API for the kernel module (SmartIO/SISCI already supports this, which is why it is not prioritized), but don't expect this to be soon.

If you can live with the limitations of not using peer-to-peer, @cooldavid implemented VFIO support for the identify controller example: https://github.com/enfiskutensykkel/ssd-gpu-dma/pull/23

babouFomb commented 4 years ago

Hi,

I tested the implementation mentioned in #23 but my Xavier still continues to freeze.

I can live with the limitations of not using peer-to-peer. In fact, I want to be able to perform DMA transfers between the SSD and Xavier system memory (since access to the GPU memory is seemingly not possible on the Xavier).

If GPU memory is not used, there is no need for the module (as it is the part that contains the Nvidia code using peer-to-peer). Would it be possible to modify the sources of the ssd-gpu-dma project to perform DMA transfers between the CPU and the SSD without using peer-to-peer at all?

I'm not familiar with writing drivers, but I have read some documents on the NVMe specification. If I keep the default NVMe driver loaded, would it be possible to do something similar to the CUDA example, but without using GPU memory? I mean interacting with the NVMe controller and performing DMA transfers through the different stages shown in the figure below.

image

enfiskutensykkel commented 4 years ago

It should be possible, but this is exactly what the identify example does, and the system freezes. It may be your modifications to the kernel module; you can try recompiling it without CUDA support and without the modified calls to the various Nvidia functions.

The problem is the IOMMU/SMMU: If it isn't possible to disable, then there must be some code that sets up the correct mappings so that the I/O addresses are translated into the correct physical addresses.

However, if the VFIO example also caused the system to freeze, there must be something else at fault. In this case, my kernel module should not be in use at all and the Linux kernel should be able to set up the correct IOMMU groups. I don't know what is wrong in this case.