HSAFoundation / HSA-Drivers-Linux-AMD

These drivers have been superseded by ROCm Platform now hosted at Radeon Open Compute GitHub Repo
https://github.com/RadeonOpenCompute

HSA and High speed RDMA supported network devices (eg.: Mellanox ConnectX-2 VPI PCIe 2.0 5GT/s - IB QDR / 10GigE (MT26428)) #6

Closed zpodlovics closed 9 years ago

zpodlovics commented 10 years ago

Probably a bit of an unusual request, but I would like to use an RDMA-capable card (MT26428) with HSA to provide accelerated packet and message processing. However, the interop between the IOMMU and the RDMA device is not yet clear. The KFD requires a complete view and control of physical memory; otherwise it would be impossible to pass pointers between the GPU and CPU. However, parts of the memory could also be read or written by RDMA-enabled devices. These devices usually provide some kind of user-level pinned memory buffer for communication, and the operating system mediates between the program and the RDMA-capable device. The OS is responsible for all the API calls that set up the memory regions that can be RDMA-read or RDMA-written. These pinned memory regions are also registered with the RDMA-capable card. A memory region could be accessible by the CPU, the GPU, and the RDMA-capable card, so the KFD needs to know about these RDMA regions.

https://stackoverflow.com/questions/18755999/remote-direct-memory-access-and-os

In theory this method could also be used for APU <-> APU communication to distribute work between multiple APUs.

gstoner commented 10 years ago

It is not a strange request. Let me talk with AMD Research, who are working with the National Labs here in the States. RDMA support is on our roadmap as a feature we need to support. Let me get back to you on Monday.

greg

zpodlovics commented 10 years ago

I have added the MT26428 card to my computer and tried to boot using the default kernel driver. Most of the time it just hangs, but once I was able to capture a screenshot of another error (attached).

(Screenshot: kgd2kfd_device_init_hang_1024x768)

After installing the MLNX_OFED_LINUX-2.3-1.0.1-ubuntu14.04-x86_64 distribution I managed to boot one (and only one) time; the dmesg output is attached. While running, the machine constantly writes the following message to the kernel log:

AMD-Vi: Completion-Wait loop timed out

https://gist.github.com/zpodlovics/b7e43ad0a39b509b6aa1

After this I am unable to boot the machine correctly; it just hangs on boot.

ogabbay commented 10 years ago

Hi, I have a question. What happens if you load the kernel that came with the OFED distribution (without kfd) and the IOMMU is enabled in the BIOS? Do you still get those error messages during boot?

zpodlovics commented 10 years ago

I have tried it with the additional modprobe.blacklist=radeon_kfd boot parameter; the results of the boots are listed below (all of them used the modprobe.blacklist=radeon_kfd boot parameter). I will try it later with netconsole; maybe it will give more detailed logs for the failed boots.

3 boots failed and hung at

[drm] Initialized radeon 2.37.0 20080828 for 0000:00:01.0 on minor 0

1 boot failed and hung at

mlx4_core 0000:01:00.0 DMFS high rate mode not supported

1 partially successful boot, but with lots of IO_PAGE_FAULT messages like this; the full dmesg is available at: https://gist.github.com/zpodlovics/4254c34609416c16b257

 AMD-Vi: Event logged [IO_PAGE_FAULT device=00:07.2 domain=0x0000 address=0x0000000000000040 flags=0x0050]

2 successful boots: https://gist.github.com/zpodlovics/355cffbf870941197417 https://gist.github.com/zpodlovics/b94a7c9177b2f158425d
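When comparing several dmesg captures like the ones linked above, it can help to count the IO_PAGE_FAULT events per device instead of reading them by eye. The following is a minimal sketch that assumes the event lines have exactly the format quoted in this comment; the function names and the SAMPLE text are illustrative only and are not part of any driver or tool.

```python
import re
from collections import Counter

# Matches the AMD-Vi event format quoted in this thread (assumed to be stable).
FAULT_RE = re.compile(
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT "
    r"device=(?P<device>[0-9a-fA-F:.]+) "
    r"domain=(?P<domain>0x[0-9a-fA-F]+) "
    r"address=(?P<address>0x[0-9a-fA-F]+) "
    r"flags=(?P<flags>0x[0-9a-fA-F]+)\]"
)


def parse_io_page_faults(log_text):
    """Return one dict per IO_PAGE_FAULT event found in log_text."""
    faults = []
    for match in FAULT_RE.finditer(log_text):
        faults.append({
            "device": match.group("device"),
            "domain": int(match.group("domain"), 16),
            "address": int(match.group("address"), 16),
            "flags": int(match.group("flags"), 16),
        })
    return faults


def faults_per_device(log_text):
    """Count IO_PAGE_FAULT events per PCI device ID."""
    return Counter(fault["device"] for fault in parse_io_page_faults(log_text))


# Illustrative input built from log lines quoted in this thread.
SAMPLE = (
    "mlx4_core 0000:01:00.0 DMFS high rate mode not supported\n"
    "AMD-Vi: Event logged [IO_PAGE_FAULT device=00:07.2 domain=0x0000 "
    "address=0x0000000000000040 flags=0x0050]\n"
    "AMD-Vi: Completion-Wait loop timed out\n"
)

if __name__ == "__main__":
    for device, count in faults_per_device(SAMPLE).items():
        print(device, count)
```

Grouping by the device field shows which PCI device the IOMMU attributes each fault to, which may or may not be the adapter under suspicion.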

ogabbay commented 10 years ago

My current thinking is that this has nothing to do with the kfd driver itself, but with the interaction between an enabled IOMMU and the Mellanox adapter. I assume that if you disable the IOMMU in the BIOS, you won't have any issues booting, correct? Maybe if the IOMMU operates in pass-through mode, it won't intercept the transactions made by the Mellanox adapter. You can enable this mode by adding "iommu=pt" to the kernel command line in the GRUB menu.
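After editing the GRUB entry it is easy to lose track of which parameters a given boot actually used. On Linux the active command line can be read from /proc/cmdline; the sketch below only parses such a string and reports whether iommu=pt and a modprobe.blacklist entry are present. The helper names and the sample command line are hypothetical, and the naive whitespace split does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None.

    If a key appears more than once, the last occurrence wins in this sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def iommu_passthrough_requested(cmdline):
    """True if the command line contains iommu=pt."""
    return parse_cmdline(cmdline).get("iommu") == "pt"


def blacklisted_modules(cmdline):
    """Return the modules listed in modprobe.blacklist=, or an empty list."""
    value = parse_cmdline(cmdline).get("modprobe.blacklist")
    return value.split(",") if value else []


# Hypothetical command line; on a real system use open("/proc/cmdline").read().
SAMPLE_CMDLINE = "root=/dev/sda1 ro quiet iommu=pt modprobe.blacklist=radeon_kfd"

if __name__ == "__main__":
    print("pass-through requested:", iommu_passthrough_requested(SAMPLE_CMDLINE))
    print("blacklisted modules:", blacklisted_modules(SAMPLE_CMDLINE))
```

Recording this output next to each dmesg capture makes it clearer which boot attempt used which combination of options.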

zpodlovics commented 10 years ago

I have tried a few boot options:

1) Disabled IOMMU: hangs at

kfd kfd: error getting iommu info. is the iommu enabled?
[drm] Initialized radeon 2.37.0 20080828 for 0000:00:01.0 on minor 0

2) Disabled IOMMU and modprobe.blacklist=radeon_kfd

boots correctly

3) Enabled IOMMU and iommu=pt: hangs at

radeon: 0000:00:01.0 registered panic notifier

4) Enabled IOMMU and recovery nomodeset iommu=pt

The recovery console usually works: https://gist.github.com/zpodlovics/69e48c1baa5c05d8a729

But out of 3 tries, resuming normal startup twice showed a blank screen and then hung or crashed, and once showed a half-initialized desktop GUI (hung before the full refresh) that did not react to input.

There is probably some state dependency in the system: after the successful boot in 4) I also tried to boot with Enabled IOMMU and iommu=pt, and it showed a low-resolution blank screen with a huge mouse cursor at the center. The cursor's size, shape, and colors look like those of the mouse cursor in the BIOS, but I am not 100% sure about this (I'll compare the cursors later). After this hang, I could not successfully boot the system with the Enabled IOMMU and iommu=pt settings again, even when I retried the 4) -> 3) sequence.

gstoner commented 10 years ago

I talked to our Research team, who are using InfiniBand in Kaveri systems. We have HSA-enabled Kaveri systems with Mellanox ConnectIB HCAs that are working in the lab. They are known to work and have been run with the Pre-Alpha and Alpha releases and the upcoming Beta HSA release. For the InfiniBand drivers we are also using MLNX_OFED_LINUX-2.3-1.0.1-ubuntu14.04-x86_64. They did not have to do anything special after the Mellanox installation to get things to work. The generic kernel they are using is 3.14.0-031499-generic.

One difference: they are using ConnectIB cards, whereas you are using ConnectX-2 HCAs. We know that the two cards use different kernel driver modules. What I suspect is that there is a bug in the Mellanox kernel driver module. I have seen other issues with this card and Intel's IOMMU.

I should make it clear that the HSA runtime and driver require the IOMMU; it is not optional.

Greg


pblinzer commented 10 years ago

To follow up here (there was a similar issue in some other forum thread):

Among other things, an IOMMU tracks and blocks accesses to physical memory that have not been authorized by either the OS or a hypervisor. When available, Linux uses the IOMMU for a feature called DMA redirection (which is another phrase for DMA isolation) that blocks DMA accesses to arbitrary memory by device drivers (causing the messages above). This is to improve robustness so that a driver can't accidentally or intentionally manipulate physical memory it shouldn't.

The drivers (typically network or storage drivers doing RDMA) must use the DMA API to map the memory they'd like to access, which vets the access with the OS beforehand, in order to avoid the messages above. I suspect that the network drivers causing the problems here do not follow these requirements. This is not an issue with the IOMMU, the KFD, or HSA; it just uncovers "bad" system behavior by some other drivers.

Here's the documentation on the kernel interface that needs to be used to clear the memory access: https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt
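The isolation behavior described in this comment can be illustrated with a toy model. This is only a conceptual sketch in Python, not how the kernel or the hardware implements it: the ToyIommu class, its method names, and the addresses are made up for illustration, and real drivers would use the C interfaces described in the linked HOWTO (for example dma_map_single()).

```python
class ToyIommu:
    """Toy model: a device may only DMA into ranges that were mapped for it."""

    def __init__(self):
        self._mappings = {}   # device id -> list of (start, end), end exclusive
        self.fault_log = []   # (device, address) for every blocked access

    def dma_map(self, device, start, length):
        """Stand-in for a driver mapping a buffer through the DMA API."""
        self._mappings.setdefault(device, []).append((start, start + length))

    def dma_unmap(self, device, start, length):
        """Remove a mapping previously created with dma_map."""
        self._mappings[device].remove((start, start + length))

    def access(self, device, address):
        """Return True if the access is allowed; otherwise log a fault."""
        for start, end in self._mappings.get(device, []):
            if start <= address < end:
                return True
        self.fault_log.append((device, address))
        return False


if __name__ == "__main__":
    iommu = ToyIommu()
    iommu.dma_map("01:00.0", 0x1000, 0x1000)   # driver maps one 4 KiB buffer
    print(iommu.access("01:00.0", 0x1800))     # inside the mapped buffer
    print(iommu.access("01:00.0", 0x40))       # never mapped, so it is blocked
    print(iommu.fault_log)
```

In this model a blocked access plays the role of an IO_PAGE_FAULT event: the access itself is refused and only a log entry remains, which is the robustness property described above.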

zpodlovics commented 10 years ago

Please note that SR-IOV also requires IOMMU support, and it seems the same ConnectX-2 chips/cards (MT26428 in the first publication and MHRH19B-XTR in the second) work fine with other IOMMU implementations. In fact, almost a year ago I tested the same card type with another IOMMU implementation on Scientific Linux and it worked fine. Maybe it is a driver issue, maybe a BIOS issue, I don't know. But still, I need a working and stable system.

Adit Ranadive, Ada Gavrilovska, Karsten Schwan: Distributed Resource Exchange: Virtualized Resource Management for SR-IOV InfiniBand Clusters http://www.cc.gatech.edu/~adit262/docs/DRX-Cluster2013.pdf

Tiago Pais Pitta de Lacerda Ruivo, Gerard Bernabeu Altayo, Gabriele Garzoglio, Steven Timm, Hyun Woo Kim, Seo-Young Noh, Ioan Raicu: Efficient High-Performance Computing with Infiniband Hardware Virtualization http://datasys.cs.iit.edu/reports/2014_IIT_virtualization-fermicloud.pdf

gstoner commented 10 years ago

We understand you need a working system. Do you have another InfiniBand card you can test with, so we can isolate where the issue is? Have you talked to Mellanox about this issue? It could also be the firmware on the card itself.


zpodlovics commented 10 years ago

Thanks for your help, I really appreciate it. I have created a thread on the Mellanox Community site; it is available here:

http://community.mellanox.com/thread/1820

zpodlovics commented 10 years ago

According to Jörg Rödel in this LKML thread [1] [2], the following message may indicate a hardware bug:

AMD-Vi: Completion-Wait loop timed out

"I think it falls under Erratum 455 (which does not mention IOMMU specifically). Point is, there is a hardware workaround for this to make the IOMMU work, but your BIOS does not enable it."

[1] https://lkml.org/lkml/2013/1/20/24 [2] https://lkml.org/lkml/2013/1/20/46

gstoner commented 10 years ago

I have sent an inquiry to our engineering team about this.

greg

zpodlovics commented 9 years ago

The bug looks fixed now thanks to the new BIOS (version: 2102, date: 01/21/2015); some errata workarounds were probably applied there. Unfortunately I had run into a firmware update failure with this card earlier, but it has now recovered. Surprisingly, before the BIOS update the machine was also unable to boot Windows 2012 with IOMMU enabled, using the newest drivers that still support CX2. The machine now boots Linux and Windows without issue, and the card works as expected in both IB and ETH mode.

Thanks for your help and keep up the good work!