Open WoutLegiest opened 9 months ago
@WoutLegiest can you post the bazel rule as you've currently set it up? Or perhaps push a draft PR with the changes so we can take a closer look?
I was reading some docs at https://xilinx.github.io/XRT/master/html/xrt_native_apis.html and see
XCL_DRIVER_DLLESPEC xrtDeviceHandle xrtDeviceOpenByBDF(const char *bdf)
bdf: PCIe BDF identifying the device to open
One thing we can try is passing in the PCIe BDF (e.g., "0000:03:00.1" from the docs).
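Before handing a BDF string to xrtDeviceOpenByBDF, it may be worth ruling out a malformed address on the Rust side. The following is only a sketch: the `Bdf` struct and `parse_bdf` helper are hypothetical names, not part of XRT, and it assumes the full `DDDD:BB:DD.F` hexadecimal form shown in the docs example (the short `BB:DD.F` form is deliberately rejected).

```rust
/// A parsed PCIe address: domain, bus, device, function.
/// Hypothetical helper type, not part of the XRT API.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct Bdf {
    pub domain: u16,
    pub bus: u8,
    pub device: u8,
    pub function: u8,
}

/// Parse the full `DDDD:BB:DD.F` form (all fields hexadecimal).
/// Returns `None` for anything else, including the short `BB:DD.F` form.
pub fn parse_bdf(s: &str) -> Option<Bdf> {
    // Split "0000:03:00.1" into "0000", "03", and "00.1".
    let mut parts = s.split(':');
    let (domain, bus, rest) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() {
        return None;
    }
    let (device, function) = rest.split_once('.')?;

    // Enforce the fixed field widths and hex-only digits.
    let is_hex = |f: &str, len: usize| f.len() == len && f.chars().all(|c| c.is_ascii_hexdigit());
    if !(is_hex(domain, 4) && is_hex(bus, 2) && is_hex(device, 2) && is_hex(function, 1)) {
        return None;
    }

    let bdf = Bdf {
        domain: u16::from_str_radix(domain, 16).ok()?,
        bus: u8::from_str_radix(bus, 16).ok()?,
        device: u8::from_str_radix(device, 16).ok()?,
        function: u8::from_str_radix(function, 16).ok()?,
    };
    // PCI allows 32 devices per bus and 8 functions per device.
    if bdf.device > 0x1f || bdf.function > 7 {
        return None;
    }
    Some(bdf)
}
```

Calling `parse_bdf("0000:03:00.1")` before the FFI call would at least separate "bad address string" from "device could not be opened".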
A bit more digging, and I expect there is a way to communicate with a PCI-connected device via standard unix filesystem mounts.
Reading the Linux kernel docs, https://docs.kernel.org/PCI/sysfs-pci.html, I suspect the device might be accessible at one of
/sys/devices/xxxx
/sys/bus/pci/devices/xxxx
If this is exposed as a traditional UNIX file, you might be able to make it work by adding these /sys/ paths to bazel's --sandbox_writable_path flag (maybe even adding --sandbox_writable_path=/sys, but first I'd check to see if you can locate the exact device's location on the file system).
You might also try running findmnt to get the list of all mounted filesystems, and see if the FPGA is in there.
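To check what the test process can actually see, a small probe could be run both outside and inside the sandbox and the results compared. This is a rough sketch under assumptions: the helper names are hypothetical, the BDF passed in is whatever the card reports on the machine in question, and it only tests visibility of the sysfs entry, not whether XRT itself needs that path.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Build the sysfs entry for a PCI device,
/// e.g. /sys/bus/pci/devices/0000:03:00.1.
pub fn sysfs_device_path(bdf: &str) -> PathBuf {
    Path::new("/sys/bus/pci/devices").join(bdf)
}

/// Resolve the entry (it is normally a symlink into /sys/devices/...)
/// and list the file names inside the device directory.
/// Any error means the path is missing or not readable from this process.
pub fn probe_device_dir(path: &Path) -> io::Result<(PathBuf, Vec<String>)> {
    let real = fs::canonicalize(path)?;
    let mut names = Vec::new();
    for entry in fs::read_dir(&real)? {
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok((real, names))
}

/// Format the Bazel flag that makes a path writable inside the sandbox.
pub fn sandbox_flag(path: &Path) -> String {
    format!("--sandbox_writable_path={}", path.display())
}
```

If `probe_device_dir(&sysfs_device_path(bdf))` succeeds outside the sandbox, the resolved path it returns is the one to pass to `sandbox_flag`; if it then fails inside `bazel test`, the sandbox is hiding the device directory.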
If that doesn't work, I wonder if we could find a minimal working example that I could set up with my machine, by plugging some other (cheap) FPGA into my desktop and tinkering with it. Would the XRT API work with every FPGA?
I already tried the xrtDeviceOpenByBDF function, with the same result. I also found the device location in the /sys/ folder and added it to the sandbox path, without any effect.
I found that it is possible to run the xbutil program from an MLIR file, so talking to the FPGA from inside the bazel sandbox is possible and works correctly. The problem is possibly introduced by the Rust -> C -> FPGA call chain.
If that doesn't work, I wonder if we could find a minimal working example that I could set up with my machine, by plugging some other (cheap) FPGA into my desktop and tinkering with it. Would the XRT API work with every FPGA?
The XRT library is designed for the AMD Alveo cards, which are all PCIe cards that can be plugged into any PC. More specifically, the U55C, U250, and U280 are cards with a large FPGA on them; sadly, none of them are cheap.
In a new end-to-end example, we try to evaluate tfhe-rs-bool code on an FPGA. To communicate with a Xilinx Alveo FPGA, we use the XRT library. There is a problem during the initialisation of the FPGA device in the Rust code.
Note: All functions that start with xrt are calls to the C API of the XRT library.
When this test is run with bazel test ..., we get the following output:
The problem already starts with the xrtDeviceOpen function, which won't execute. The XRT library is clearly started from the Rust code, but from then on it is unclear what happens. The error messages are generated by the XRT library, and bazel and/or cargo cannot process them.
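One way to get a clearer failure on the Rust side is to turn a failed open into an explicit error instead of continuing with an invalid handle. The sketch below does not link against XRT: `OpenByIndex` is an assumed shape for xrtDeviceOpen (device index in, opaque handle out, NULL on failure), `open_checked` and `fake_open_fails` are hypothetical names, and whether XRT sets errno on failure is also an assumption, so the OS error it prints may be unrelated.

```rust
use std::ffi::c_void;
use std::io;
use std::ptr::NonNull;

/// Assumed shape of an "open device by index" C function:
/// returns an opaque handle, or NULL on failure.
pub type OpenByIndex = unsafe extern "C" fn(index: u32) -> *mut c_void;

/// Call the C function and convert a NULL handle into a Rust error
/// so the test fails with a readable message.
pub fn open_checked(open: OpenByIndex, index: u32) -> Result<NonNull<c_void>, String> {
    // SAFETY: the caller guarantees `open` is a valid function with this signature.
    let raw = unsafe { open(index) };
    NonNull::new(raw).ok_or_else(|| {
        format!(
            "device open failed for index {}; last OS error: {}",
            index,
            io::Error::last_os_error()
        )
    })
}

/// Stand-in for a failing driver call, used only to exercise the wrapper.
pub unsafe extern "C" fn fake_open_fails(_index: u32) -> *mut c_void {
    std::ptr::null_mut()
}
```

With the real binding substituted for `fake_open_fails`, `open_checked(..)` returning `Err` would confirm that the call does execute and returns NULL, as opposed to crashing or hanging before it returns.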