Open cmoore1776 opened 2 years ago
Thanks for reporting this. For reproduction step 3
fpga-load-local-image -S 0 -I agfi-xxxxxSOMExIDxxxxx
Does the image loaded specify a device ID as per https://github.com/aws/aws-fpga/blob/4750aacb4dac9d464b099b27e4337220cf0b0713/hdk/cl/examples/cl_dram_dma_hlx/README.md#create-example-design-gui ?
set ::env(device_id) "0xF001"
set ::env(vendor_id) "0x1D0F"
set ::env(subsystem_id) "0x1D51"
set ::env(subsystem_vendor_id) "0xFEDC"
For example, the cl_dram_dma example is configured to use 0xf001
If so, which device_id is specified?
Does the image loaded specify a device ID as per https://github.com/aws/aws-fpga/blob/4750aacb4dac9d464b099b27e4337220cf0b0713/hdk/cl/examples/cl_dram_dma_hlx/README.md#create-example-design-gui ?
Yes, 0xf001. This report is based on using the device_id provided in the example.
I think I understand the issue, so let me rephrase.
When following the steps in the HOW TO, setting a device ID of "0xF001"
and then running the udev permission script, the PCIe device does not have the permissions properly applied.
Therefore
Notes:
export AWS_FPGA_ALLOW_NON_ROOT=y
to the setup step
export AWS_FPGA_SDK_OTHERS=y
to the setup step
Hello @shamelesscookie,
I have been trying to reproduce the issue you described, along with the fix in PR #561. I haven't been able to reproduce the device permissions you list under step 4.
[centos@ip-172-31-83-184 ~]$ ls -lah /sys/devices/pci0000\:00/0000\:00\:1d.0/resource*
-r--r--r-- 1 root root 4.0K Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource
-rw------- 1 root root 32M Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource0
-rw------- 1 root root 2.0M Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource1
-rw------- 1 root root 64K Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource2
-rw------- 1 root root 64K Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource2_wc
-rw------- 1 root root 128G Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource4
-rw------- 1 root root 128G Jun 15 00:47 /sys/devices/pci0000:00/0000:00:1d.0/resource4_wc
[centos@ip-172-31-83-184 ~]$ sudo udevadm info -a -p /devices/pci0000:00/0000:00:1d.0 | grep "ATTR{device}"
ATTR{device}=="0xf001"
Are you using any environment variables that are not listed in your reproduction steps?
As a note, I have been using the public cl_dram_dma AGFI (agfi-0b5c35827af676702) with a PCI Device ID of 0xF001.
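For anyone comparing results, the two checks above (resource file permissions and the PCI device ID) can be collected in one step. This is only a minimal sketch: the function name `report_fpga_perms` is made up for illustration, it reads the device ID straight from the sysfs `device` attribute instead of going through udevadm, and it assumes GNU `stat`.

```shell
# Sketch only: print the PCI device ID and the octal mode of each resource file
# for one sysfs device directory. report_fpga_perms is a hypothetical helper name.
report_fpga_perms() {
    dev_dir=$1
    # The sysfs "device" attribute holds the PCI device ID, e.g. 0xf001.
    printf 'device_id=%s\n' "$(cat "$dev_dir/device")"
    for f in "$dev_dir"/resource*; do
        # Skip the unexpanded glob if the directory has no resource files.
        [ -e "$f" ] || continue
        # GNU stat: %a is the octal mode, %n the file name.
        stat -c '%a %n' "$f"
    done
}

# Example (path taken from the output above):
# report_fpga_perms /sys/devices/pci0000:00/0000:00:1d.0
```

Running it before and after loading the image gives two short listings that can be pasted into the thread and compared directly.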
Hello!
Is there anything AWS can do to help resolve this issue? If the issue is already resolved, we would like to know the resolution.
Thank you!
Summary
The udev rule created by add_udev_rules.sh does not match the device ID used after loading an FPGA image.
The rule, which is deployed to /etc/udev/rules.d/9999-presistent-fpga.rules, only matches on:
but it needs to also match on:
Reproduction steps
/sys/devices/pci0000:00/0000:00:1d.0/resource* are 444
Also note the device ID after loading the image:
Fix
Add the following two lines to /etc/udev/rules.d/9999-presistent-fpga.rules:
After loading an image, permissions are 666:
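A quick way to tell whether a deployed rules file already has a clause for a given device ID is to search it for the match expression. This is a rough sketch, not part of the fix itself: `rules_match_device` is a hypothetical helper, and it assumes the rule uses the same `ATTR{device}=="..."` form that `udevadm info` prints in the comments above.

```shell
# Sketch only: succeed if a udev rules file contains a match clause for the
# given PCI device ID. rules_match_device is a hypothetical helper name, and the
# ATTR{device}=="..." form is an assumption based on the udevadm output above.
rules_match_device() {
    rules_file=$1
    device_id=$2
    # -F: treat the pattern as a fixed string; -i: hex digits may be either case.
    grep -qiF "ATTR{device}==\"$device_id\"" "$rules_file"
}

# Example:
# rules_match_device /etc/udev/rules.d/9999-presistent-fpga.rules 0xf001 \
#     || echo "no rule matches device ID 0xf001"
```

If the check fails for the device ID reported after loading the image, the rule cannot apply to that device, which is consistent with the behavior described in the summary.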