Closed nocduro closed 6 years ago
Currently modifying a .dts file to follow: https://github.com/altera-opensource/linux-socfpga/blob/master/Documentation/devicetree/bindings/fpga/altera-hps2fpga-bridge.txt
Using register values from: https://www.altera.com/hps/cyclone-v/hps.html#topic/sfo1418687413697.html
edit: didn't work. trying 3.x kernel now
Looks like Intel doesn't distribute software with known exploits, so they nuked all of the 3.x kernels from their repo because of meltdown / spectre exploits... Luckily thinkoco has a fork in their repository that is still up.
I thought I almost got the 4.15 kernel to work, but all of my OpenCL programs 'freeze' when running. The weird thing is that aocl diagnose works perfectly! All it does is check that it can make OpenCL buffers and read from / write to the OpenCL device. The buffers work, but the OpenCL kernels (different from the Linux kernel; they are basically functions) do not? They just sit there and never return.
The way FPGAs are handled in Linux changed at some point, so the reprogram binary provided with the Intel RTE tools is unable to disable/re-enable the FPGA bridges in 4.x. Bridges are how the HPS communicates with the FPGA, and there are 4 'built-in' bridges: hps2fpga, lwhps2fpga, fpga2hps, and fpga2sdram (the names the kernel registers in the boot log further down).
And these are listed under /sys/class/fpga_bridge/[br0 | br1 | br2 | br3]
There is also /sys/class/fpga_manager/fpga0:
$ cat /sys/class/fpga_manager/fpga0/name
Altera SOCFPGA FPGA Manager
$ cat /sys/class/fpga_manager/fpga0/state
operating
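A quick way to see what the 4.x kernel exposes for the bridges is to walk the class directory and print each entry. This is only a sketch: it assumes each bridge directory has `name` and `state` files like the fpga_manager entry above does, and the function name `list_fpga_bridges` and its optional root argument are made up for illustration.

```shell
#!/bin/sh
# List every FPGA bridge the kernel has registered, with its name and state.
# The root directory is a parameter so the function can be pointed at a copy
# of the tree; it defaults to the real /sys.
# ASSUMPTION: each bridge directory contains `name` and `state` files.
list_fpga_bridges() {
    root="${1:-/sys}"
    for dev in "$root"/class/fpga_bridge/*; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dev" ] || continue
        name=$(cat "$dev/name" 2>/dev/null || echo '?')
        state=$(cat "$dev/state" 2>/dev/null || echo '?')
        printf '%s %s %s\n' "${dev##*/}" "$name" "$state"
    done
}

# With no argument this reads the live /sys tree (prints nothing if no bridges exist).
list_fpga_bridges
```

On the board this should print one line per br0..br3 entry, which would show which brN corresponds to which named bridge.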
Here's some of the output from my test program when I don't set the CL_CONTEXT_COMPILER_MODE_INTELFPGA env variable:
root@DE10_NANO:~/ocl# ./opencl triv2.aocx
binary file len is: 2602196
platform name: Intel(R) FPGA SDK for OpenCL(TM)
Device(DeviceId(0xb54752a8))
platform found! Platform(PlatformId(0xb5475260)): "Intel(R) FPGA SDK for OpenCL(TM)"
Reprogramming device [0] with handle 1
sh: 1: cannot create /sys/class/fpga-bridge/fpga2hps/enable: Directory nonexistent
sh: 1: cannot create /sys/class/fpga-bridge/hps2fpga/enable: Directory nonexistent
sh: 1: cannot create /sys/class/fpga-bridge/lwhps2fpga/enable: Directory nonexistent
Couldn't open FPGA status from /sys/class/fpga/fpga0/status!
sh: 1: cannot create /sys/class/fpga-bridge/fpga2hps/enable: Directory nonexistent
sh: 1: cannot create /sys/class/fpga-bridge/hps2fpga/enable: Directory nonexistent
sh: 1: cannot create /sys/class/fpga-bridge/lwhps2fpga/enable: Directory nonexistent
Reprogram FAILED
mmd program_device: Board reprogram failed
mem_rd_que: Device(DeviceId(0xb54752a8)) - OpenclVersion { ver: [1, 0] }
MMD FATAL: acl_mmd.cpp:59: can't find handle -1 -- aborting
opencl: acl_mmd.cpp:59: ACL_MMD_DEVICE* get_mmd_device(int): Assertion `0' failed.
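The errors in that log come from reprogram writing to /sys/class/fpga-bridge/... and /sys/class/fpga/..., while the 4.x kernel exposes /sys/class/fpga_bridge and /sys/class/fpga_manager instead. A small check like the following could tell which layout a given image has before running anything; the function name and the `legacy` / `fpga_manager` / `none` labels are my own, not something from the RTE.

```shell
#!/bin/sh
# Report which FPGA sysfs layout is present under the given root (default /sys).
#   legacy       -> class/fpga-bridge exists (the paths reprogram tries to write to)
#   fpga_manager -> class/fpga_manager exists (what the 4.15 kernel exposes)
#   none         -> neither directory was found
detect_fpga_sysfs_layout() {
    root="${1:-/sys}"
    if [ -d "$root/class/fpga-bridge" ]; then
        echo legacy
    elif [ -d "$root/class/fpga_manager" ]; then
        echo fpga_manager
    else
        echo none
    fi
}

detect_fpga_sysfs_layout
```

If this prints `fpga_manager`, the stock reprogram binary will hit the "Directory nonexistent" errors shown in the log.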
With CL_CONTEXT_COMPILER_MODE_INTELFPGA=1:
root@DE10_NANO:~/ocl# ./opencl triv2.aocx
binary file len is: 2602196
platform name: Intel(R) FPGA SDK for OpenCL(TM)
Device(DeviceId(0xb54752a8))
platform found! Platform(PlatformId(0xb5475260)): "Intel(R) FPGA SDK for OpenCL(TM)"
mem_rd_que: Device(DeviceId(0xb54752a8)) - OpenclVersion { ver: [1, 0] }
buffer created
kernel created
kernel sent to opencl
And it just sits there.
With CL_CONTEXT_COMPILER_MODE_INTELFPGA=3 (I think I have the correct .rbf file for this mode loaded):
root@DE10_NANO:~/ocl# ./opencl triv2.aocx
binary file len is: 2602196
platform name: Intel(R) FPGA SDK for OpenCL(TM)
Device(DeviceId(0xb56752a8))
platform found! Platform(PlatformId(0xb5675260)): "Intel(R) FPGA SDK for OpenCL(TM)"
mem_rd_que: Device(DeviceId(0xb56752a8)) - OpenclVersion { ver: [1, 0] }
buffer created
kernel created
kernel sent to opencl
same as with =1
😕
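Since two of the three modes hang instead of failing, it can help to run each mode under a time limit so the comparison can be scripted. This sketch relies on the coreutils `timeout` command, which exits with status 124 when it has to kill the program; the `run_mode` helper and the time limit are arbitrary choices of mine.

```shell
#!/bin/sh
# Run a command with CL_CONTEXT_COMPILER_MODE_INTELFPGA set to the given value
# ("unset" removes the variable) and classify the result on stdout.
# The program's own output is sent to stderr so it does not mix with the verdict.
run_mode() {
    mode="$1"; limit="$2"; shift 2
    status=0
    if [ "$mode" = unset ]; then
        env -u CL_CONTEXT_COMPILER_MODE_INTELFPGA timeout "$limit" "$@" 1>&2 || status=$?
    else
        CL_CONTEXT_COMPILER_MODE_INTELFPGA="$mode" timeout "$limit" "$@" 1>&2 || status=$?
    fi
    case "$status" in
        0)   echo "mode=$mode ok" ;;
        124) echo "mode=$mode hang (no exit within ${limit}s)" ;;   # killed by timeout
        *)   echo "mode=$mode failed (exit $status)" ;;
    esac
}

# Example on the board, using the program and .aocx names from the logs:
# for m in unset 1 3; do run_mode "$m" 30 ./opencl triv2.aocx; done
```

The assertion failure from the unset case would show up as "failed", while the =1 and =3 cases would show up as "hang".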
aocl diagnose output:
root@DE10_NANO:~/ocl# aocl diagnose
Verified that the kernel mode driver is installed on the host machine.
Using platform: Intel(R) FPGA SDK for OpenCL(TM)
Board vendor name: Intel(R) Corporation
Board name: de10_nano_sharedonly_hdmi : Cyclone V SoC Development Kit
Buffer read/write test passed.
DIAGNOSTIC_PASSED
Call "aocl diagnose <device-names>" to run diagnose for specified devices
Call "aocl diagnose all" to run diagnose for all devices
clinfo output:
root@DE10_NANO:~/ocl# clinfo
Number of platforms 1
Platform Name Intel(R) FPGA SDK for OpenCL(TM)
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 17.1
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_byte_addressable_store cles_khr_int64 cl_intelfpga_live_object_tracking cl_intelfpga_compiler_mode cl_khr_icd cl_khr_3d_image_writes
Platform Extensions function suffix IntelFPGA
Platform Name Intel(R) FPGA SDK for OpenCL(TM)
Number of devices 1
Device Name de10_nano_sharedonly_hdmi : Cyclone V SoC Development Kit
Device Vendor Intel(R) Corporation
Device Vendor ID 0x1172
Device Version OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 17.1
Driver Version 17.1
Device OpenCL C Version OpenCL C 1.0
Device Type Accelerator
Device Profile EMBEDDED_PROFILE
Max compute units 1
Max clock frequency 1000MHz
Max work item dimensions 3
Max work item sizes 2147483647x2147483647x2147483647
Max work group size 2147483647
Preferred work group size multiple <getWGsizes:518: create kernel : error -46>
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 64, Little-Endian
Global memory size 536870912 (512MiB)
Error Correction support No
Max memory allocation 134217728 (128MiB)
Unified memory for Host and Device No
Minimum alignment for any data type 1024 bytes
Alignment of base address 8192 bits (1024 bytes)
Global Memory cache type Read-Only
Global Memory cache size <printDeviceInfo:89: get CL_DEVICE_GLOBAL_MEM_CACHE_SIZE : error -30>
Global Memory cache line 0 bytes
Image support Yes
Max number of samplers per kernel 32
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 128
Local memory type Local
Local memory size 16384 (16KiB)
Max constant buffer size 134217728 (128MiB)
Max number of constant args 8
Max size of kernel argument 256
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Available Yes
Compiler Available No
Device Extensions cl_khr_byte_addressable_store cles_khr_int64 cl_intelfpga_live_object_tracking cl_intelfpga_compiler_mode cl_khr_icd cl_khr_3d_image_writes
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Intel(R) FPGA SDK for OpenCL(TM)
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [IntelFPGA]
clCreateContext(NULL, ...) [default] Success [IntelFPGA]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) Success (1)
Platform Name Intel(R) FPGA SDK for OpenCL(TM)
Device Name de10_nano_sharedonly_hdmi : Cyclone V SoC Development Kit
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel(R) FPGA SDK for OpenCL(TM)
Device Name de10_nano_sharedonly_hdmi : Cyclone V SoC Development Kit
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.8
ICD loader Profile OpenCL 1.2
NOTE: your OpenCL library declares to support OpenCL 1.2,
but it seems to support up to OpenCL 2.1 too.
uname -a output:
root@DE10_NANO:~/ocl# uname -a
Linux DE10_NANO 4.15.0 #1 SMP Sun Mar 25 15:46:34 DST 2018 armv7l armv7l armv7l GNU/Linux
Boot sequence full output: https://gist.github.com/nocduro/87928861657963c7ddb3f46f4bbc4255
Interesting part (line 176):
[ 1.103077] fpga_manager fpga0: Altera SOCFPGA FPGA Manager registered
[ 1.110185] altera_hps2fpga_bridge sopc@0:fpgabridge@0: fpga bridge [hps2fpga] registered
[ 1.118596] altera_hps2fpga_bridge sopc@0:fpgabridge@1: fpga bridge [lwhps2fpga] registered
[ 1.127101] altera_hps2fpga_bridge sopc@0:fpgabridge@2: fpga bridge [fpga2hps] registered
[ 1.135622] altera_fpga2sdram_bridge sopc@0:fpgabridge@3: fpga bridge [fpga2sdram] registered
[ 1.144138] altera_fpga2sdram_bridge sopc@0:fpgabridge@3: driver initialized with handoff 000001ff
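To compare boots (for example after changing the .dts), the bridge names can be pulled out of the log with a one-line filter. This is just a sketch keyed to the exact "fpga bridge [name] registered" wording in the lines above; the function name is made up.

```shell
#!/bin/sh
# Print the name of each bridge that the kernel log reports as registered.
# Reads log text on stdin, so it works with `dmesg |` or a saved boot log.
registered_bridges() {
    sed -n 's/.*fpga bridge \[\([^]]*\)] registered.*/\1/p'
}

# Example: dmesg | registered_bridges
```

A missing name in the output would mean that bridge's device tree node was not picked up by a driver during boot.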
I'm a bit lost on what to do with the 4.x stuff. We are given the source for reprogram.c in the rte folder; do I fix it to work with the fpga manager? Is something set up wrong in the device tree, preventing the Linux kernel from communicating with the OpenCL kernel on the FPGA? Why does a zImage build of 4.9.78 hang at the Starting kernel ... part?
@nocduro Great work on dissecting this all. It almost seems like we are falling deeper into the rabbit hole right now. Do you see any signs of hope with all this? The thinkoco guy seemed to know what was up when it came to getting HDMI + OpenCL working, maybe he can give some more insight for webcam stuff as well?
@hammadj I think there is a chance to get the 4.x linux kernel to work, but we'd probably have to build everything from source with the yocto/bitbake stuff. At this point it's probably easier just to get the webcam drivers installed on that image from thinkoco.
I would like to try the full yocto build with the 4.9.78-ltsi kernel from here which was updated about a week ago, this way we could write an app note on doing that. It would also mean we are up to date on the "official" supported version of linux for the Cyclone V stuff.
If I don't get that to work tonight, I'll just install the webcam drivers on the thinkoco 3.x kernel, hopefully that won't be too hard.
Gave up on 4.x for now. Now running thinkoco's 3.18 Linux image, with my test program running. Now trying to get PS3 eye drivers installed, looks like it uses the ov534 chipset/driver
edit: this image wasn't compiled with the gspca drivers
$ root@DE10_NANO:~# cat /lib/modules/4.5.0-00185-g3bb556b/modules.builtin | grep media
kernel/drivers/media/usb/uvc/uvcvideo.ko
kernel/drivers/media/v4l2-core/videodev.ko
kernel/drivers/media/v4l2-core/v4l2-common.ko
kernel/drivers/media/v4l2-core/v4l2-dv-timings.ko
kernel/drivers/media/v4l2-core/videobuf2-core.ko
kernel/drivers/media/v4l2-core/videobuf2-v4l2.ko
kernel/drivers/media/v4l2-core/videobuf2-memops.ko
kernel/drivers/media/v4l2-core/videobuf2-vmalloc.ko
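The same check can be wrapped up so it answers yes/no for one module at a time. In this sketch the helper name is made up, and `gspca_ov534` as the module file name for the PS3 Eye driver is my assumption based on the mainline gspca Makefile.

```shell
#!/bin/sh
# Succeed if a module whose file name is <name>.ko appears in modules.builtin.
# The optional second argument is the file to search; by default it is the
# modules.builtin of the running kernel.
module_is_builtin() {
    name="$1"
    file="${2:-/lib/modules/$(uname -r)/modules.builtin}"
    [ -r "$file" ] && grep -q "/${name}\.ko\$" "$file"
}

# Example (module name is an assumption, see above):
# module_is_builtin gspca_ov534 && echo "ov534 is built in" || echo "ov534 is not built in"
```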
I've compiled just the drivers by cloning the 3.18 repo and running make menuconfig to select the required media drivers/usb support/ov534 support, following these sites:
http://wiki.tekkotsu.org/index.php/Sony_PlayStation_Eye_driver_install_instructions
https://askubuntu.com/questions/168279/how-do-i-build-a-single-in-tree-kernel-module
I've copied the drivers over, but the Linux image seems to be looking in the wrong place potentially. I'll go to the lab tomorrow to try with the camera.
Hmm, getting a segfault when loading the ov534 driver :(
Is there another camera we could use that would make the process easier?
No, all the usb camera drivers are in the same folder, so it should be the same difficulty for any of them:
https://github.com/altera-opensource/linux-socfpga/tree/socfpga-4.9.78-ltsi/drivers/media/usb/gspca
Thinkoco had some recommendations on how to fix, I'll do that tonight. (last night I tried, but messed up in the kernel configuration somewhere)
I've compiled a new 3.18 kernel with a modified .config file based off of the config_opencl_de10_nano from https://github.com/thinkoco/linux-socfpga/tree/socfpga-3.18
I've replaced the original zImage provided by thinkoco, and have tested on the board. It boots and runs my OpenCL test program. Just need to test with a webcam.
steps to reproduce (extended from here):
# done on a fresh install of ubuntu 16.04 on a 24 vCPU instance
sudo apt update
sudo apt install u-boot-tools gcc-arm-linux-gnueabihf g++-arm-linux-gnueabihf libncurses5-dev make lsb uml-utilities git
# initial git clone is ~1.4 GB for everything
# could checkout single branch if wanted
# git clone --single-branch -b socfpga-opencl_3.18 https://github.com/thinkoco/linux-socfpga.git
git clone https://github.com/thinkoco/linux-socfpga.git
cd linux-socfpga
git checkout -b socfpga-opencl_3.18 origin/socfpga-3.18
# copy my modified config (I disable localversion appending, and enabled some usb webcams)
# available from gist above (will add to repo later)
cp 3.18_usbcam_config .config
# setup compiler options
export ARCH=arm
export CROSS_COMPILE=arm-linux-gnueabihf-
export LOADADDR=0x8000
export LOCALVERSION=
# make with 24 threads
make -j24 zImage
# copy the compiled image
cp arch/arm/boot/zImage ~/3.18_usbcam_zImage
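Before spending time on a full build, it may be worth checking that the copied .config really enables the webcam drivers. A sketch of such a check follows; the `check_config` helper is made up, and the symbol names USB_GSPCA and USB_GSPCA_OV534 are my assumption from the mainline gspca Kconfig rather than something taken from the modified config.

```shell
#!/bin/sh
# For each Kconfig symbol, report whether the given .config builds it in (=y),
# builds it as a module (=m), or leaves it out (reported as n).
check_config() {
    config="$1"; shift
    for sym in "$@"; do
        # Matches lines like CONFIG_FOO=y; "# CONFIG_FOO is not set" does not match.
        value=$(sed -n "s/^CONFIG_${sym}=\(.*\)\$/\1/p" "$config")
        printf '%s=%s\n' "$sym" "${value:-n}"
    done
}

# Example, run from the linux-socfpga directory (symbol names are assumptions):
# check_config .config USB_GSPCA USB_GSPCA_OV534
```

Since only zImage is built and copied here, anything reported as `m` would not end up on the board; the drivers need to be `y` for this workflow.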
[ 2.132814] usbcore: registered new interface driver ov534
[ 2.138341] usbcore: registered new interface driver ov534_9
Got this message during boot, so it looks like everything should work! Still have to test with an actual webcam though
I've been trying to get both of these drivers to work on Linux kernel version 4.15 from https://github.com/altera-opensource/linux-socfpga/tree/socfpga-4.15
I've gotten the kernel compiled and the acl_drv.ko compiled, which is successfully loaded by insmod acl_drv.ko. After setting env variables for the 17.1 rte from Intel, an aocl diagnose command passes.
However, when running an OpenCL program like the vector_add example from Intel, the program freezes as if it is unable to communicate with the FPGA.
I've tried modifying the .dts file (device tree source; it tells the kernel where devices like the CPU, memory, ethernet, timers, etc. are in memory), without much luck.
One of the other things is that the Linux kernel is kinda in the middle of adding a fpga manager, a generic interface which has changed from kernel version 3.x to 4.x. Intel doesn't seem to have up-to-date tooling/guides for the newer Linux kernels (?)
Possible fixes
starting kernel. think it might have to do with the .dtb having fpga bridges, but they aren't enabled in the kernel?