thinkoco / c5soc_opencl

OpenCL hardware for the DE1-SoC, DE10-Nano, and DE10-Standard that supports VGA and desktop, plus some applications such as USB camera YUYV-to-RGB conversion, Sobel, and so on.
Apache License 2.0

Support for Intel OpenCL SDK 17.1 #4

Closed META-DREAMER closed 6 years ago

META-DREAMER commented 6 years ago

Hi there, I am trying to get this to work with OpenCL SDK 17.1 on my de10-nano board, but am having trouble. I am trying to run this project: https://github.com/doonny/PipeCNN.

I was able to successfully get it working using the OpenCL image for the de10-nano provided by Terasic, but unfortunately that image doesn't support HDMI and Ethernet. I cross-compiled the PipeCNN kernel and host using Intel OpenCL SDK 17.1 and the de10-nano BSP on my desktop and transferred them to the de10-nano board. It successfully ran there, even though it had the 16.1 binaries.

I then tried using this repo's BSP and followed the instructions here, recompiling the PipeCNN kernel with the de10_nano_sharedonly_hdmi BSP from this repo. However, it would give this error: libintel_soc32_mmd.so cannot opened shared object file: no such file or directory and various other errors. I'm guessing this error is because that file used to be called libalterammdpcie.so but Intel changed a bunch of the names in 17.1. So I tried recompiling the kernel drivers for 17.1 (zImage, socfpga.dtb, and opencl.rbf) as well as getting the latest aocl binaries from Intel for 17.1. I was finally able to get PipeCNN to run; however, the execution gets stuck on layer 1. You can read more about that here: https://github.com/doonny/PipeCNN/issues/53.

Do you know what steps I'm missing to get this working with 17.1? Unfortunately I can't use 16.1 because of licensing issues that could not be resolved with the eth0 fix. Intel removed the licensing requirement in 17.1 when using aoc, so that's the only choice I have.
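The missing-library error described above can be narrowed down by checking which MMD library names are actually present in the runtime on the board. A minimal sketch, assuming the runtime is unpacked in a directory such as opencl_arm32_rte; the helper name find_mmd_libs is illustrative and not part of either SDK, and the two file names are simply the ones mentioned in this thread:

```shell
#!/bin/sh
# List MMD shared libraries under a runtime directory, so that the older
# name (libalterammdpcie.so) can be told apart from the newer one
# (libintel_soc32_mmd.so). Both names are taken from this thread.
# find_mmd_libs is a hypothetical helper, not an SDK tool.
find_mmd_libs() {
    rte_dir="$1"
    if [ ! -d "$rte_dir" ]; then
        echo "not a directory: $rte_dir" >&2
        return 1
    fi
    find "$rte_dir" -type f \
        \( -name 'libintel_soc32_mmd.so' -o -name 'libalterammdpcie.so' \) | sort
}
```

Running find_mmd_libs /home/root/opencl_arm32_rte on the board (the path is an assumption) prints whichever of the two libraries exist. If the host was linked against a name that is not listed, the runtime on the SD card and the SDK used for the build probably do not match.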

thinkoco commented 6 years ago

@hammadj I think it's my fault. I'm checking it now. Maybe there is something wrong with my Linux image. I will upload an updated image for you later. As my license has expired, I can't compile the kernel for testing. Can you do this for me? Recompile the PipeCNN kernel with my de10_nano_sharedonly_hdmi BSP, then run it on the Terasic de10_nano OpenCL Linux image (just replace opencl.rbf and cnn.aocx). See if it works without the desktop using my BSP.

META-DREAMER commented 6 years ago

@thinkoco Ok I will test that config out and let you know in a few hours. Thanks for your help, it's much appreciated!

BTW if you use the 17.1 SDK you don't need a license anymore. See here: https://www.altera.com/documentation/ewa1412772636144.html

Removed requirement for Intel® FPGA SDK for OpenCL™ license. You can also now run your OpenCL kernel without a paid runtime license

thinkoco commented 6 years ago

@hammadj Thank you for mentioning the license issue.

META-DREAMER commented 6 years ago

@thinkoco

Ok so I flashed that Linux image, sourced init_opencl.sh, and ran the boardtest. It got stuck at this part: image

Not sure if it matters, but the name of the device with that linux image was DE10-STANDARD, not DE10-NANO.

Also, I tried what you said in your previous comment, compiling PipeCNN with your BSP and running it on the no-desktop version from terasic, and I got the same error: libintel_soc32_mmd.so cannot opened shared object file: no such file or directory

thinkoco commented 6 years ago

@hammadj ~~~Maybe I built the boardtest with the wrong command, so it gives this result. No need to care about the DE10-STANDARD string. Here are some 17.1 binaries in which I fixed the libintel_soc32_mmd.so error. You can update the SD card image (180302.img) with these files, then run sudo apt-get install libsdl2-dev and run host.run.~~~ I'm very sorry that I have no board to test these binaries now.

thinkoco commented 6 years ago

@hammadj The BSP is OK, but there was something wrong with my SD card image. I have fixed this issue. You can run with the latest SD card image here. Read the readme.txt in the SD card first! The OpenCL runtime is 17.1 by default.

nocduro commented 6 years ago

@thinkoco Thank you for your help on this!

I've been working with Hammad on it, and your updated SD card image is working on our board now!

nocduro commented 6 years ago

@thinkoco How did you fix your linux image? I'm trying to get a webcam to work with your new image, but it doesn't look like the webcam drivers were enabled when the image was built? I tried using my own zImage that I compiled with webcam support, but then when I run PipeCNN it freezes like before.

thinkoco commented 6 years ago

@nocduro Hi, the zImage in my SD card image contains the UVC driver, and colorApp works. Also, you can check out the 3.18 branch in my linux-socfpga repo to rebuild the zImage.

git checkout -b socfpga-opencl_3.18 origin/socfpga-3.18
cp config_opencl_de10_nano .config

export ARCH=arm
export CROSS_COMPILE=arm-linux-gnueabihf-
export LOADADDR=0x8000

make zImage

After this, remember to rebuild the OpenCL driver. Copy the runtime folder to the PC and go to the driver folder:

make KDIR=../(to linux-socfpga folder)

Then copy the aclsoc_drv.ko back to the SD card, overwriting the original one.
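The rebuild sequence above can be collected into one script. This is only a sketch under stated assumptions: both directory arguments are placeholders for wherever the linux-socfpga checkout and the runtime's driver folder live on the PC, the function names are illustrative, and the git checkout step is left to be done by hand first. With DRY_RUN=1 the commands are printed instead of executed, which is a cheap way to check the paths.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rebuild_zimage_and_driver LINUX_DIR DRIVER_DIR
#   LINUX_DIR:  linux-socfpga checkout on the 3.18 branch (assumed path)
#   DRIVER_DIR: "driver" folder copied from the OpenCL runtime (assumed path)
# Use absolute paths, because KDIR is resolved from inside DRIVER_DIR.
rebuild_zimage_and_driver() {
    linux_dir="$1"
    driver_dir="$2"
    export ARCH=arm
    export CROSS_COMPILE=arm-linux-gnueabihf-
    export LOADADDR=0x8000
    run cp "$linux_dir/config_opencl_de10_nano" "$linux_dir/.config" || return 1
    run make -C "$linux_dir" zImage || return 1
    # The driver has to be rebuilt against the kernel tree built above.
    run make -C "$driver_dir" KDIR="$linux_dir" || return 1
}
```

For example, DRY_RUN=1 followed by rebuild_zimage_and_driver with the two absolute paths prints the three commands without touching anything. After a real run, the resulting aclsoc_drv.ko still has to be copied back to the SD card by hand as described above.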

roddregs commented 6 years ago

Hi @thinkoco . I would like to experiment running PipeCNN using De0-Nano-SoC board. Do you think the configurations of PipeCNN will smoothly run in De0-Nano-SoC board? Thanks in advance for your response.

thinkoco commented 6 years ago

@roddregs Hi, you can compile the PipeCNN kernel with the de0-nano-soc OpenCL BSP and check the report. I'm not sure whether even the minimum configuration of PipeCNN will compile successfully. There are just 40k LEs in the de0-nano-soc (85k LEs in the de1-soc, 110k LEs in the de10-nano or de10-standard). image

roddregs commented 6 years ago

@thinkoco Hi Sir, good day! I tried to compile PipeCNN for the De0-Nano-SoC board. However, I encountered the following problem when running the main makefile:

g++ ./host/main.o ../common/ocl_util.o ../common/timer.o -o run.exe -L/thesis/intelFPGA/16.1/hld/board/de1soc/arm32/lib -L/thesis/intelFPGA/16.1/hld/host/arm32/lib -L/thesis/intelFPGA/16.1/hld/host/linux64/lib -Wl,--no-as-needed -lalteracl -lalterahalmmd -lalterammdpcie -lelf
/usr/bin/ld: cannot find -lalterammdpcie
collect2: error: ld returned 1 exit status
Makefile:118: recipe for target 'run.exe' failed
make: *** [run.exe] Error 1

Is this an indication that the De0-Nano is not compatible with the PipeCNN configurations? Thank you very much in advance for your reply, Sir @thinkoco ...

thinkoco commented 6 years ago

@roddregs The linker cannot find the alterammdpcie library. Maybe "g++" should be "arm-linux-gnueabihf-g++", so check the OpenCL environment that you have set and change the Makefile (VENDOR = altera, PLATFORM = arm32, FLOW = hw). The limit may be the FPGA logic resources, not the host. So you can build the OpenCL kernel first and check the resource utilization in "acl_quartus_report.txt" or some other report files. As for the hardware configuration, you can try this one (#define VEC_SIZE 4 , #define LANE_NUM 2 or 4 ?? // larger than 1 #define CONV_GP_SIZE_X 7)
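Whether a given compiler is the native one or the ARM cross-compiler can be checked before rebuilding the host. A minimal sketch, assuming a GCC-style compiler that supports the -dumpmachine option; the helper names targets_arm and check_host_compiler are illustrative:

```shell
#!/bin/sh
# targets_arm TRIPLET: succeed if a GCC target triplet names a 32-bit ARM
# target (for example arm-linux-gnueabihf).
targets_arm() {
    case "$1" in
        arm*) return 0 ;;
        *)    return 1 ;;
    esac
}

# check_host_compiler CXX: ask the compiler for its target triplet with
# -dumpmachine and report whether it targets ARM.
check_host_compiler() {
    cxx="$1"
    if ! command -v "$cxx" >/dev/null 2>&1; then
        echo "$cxx: not found in PATH"
        return 1
    fi
    triplet=$("$cxx" -dumpmachine)
    if targets_arm "$triplet"; then
        echo "$cxx targets $triplet (ARM)"
    else
        echo "$cxx targets $triplet (not ARM)"
        return 1
    fi
}
```

On a typical x86-64 desktop, check_host_compiler g++ is expected to report a non-ARM target, while check_host_compiler arm-linux-gnueabihf-g++ should report an ARM one if the cross toolchain is installed.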

roddregs commented 6 years ago

@thinkoco .. I am now using DE1-SoC board... When running main makefile provided, I got the following error: image Please, can you suggest any solutions to this error I encountered?... Thank you very much...

thinkoco commented 6 years ago

@roddregs
Hi, you can use arm-linux-gnueabihf-g++ 4.9 or 4.8. Best regards! Knat

roddregs commented 6 years ago

@thinkoco Thank you very much @thinkoco, it solved the problem...

However, I got another problem when running main makefile : After compiling, it says: Error: cannot fit kernel(s) on device

image

The screenshot of the flow summary is the following: image

Please, any suggestion on how to solve this error? Thank you very much!

thinkoco commented 6 years ago

Hi, you can disable the "--profile" flag in the Makefile and rebuild it. Best regards! Knat
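Profiling instruments the kernel with extra hardware counters, which costs logic on a small device, so dropping the flag can be enough to make the design fit. A minimal sketch of that edit, assuming the flag appears literally as --profile in the Makefile; the helper name strip_profile_flag and the .bak suffix are illustrative, and the result is worth checking by eye:

```shell
#!/bin/sh
# strip_profile_flag MAKEFILE: remove every standalone "--profile" token
# from MAKEFILE, keeping the previous version as MAKEFILE.bak.
strip_profile_flag() {
    mk="$1"
    [ -f "$mk" ] || { echo "no such file: $mk" >&2; return 1; }
    cp "$mk" "$mk.bak" || return 1
    # First expression: flag at end of line.
    # Second expression: flag followed by whitespace.
    # Longer options that merely start with --profile are left alone.
    sed -e 's/[[:space:]]*--profile$//' \
        -e 's/[[:space:]]*--profile\([[:space:]]\)/\1/g' \
        "$mk.bak" > "$mk"
}
```

After strip_profile_flag Makefile, the kernel has to be rebuilt from scratch for the change to take effect.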

roddregs commented 6 years ago

Thanks @thinkoco, I did it and it worked... Thanks soo much...

However, after the compilation, I have another error when I run the command: ./run.exe conv.aocx

image

Kindly help me how to solve this, please @thinkoco ... Thanks very much...

thinkoco commented 6 years ago

Hi, you may need to run it on the de1soc, not on the PC. Best regards! Knat

roddregs commented 6 years ago

Hi @thinkoco ... Thank you very much for your kind assistance, it worked....

root@socfpga:~# source ./init_opencl.sh
root@socfpga:~# aocl diagnose
aocl diagnose: Running diagnostic from /home/root/opencl_arm32_rte/board/c5soc/n

Verified that the kernel mode driver is installed on the host machine.

Using platform: Altera SDK for OpenCL
Board vendor name: Altera Corporation
Board name: de1soc_sharedonlyCyclone V SoC Development Kit

Buffer read/write test passed.

DIAGNOSTIC_PASSED
root@socfpga:~# ls
PipeCNN README init_opencl.sh opencl_arm32_rte swapper
root@socfpga:~/PipeCNN# aocl program /dev/acl0 conv.aocx
aocl program: Running reprogram from /home/root/opencl_arm32_rte/board/c5soc/arn
Reprogramming was successful!
root@socfpga:~/PipeCNN# aocl diagnose
aocl diagnose: Running diagnostic from /home/root/opencl_arm32_rte/board/c5soc/n

Verified that the kernel mode driver is installed on the host machine.

Using platform: Altera SDK for OpenCL
Board vendor name: Altera Corporation
Board name: de1soc_sharedonlyCyclone V SoC Development Kit

Buffer read/write test passed.

root@socfpga:~/PipeCNN# ./run.exe conv.aocx


PipeCNN: An OpenCL-Based FPGA Accelerator for CNNs


Platform: Altera SDK for OpenCL
Using 1 device(s)
Device 0: de1soc_sharedonlyCyclone V SoC Development Kit
Device OpenCL Version: OpenCL 1.0 Altera SDK for OpenCL, Version 16.0
Device Max Compute Units: 1
Device Max WorkGroup Size: 2147483647
Device Max WorkItem Size: 2147483647
Device Global Memory Size: 512 MBytes
Device Local Memory Size: 16 KBytes
Device Max Clock Freq: 1000 Mhz

Loading kernel/binary from file conv.aocx
Reprogramming device with handle 1

61063552 total weights read
1024 total output reference read

154587 bytes image data read from binary files

Executing Layer 1:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching single work-item kernel Pooling

Launching kernel MemWr with local size: 1, 1, 8 (global size: 27, 27, 96)

Launching kernel lrn with local size: 1, 1, 12 (global size: 27, 27, 12)

Executing Layer 2:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching single work-item kernel Pooling

Launching kernel MemWr with local size: 1, 1, 8 (global size: 13, 13, 256)

Launching kernel lrn with local size: 1, 1, 32 (global size: 13, 13, 32)

Executing Layer 3:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching kernel MemWr with local size: 1, 1, 8 (global size: 13, 13, 384)

Executing Layer 4:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching kernel MemWr with local size: 1, 1, 8 (global size: 13, 13, 384)

Executing Layer 5:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching single work-item kernel Pooling

Launching kernel MemWr with local size: 1, 1, 8 (global size: 6, 6, 256)

Executing Layer 6:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching kernel MemWr with local size: 1, 1, 8 (global size: 1, 1, 4096)

Executing Layer 7:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching kernel MemWr with local size: 1, 1, 8 (global size: 1, 1, 4096)

Executing Layer 8:

Launching single work-item kernel winbuffer

Launching single work-item kernel Conv

Launching kernel MemWr with local size: 1, 1, 8 (global size: 1, 1, 1024)

Copyed all batched results from fc_2 buffers.
Selected item = 0 from the combined batch results in fc buffers

Start verifying results ...

Check Pass !!!

The inference result is n02123045 tabby, tabby cat (the prob is 56.00)

PipeCNN exited !!!


Performance Summary

Kernel runtime summary:
Layer-1:
MemRd: 43.500 ms
Conv : 43.337 ms
Pool : 43.267 ms
MemWr: 43.153 ms
Lrn : 1.244 ms
Layer-2:
MemRd: 35.208 ms
Conv : 35.072 ms
Pool : 35.015 ms
MemWr: 34.924 ms
Lrn : 0.430 ms
Layer-3:
MemRd: 23.886 ms
Conv : 23.774 ms
Pool : 0.000 ms
MemWr: 23.704 ms
Lrn : 0.000 ms
Layer-4:
MemRd: 17.955 ms
Conv : 17.843 ms
Pool : 0.000 ms
MemWr: 17.772 ms
Lrn : 0.000 ms
Layer-5:
MemRd: 12.079 ms
Conv : 11.948 ms
Pool : 11.893 ms
MemWr: 11.806 ms
Lrn : 0.000 ms
Layer-6:
MemRd: 14.468 ms
Conv : 14.359 ms
Pool : 0.000 ms
MemWr: 14.289 ms
Lrn : 0.000 ms
Layer-7:
MemRd: 6.537 ms
Conv : 6.427 ms
Pool : 0.000 ms
MemWr: 6.358 ms
Lrn : 0.000 ms
Layer-8:
MemRd: 1.753 ms
Conv : 1.643 ms
Pool : 0.000 ms
MemWr: 1.574 ms
Lrn : 0.000 ms

Total kernel runtime 154.403 ms
Batch size = 1, average process time per batch: 154.403 ms

Total runtime: 0.159448s

roddregs commented 6 years ago

Hi @thinkoco .. I am planning to experiment with PipeCNN using my own dataset. I am thinking of using Caffe to train my network model. Do you have any ideas or suggestions on how I can do this successfully? Thank you very much for always sharing your expertise and time...

thinkoco commented 6 years ago

@roddregs Hi, roddregs. I'm new to CNNs. Maybe you can ask doonny for help. Sorry that I have no useful suggestions.

Enshall commented 6 years ago

Dear @thinkoco, when I run PipeCNN on de1soc with the image c5soc_opencl_lxde_all_in_one_180317.img, the VGA output shuts down, whether I run it from a directly connected session or via serial port commands. When I compiled the bitstream file and host file, I replaced the BSP with your de1soc_sharedonly_vga. Can you give me some guidance? Best wishes to you!

thinkoco commented 6 years ago

Hi, you may add CL_CONTEXT_COMPILER_MODE_ALTERA=3 (SDK 16.1) or CL_CONTEXT_COMPILER_MODE_INTELFPGA=3 (SDK 17.1) to the init shell file. Alternatively, you can export the variable before you run the host. Best regards! Knat
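Since the variable name depends on the SDK version, an init script can pick it from a version string. A minimal sketch, assuming the two SDK versions meant in the comment above are 16.1 and 17.1; the helper name compiler_mode_var is illustrative and no other versions are covered:

```shell
#!/bin/sh
# compiler_mode_var SDK_VERSION: print the name of the compiler-mode
# environment variable for that SDK version. Only the two versions
# discussed in this thread are handled.
compiler_mode_var() {
    case "$1" in
        16.1) echo CL_CONTEXT_COMPILER_MODE_ALTERA ;;
        17.1) echo CL_CONTEXT_COMPILER_MODE_INTELFPGA ;;
        *)    echo "unhandled SDK version: $1" >&2; return 1 ;;
    esac
}

# Example (left commented out so sourcing this file changes nothing):
# export "$(compiler_mode_var 17.1)=3"
```

This setting is generally understood to stop the runtime from reprogramming the FPGA when the host loads the .aocx, so the bitstream already loaded at boot needs to match the kernels the host expects.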

Enshall commented 6 years ago

Thank you sincerely for your quick reply, @thinkoco. It displays the error CL_INVALID_KERNEL_NAME when I add "CL_CONTEXT_COMPILER_MODE_ALTERA=3". The source code is shown below:

const char *knl_name_memRd = "memRead";
knl_memRd[i] = clCreateKernel(program, knl_name_memRd, &status);

My compiler is 16.1 and I init it with 16 when I run the host. Can you give some more advice?

thinkoco commented 6 years ago

@Enshall Hi, you may need to replace the rbf file with the top.rbf that you built. The aocx and host also need to be updated. You can follow these steps

roddregs commented 6 years ago

Hi @thinkoco

As reflected in the post above. I was using DE1-SoC Cyclone V board and successfully compiled and run in the board the PipeCNN using PLATFORM=arm32 and flow=hw which only read ONE IMAGE file. Thanks for your assistance on that matter...

And now, I tried to enable OpenCV to experiment loading multiple images from Imagenet 2012 validation set. I set USE_OPENCV = 1 in the Makefile and installed OpenCV 2.4.9 in my PC (Ubuntu 16.04). When I compiled, I got the following error that says:

libopencv_core.so :file not recognized: File format not recognized

image

Any suggestion on how to solve this error? Is OpenCV 2.4.9 not compatible with the current set-up of PipeCNN?

Thank you very much for always answering my queries...

thinkoco commented 6 years ago

@roddregs You can install OpenCV in the SD card image and build the host on the de1-soc. Also, you may need to modify OCV_INCLUDES and OCV_LIBDIRS in the Makefile.
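One likely cause of "File format not recognized" is that the ARM linker was handed an x86-64 build of libopencv_core.so from the PC. The architecture of a library can be read from its ELF header. A minimal sketch using only od, assuming a little-endian ELF file; the helper names elf_machine and elf_machine_name are illustrative:

```shell
#!/bin/sh
# elf_machine FILE: print the ELF e_machine field as a decimal number.
# Assumes a little-endian ELF file, which holds for both x86-64 and
# 32-bit ARM Linux libraries.
elf_machine() {
    f="$1"
    magic=$(od -An -t x1 -N 4 "$f" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not an ELF file: $f" >&2; return 1; }
    # e_machine is a 16-bit field at byte offset 18.
    set -- $(od -An -t u1 -j 18 -N 2 "$f")
    echo $(( $1 + $2 * 256 ))
}

# elf_machine_name FILE: map the two values of interest to names.
elf_machine_name() {
    m=$(elf_machine "$1") || return 1
    case "$m" in
        40) echo ARM ;;      # EM_ARM
        62) echo x86-64 ;;   # EM_X86_64
        *)  echo "other ($m)" ;;
    esac
}
```

If elf_machine_name on the libopencv_core.so named in the link command prints x86-64 while the host is built with arm-linux-gnueabihf-g++, the library has to be replaced by an ARM build, for example by building OpenCV on the board as suggested above.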

Enshall commented 6 years ago

Thank you for your patience, @thinkoco! Actually, I still can't understand step 2. What does "Windows fat32 partition" mean? And how do I get an rbf file? I only downloaded the dtb file from your Google Drive. Best wishes!

thinkoco commented 6 years ago

@Enshall When you build the kernel (make fpga), a top.rbf file is generated in the conv folder. The sdx1 partition is the FAT32 partition, which contains the rbf, zImage, and dtb. image

Enshall commented 6 years ago

Hi @thinkoco , thank you a lot ! I fixed the error of 'CL_INVALID_KERNEL_NAME ' as what you taught me. You do a really great work here ! Thank you !

roddregs commented 6 years ago

Hi @thinkoco I used minicom in Ubuntu 16.04 to run and access de1-soc board.
You suggested installing OpenCV in the SD card image and building the host on the de1-soc. Please help me with how to do it, since I used the linux_sd_card_image.img from the de1-soc BSP of Terasic and it automatically configured my SD card into 2 partitions (1.1GB and 859 MB). I don't know how to access other partitions of my SD card (I have 32GB) since it's already configured automatically. There is an unused partition of 27GB but I don't know how to access it in minicom. Is minicom not appropriate for this purpose? Will I have to use another environment like LXDE?

thinkoco commented 6 years ago

@roddregs Hi,

  1. For sending files to the de1soc, you can mount your SD card on the Ubuntu PC and copy them to the SD card's ext4 partition, such as /media/rod/xxxxxxx/root
  2. For building the host with OpenCV libs, you can cross-compile OpenCV on the PC and build the host with the cross-compiled OpenCV libs. Then copy the host and OpenCV libs to the SD card. Alternatively, you can copy the source code of the OpenCV libs and PipeCNN to the SD card, then build them on the de1soc. Actually, I have not run PipeCNN with OpenCV before, so there are no detailed steps here. You may try this yourself or ask doonny for help.
  3. Some OpenCV applications can run without a GUI. So you may check whether PipeCNN works without a GUI.

thinkoco commented 6 years ago

@roddregs You can build the PipeCNN host with opencv libs on DE1SOC .

  1. copy the PipeCNN source code to all_in_one_180317 sd card.
  2. Power on DE1SOC with the all_in_one_180317 image sd card.
  3. source init_opencl_17.1.sh
  4. modify the Makefile like this: image
  5. make host

Also, you may change the picture_file_path_head path in main.cpp and export DISPLAY=:0 before running the host in the minicom terminal.

pratik18v commented 6 years ago

Hi @thinkoco , I am trying to run the PipeCNN code on the de10 standard board using the image (c5soc_opencl_lxde_all_in_one_180317.img) and BSP you have provided. I was able to successfully compile the code but I am facing an issue similar to an earlier comment in this thread. When I run the command aocl program /dev/acl0 conv.aocx , the display turns off and I lose all connections (serial/SSH) to the board. Your previous fix (CL_CONTEXT_COMPILER_MODE_ALTERA=3) is already included in the init file, but it's not helping.

Do you know what may cause it and/or how to solve it? Thank you

thinkoco commented 6 years ago

After building PipeCNN, you will get a top.rbf. Then replace opencl.rbf with the top.rbf, keeping the name 'opencl.rbf'. The aocl program step is not needed; just source the init file and run the host directly.
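The replacement step above can be sketched as a small helper that also keeps the previous bitstream in case the new one does not boot. This assumes the SD card's FAT32 partition is already mounted on the PC; the mount point, the helper name install_rbf, and the .bak file are illustrative:

```shell
#!/bin/sh
# install_rbf TOP_RBF FAT_MOUNT: copy TOP_RBF to FAT_MOUNT/opencl.rbf,
# keeping any existing bitstream as FAT_MOUNT/opencl.rbf.bak.
install_rbf() {
    src="$1"
    fat="$2"
    [ -f "$src" ] || { echo "missing bitstream: $src" >&2; return 1; }
    [ -d "$fat" ] || { echo "partition not mounted at: $fat" >&2; return 1; }
    if [ -f "$fat/opencl.rbf" ]; then
        cp "$fat/opencl.rbf" "$fat/opencl.rbf.bak" || return 1
    fi
    cp "$src" "$fat/opencl.rbf" || return 1
    # Flush pending writes before the card is unmounted and removed.
    sync
}
```

A call would look like install_rbf top.rbf followed by the mount point of the FAT32 partition. The matching .aocx and host binary still have to be copied to the board separately.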

rahimijan commented 5 years ago

@thinkoco I downloaded the image from ("https://github.com/thinkoco/c5soc_opencl/issues/4#issuecomment-371899037 "),

but its Linux doesn't boot up! In fact, all I get are these pictures:

image image

Do you know what the problem is? And, if it is possible, could you please test the Linux image from the above link for me? I need to work with it. Also, I want to install OpenCL on 'Linux LXDE', but I don't know how to do it. Please, if you know, help me solve these issues.

Unfortunately, this Linux can't boot from the SD card (on the DE10-Nano board). If you know the solution or another way to boot it, please let me know. Thank you.

sergio14890 commented 4 years ago

Hey guys! I have the de1soc 17.1 BSP. I'm trying to compile PipeCNN, but I get this error:

c:/Users/sergi/Downloads/PipeCNN-master/project/device/conv_pipe.cl:1096:16: error: attribute takes one argument
__attribute__((max_work_group_size(1,1,LRN_MAX_LOCAL_SIZE))) // (x,y,z)
^
2 warnings and 1 error generated.

What can I do?

sergio14890 commented 4 years ago

@thinkoco Thank you very much @thinkoco, it solved the problem...

However, I got another problem when running main makefile : After compiling, it says: Error: cannot fit kernel(s) on device

image

The screenshot of the flow summary is the following: image

Please, any suggestion on how to solve this error? Thank you very much!

I have the same error and I can't solve it. What do you suggest?

sergio14890 commented 4 years ago

Do you know what may cause it and/or how to solve it?

After building the PipeCNN,you will get a top.rbf.Then,update the opencl.rbf with the top.rbf and keep the name 'opencl.rbf'. the aocl program is no need. just source the initial file and run host directly.

Hey @thinkoco, I have the same error, but I can't find any file named "opencl.rbf".

thinkoco commented 4 years ago

@sergio14890

  1. Hi, you can disable the "--profile" flag in the Makefile and rebuild the OpenCL kernel.
  2. Then you will get a top.rbf. Rename it to opencl.rbf and copy it to partition 1 of the micro SD card (the FAT32 file system partition).

sergio14890 commented 4 years ago

@sergio14890

  1. Hi,you can disable the "--profile" flags in Makefile and rebuild the opencl kernel
  2. then, you will get a top.rbf, rename it to opencl.rbf and copy it to micro sdcard partition 1 (FAT32 file system partition)

Hey @thinkoco thank you so much for your answer! You really did a great job.

I did what you told me but it didn't work :|

First, I wrote c5soc_opencl_lxde_fpga_reconfigurable to the SD card. Second, I copied top.rbf (renamed to opencl.rbf) to the FAT32 partition, and also copied de1soc_socfpga.dtb and renamed it to socfpga.dtb.

But when I power on the board, the screen remains off and only PuTTY works.

Do you have any more suggestions? Thanks for everything, good job

thinkoco commented 4 years ago

@sergio14890 Hi, have you set MSEL[4:0] on your board to 01010, i.e. SW10 (1 to 6) to on, off, on, off, on, N/A? It shows the altera_load error code -4, so you may want to check the opencl.rbf on the SD card.
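The switch positions can be derived from the MSEL value. The mapping used below is inferred from the single example in the comment above (MSEL[4:0] = 01010 corresponding to SW10.1 through SW10.5 = on, off, on, off, on), which suggests that a 0 bit means on, a 1 bit means off, and SW10.1 carries MSEL0. That mapping is an assumption and should be confirmed against the board manual; the helper name msel_to_sw10 is illustrative:

```shell
#!/bin/sh
# msel_to_sw10 BITS: BITS is MSEL[4:0] written MSB first, e.g. 01010.
# Prints the positions of SW10.1 .. SW10.5 as a comma-separated list.
# Assumed mapping: bit 0 -> on, bit 1 -> off, SW10.1 = MSEL0.
msel_to_sw10() {
    bits="$1"
    case "$bits" in
        [01][01][01][01][01]) ;;
        *) echo "expected five binary digits, got: $bits" >&2; return 1 ;;
    esac
    out=""
    rest="$bits"
    # Walk from MSEL4 down to MSEL0 and prepend each position, so the
    # final list starts with SW10.1 (MSEL0).
    while [ -n "$rest" ]; do
        b="${rest%"${rest#?}"}"   # first remaining character
        rest="${rest#?}"
        if [ "$b" = "0" ]; then pos=on; else pos=off; fi
        out="$pos${out:+,$out}"
    done
    echo "$out"
}
```

For the value from this thread, msel_to_sw10 01010 prints on,off,on,off,on, matching the switch settings quoted above. SW10.6 is not covered because it is listed as N/A.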

sergio14890 commented 4 years ago

Yes @thinkoco, MSEL is configured correctly. I think it has to do with the .rbf file itself, because I can connect the PuTTY terminal and work well with it; only my screen still shows no image.

Is there no place in the code where I can add service lightdm start before the makefile?

image My PuTTY terminal is connected but the screen stays off

sergio14890 commented 4 years ago

@thinkoco could you tell me which bsp you are using for de1soc to generate the .aocx file? I may be using a different one. Thanks for everything again

thinkoco commented 4 years ago

@sergio14890

  1. The BSP is de1soc_sharedonly_vga. Build the OpenCL kernel with this BSP and replace opencl.rbf with the top.rbf (which has the VIP core inside).

  2. Here is the command to run the desktop

sergio14890 commented 4 years ago

@thinkoco thanks so much for the help! Do you know why I can't use imshow on the board?

When I call imshow, I receive this warning: image

thinkoco commented 4 years ago

@sergio14890 Sorry, OpenCV has not been tested on the image. I searched for the same error online. You can try it, but I'm not sure that it works.

sergio14890 commented 4 years ago

@thinkoco I already tried this solution, except that there is a problem: the board cannot connect to the internet (screenshots below). Is there a way to download the .deb file, transfer it to the board, and install it there? image image

datnguyen263 commented 3 years ago

Dear @thinkoco, I tried following this issue to fix my problem but I can't solve it. Can you help me with my problem? Thanks so much! image