doonny / PipeCNN

An OpenCL-based FPGA Accelerator for Convolutional Neural Networks
Apache License 2.0
1.24k stars 370 forks

ERROR: CL_INVALID_BINARY #57

Closed ArjunaDeSoysa closed 6 years ago

ArjunaDeSoysa commented 6 years ago

Dear Prof. @doonny, I am using a DE1-SoC with the Angstrom v2014.12 kernel. In ../project/host/main.cpp I changed the vendor string to const char *vendor_name = "altera"; because with *vendor_name = "Intel" I got the error "Unable to find the desired OpenCL platform."

Now there is a new error:

./run.exe conv.aocx
***************************************************
PipeCNN: An OpenCL-Based FPGA Accelerator for CNNs
***************************************************
Platform: Altera SDK for OpenCL                                                           
Using 1 device(s)                                                                         
  Device 0: de1soc_sharedonlyCyclone V SoC Development Kit                                
Device OpenCL Version: OpenCL 1.0 Altera SDK for OpenCL, Version 16.0                     
Device Max Compute Units: 1                                                               
Device Max WorkGroup Size: 2147483647                                                     
Device Max WorkItem Size: 2147483647                                                      
Device Global Memory Size: 512 MBytes                                                     
Device Local Memory Size: 16 KBytes                                                       
Device Max Clock Freq: 1000 Mhz 

Loading kernel/binary from file conv.aocx 
ERROR: CL_INVALID_BINARY
Location: ../common/ocl_util.cpp:415
Failed to create program with binary

So I want to know: is there anything else I need to change (because of changing const char *vendor_name = "altera";)?

Thank You

ArjunaDeSoysa commented 6 years ago

Dear @thinkoco, do you know how to solve this problem?

thinkoco commented 6 years ago

@ArjunaDeSoysa So you are using the OpenCL SDK 16.0 runtime files and have rebuilt the driver? It is recommended to use the SDK 16.1 runtime files. The CMA, the Altera FPGA firmware download module, fpga-bridge, and the socfpga bridge driver are also needed, and the OpenCL driver should be rebuilt. The device tree should contain the fpga-bridge and FPGA manager descriptions, and so on.

ArjunaDeSoysa commented 6 years ago

@thinkoco Yes, I use the OpenCL SDK 16.1 runtime files on the host computer. But the Terasic site only has a BSP (Board Support Package) for Altera SDK for OpenCL 16.0. I saw someone say that this is fine for PipeCNN.

thinkoco commented 6 years ago

@ArjunaDeSoysa You can also use my SD card image c5soc_opencl_lxde_all_in_one_180317.img. You just need to replace the dtb with the no-VGA one here and delete the CL_CONTEXT_COMPILER_MODE_ALTERA=3 flag in the init_opencl_16.1.sh file.
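The flag removal described above could be scripted roughly as follows. This is only a sketch: the contents of the init script are not shown in this thread, so it assumes the flag is set on a line of its own (for example with `export`). The helper name `disable_compiler_mode_flag` is hypothetical. It comments the line out instead of deleting it and keeps a `.bak` copy.

```shell
# Hypothetical helper: comment out any line that mentions
# CL_CONTEXT_COMPILER_MODE_ALTERA in the given init script.
# Assumption: the flag is set on a line of its own (e.g. "export ...=3").
disable_compiler_mode_flag() {
    script="$1"
    [ -f "$script" ] || { echo "no such file: $script" >&2; return 1; }
    cp "$script" "$script.bak"            # keep a backup of the original
    # Prefix matching lines with "# ", skipping lines already commented.
    sed '/^[[:space:]]*#/!s/^\(.*CL_CONTEXT_COMPILER_MODE_ALTERA.*\)$/# \1/' \
        "$script.bak" > "$script"
}
```

Usage would be `disable_compiler_mode_flag ./init_opencl_16.1.sh`, followed by sourcing the script in a fresh shell so that a previously exported value does not linger.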

ArjunaDeSoysa commented 6 years ago

Okay I'll try that. Thank you very much @thinkoco.

thinkoco commented 6 years ago

@ArjunaDeSoysa Here is the x2go setting. [screenshot: x2go setting] Check the host IP on the board. Login user: knat, password: knat. Use sudo su to get root privileges.

ArjunaDeSoysa commented 6 years ago

@thinkoco, 1) Should I replace socfpga.dtb or de1soc_socfpga.dtb (inside the de1soc folder) with socfpga_cyclone5_de1soc_novga.dtb?

2) I use a UART-to-USB cable, so the username is knat and the password is knat, right?

I am really sorry for bothering you again and again.

thinkoco commented 6 years ago

@ArjunaDeSoysa Overwrite socfpga.dtb with socfpga_cyclone5_de1soc_novga.dtb, keeping the socfpga.dtb name. The x2go user is knat; use root for UART.
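A minimal sketch of that overwrite, with a backup of the existing file. The directory holding the dtb files is an assumption (presumably the mounted boot partition of the SD card) and is passed as an argument. The helper name `install_novga_dtb` is hypothetical.

```shell
# Hypothetical helper: copy the no-VGA device tree over socfpga.dtb,
# keeping the socfpga.dtb name and a backup of the previous file.
# $1 = directory holding the dtb files (assumed: mounted boot partition)
install_novga_dtb() {
    dir="$1"
    src="$dir/socfpga_cyclone5_de1soc_novga.dtb"
    dst="$dir/socfpga.dtb"
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    if [ -f "$dst" ]; then
        cp "$dst" "$dst.orig"             # back up the current device tree
    fi
    cp "$src" "$dst"                      # overwrite, keeping the file name
}
```

The new device tree only takes effect after a reboot of the board.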

ArjunaDeSoysa commented 6 years ago

@thinkoco, I still get the same error:

root@C5SoC:~# source init_opencl_16.1.sh                                        
root@C5SoC:~# cd cnn                                                            
root@C5SoC:~/cnn# ./run.exe conv.aocx                                           
******************************************                                      
An OpenCL-Based FPGA Accelerator for CNNs                                       
******************************************                                      

Platform: Intel(R) FPGA SDK for OpenCL(TM)                                      
Using 1 device(s)                                                               
  Device 0: de1soc_sharedonly_vga : Cyclone V SoC Development Kit               
Device OpenCL Version: OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 16.1
Device Max Compute Units: 1                                                     
Device Max WorkGroup Size: 2147483647                                           
Device Max WorkItem Size: 2147483647                                            
Device Global Memory Size: 512 MBytes                                           
Device Local Memory Size: 16 KBytes                                             
Device Max Clock Freq: 1000 Mhz                                                 

Loading kernel/binary from file conv.aocx                                       
ERROR: CL_INVALID_BINARY                                                        
Location: ../common/ocl_util.cpp:415                                            
Failed to create program with binary  

The part below never appears at the beginning (not only this time):

61063552 total weights read 
154587 bytes image read 
1024 total output reference read 

I have tried various things. Can you please help me solve this?

ArjunaDeSoysa commented 6 years ago

@thinkoco, you are right. I ran aocl board-xml-test on the host computer; the result is given below:

arjuna@arjuna-Inspiron-3542:~$ source /etc/profile
arjuna@arjuna-Inspiron-3542:~$  aocl board-xml-test 
 board-path       = /home/arjuna/intelFPGA/16.1/hld/board/terasic/de1soc

 board-version    = 13.0

 board-name       = de1soc

 board-default    = de1soc

 board-hw-path    = /home/arjuna/intelFPGA/16.1/hld/board/terasic/de1soc/./de1soc

 board-link-flags = -L/home/arjuna/intelFPGA/16.1/hld/board/terasic/de1soc/arm32/lib -L/home/arjuna/intelFPGA/16.1/hld/host/arm32/lib

 board-libs       = -lalterammdpcie -lstdc++

 board-util-bin   = /home/arjuna/intelFPGA/16.1/hld/board/terasic/de1soc/arm32/bin

The board-version is 13.0. As you said, that is the problem. I downloaded the DE1-SoC OpenCL BSP (Board Support Package) from the Terasic site here. I searched for a 16.1 BSP to download but couldn't find one.

Can you please send me a link where I can download the DE1-SoC OpenCL BSP (Board Support Package) for 16.1?

thinkoco commented 6 years ago

@ArjunaDeSoysa That seems to be the board XML or hardware template version, not the SDK version. Before building and running the host, check that the CL_CONTEXT_EMULATOR_DEVICE_ALTERA flag is not set, and check the other flags too. I have tested PipeCNN on my board with my SD card image and it works. If I set CL_CONTEXT_EMULATOR_DEVICE_ALTERA=1 and run, I get the same error as you. It may not be the same issue as yours, but you can check it.
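One way to do that check from the shell that will start the host is sketched below. The function names are hypothetical. Only the emulator flag named in the comment above is cleared; other CL_CONTEXT_* variables are just listed so they can be reviewed.

```shell
# List exported CL_CONTEXT_* variables, one NAME=value per line.
# Prints nothing if none are set.
list_cl_context_flags() {
    env | grep '^CL_CONTEXT_' || true
}

# Clear the emulator flag before running against real hardware.
clear_emulator_flag() {
    unset CL_CONTEXT_EMULATOR_DEVICE_ALTERA
}
```

Running `list_cl_context_flags` right before `./run.exe conv.aocx` shows whether an init script or an earlier `export` left an emulator setting behind.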

doonny commented 6 years ago

You first need to reprogram the FPGA manually on the DE1 platform.

ArjunaDeSoysa commented 6 years ago

Prof. @doonny, do you mean I should run aocl program /dev/acl0 conv.aocx before ./run.exe conv.aocx?

doonny commented 6 years ago

Yes.

doonny commented 6 years ago

You could also use aocl diagnose to verify that you have the right driver installed.
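The order suggested in the last few comments can be sketched as a small wrapper. The device argument /dev/acl0 is taken from the question above, and run.exe and conv.aocx are assumed to be in the current directory. The names `run_step` and `program_and_run` are hypothetical. With DRY_RUN=1 the commands are printed instead of executed, which is useful for checking the sequence off the board.

```shell
# Run a command, or only print it when DRY_RUN=1.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Check the driver, program the FPGA, then start the host.
# Assumption: /dev/acl0 is the device, as written in this thread.
program_and_run() {
    aocx="${1:-conv.aocx}"
    run_step aocl diagnose || return 1
    run_step aocl program /dev/acl0 "$aocx" || return 1
    run_step ./run.exe "$aocx"
}
```

On the board this would be called as `program_and_run conv.aocx` after sourcing the OpenCL init script.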

ArjunaDeSoysa commented 6 years ago

Yes, it worked. Thank you very much, thinkoco and Prof. doonny.

ghost commented 6 years ago

Dear @thinkoco, I tried your SD card image with 16.1. When I type the command source init_opencl_16.1.sh, it gets stuck. Can you please help me solve this issue?

Fnajjar commented 6 years ago

Hi @doonny, @thinkoco, I am using the OpenCL SDK v17.1 and get the same error. What should I do? [screenshot: binary_error]

ghost commented 6 years ago

Dear @doonny, @thinkoco, @Fnajjar, when I run ./run.exe conv.aocx (the two files generated on the host computer) on my ARM-based DE1-SoC FPGA board, this error occurs:

Loading kernel/binary from file conv.aocx
ERROR: CL_INVALID_BINARY
Location: ../common/ocl_util.cpp:415
Failed to create program with binary

At ../common/ocl_util.cpp:415 the code is:

#ifdef FPGA_DEVICE
  cl_program program = clCreateProgramWithBinary(context, num_devices, devices, binary_lengths,
    (const unsigned char **) binaries.get(), binary_status, &status);
  checkError(status, "Failed to create program with binary");

So I think the problem is in const unsigned char **, because I found some differences between x86 and arm32. Can you please help me solve this problem?

thinkoco commented 6 years ago

@DasunBhanuka If you use the opencl-vga based BSP to build your kernel, see the steps here and be aware of the limits first; you then need to update the rbf file. If you use Terasic's BSP and my SD card image, you can update the dtb with the no-VGA one and update the rbf file with top.rbf (renamed to opencl.rbf). Also, remove the CL_CONTEXT_COMPILER_MODE_ALTERA flag in the init_opencl_16.1.sh file.

thinkoco commented 6 years ago

@Fnajjar Is it a PCIe-based board? You can first test the runtime with the "aocl diagnose" or "aocl program" command.

ghost commented 6 years ago

@thinkoco @doonny @Fnajjar If I rename the folder 'de1soc_sharedonly' to 'de1soc', will the following get resolved?

Loading kernel/binary from file conv.aocx
ERROR: CL_INVALID_BINARY
Location: ../common/ocl_util.cpp:415
Failed to create program with binary

thinkoco commented 6 years ago

@DasunBhanuka No.

  1. Check which device conv.aocx targets with "aocl binedit conv.aocx get .acl.board board.txt && cat board.txt".
  2. Check which device opencl.rbf targets with "aocl diagnose". Both must report the same device name.
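The comparison in the two steps above can be sketched as a helper that works on saved text, so it can run on either machine. It assumes the output of "aocl diagnose" contains the board name verbatim, which this thread does not confirm. The function name `board_matches` is hypothetical. A whole-word match is used so that de1soc_sharedonly is not mistaken for de1soc_sharedonly_vga.

```shell
# Hypothetical helper: succeed if the board name stored in the .aocx
# appears as a whole word in saved "aocl diagnose" output.
# $1 = board.txt written by "aocl binedit conv.aocx get .acl.board board.txt"
# $2 = text file with the saved output of "aocl diagnose"
# Assumption: the diagnose output contains the board name verbatim.
board_matches() {
    board=$(tr -d '[:space:]' < "$1")
    [ -n "$board" ] || { echo "empty board name in $1" >&2; return 2; }
    if grep -qwF -- "$board" "$2"; then
        echo "match: $board"
    else
        echo "mismatch: .aocx was built for '$board'"
        return 1
    fi
}
```

For example, save the board-side output with `aocl diagnose > diag.txt`, copy it to the PC, and run `board_matches board.txt diag.txt`.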

Fnajjar commented 6 years ago

Hey @thinkoco, @DasunBhanuka
I tried the instructions but still get the same error. What do you think?

[screenshot: beniedit]

ghost commented 6 years ago

@thinkoco @Fnajjar @aazz44ss @doonny

I am getting exactly the same error, even though I am running OpenCL 16.1. Can you give us another idea?

thinkoco commented 6 years ago

@Fnajjar There may be some differences in conv.aocx. Check with:

  1. aocl binedit conv.aocx list
  2. "aocl binedit conv.aocx get .acl.board_spec.xml board.xml && cat board.xml" or another file

Fnajjar commented 6 years ago

@thinkoco, actually I ran the commands and it seems that there is no ".acl.board_spec.xml". [screenshot: acl_board]

thinkoco commented 6 years ago

@Fnajjar Nothing in the .acl.board? Is this an emulator .aocx file? So you want to run the emulator for an Arria 10 GX board on a PC?

Fnajjar commented 6 years ago

@thinkoco, yes it is. Is it possible to run the emulator for an Arria 10 GX board on a PC? How can I tell whether there is anything in the .acl.board? I tried "aocl binedit conv.aocx get .acl.board board.xml && cat board.xml" and got nothing.

thinkoco commented 6 years ago

@Fnajjar It seems to be an emulator .aocx file: there is no .acl.fpga.bin in conv.aocx. If it is an emulator build, try "export CL_CONTEXT_EMULATOR_DEVICE_ALTERA=1" before running the host. The command "aocl binedit conv.aocx get .acl.board board.txt" extracts .acl.board to board.txt. I believe the DE5a-Net uses an Arria 10 GX chip, but I have little experience with emulation for this chip.
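The rule of thumb in the comment above can be sketched as a check on a saved section listing, for example one produced with `aocl binedit conv.aocx list > sections.txt`. The exact format of that listing is not shown in this thread, so the helper only searches for the section name. The function name `aocx_kind` is hypothetical.

```shell
# Hypothetical helper: guess whether an .aocx is a hardware or an
# emulator build from a saved "aocl binedit <file> list" output.
# Assumption (from the comment above): a hardware build lists a
# .acl.fpga.bin section and an emulator build does not.
aocx_kind() {
    if grep -qF '.acl.fpga.bin' "$1"; then
        echo hardware
    else
        echo emulator
    fi
}
```

If the result is `emulator`, the host needs the emulator environment variable set before it starts; if it is `hardware`, that variable should be unset.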

Fnajjar commented 6 years ago

Thank you @thinkoco, I did it and this time I get a different error. Is that because there is no "bin_file_r"? What should I do? Please find the screenshot below. [screenshot: emula]

thinkoco commented 6 years ago

@Fnajjar That's OK. You need to download the PipeCNN weight files here.

Fnajjar commented 6 years ago

@thinkoco, actually I have already put them in "/home/najjar/Documents/PipeCNN-master/data", along with data_vgg16 and data_alex. The problem is that they are not being detected or can't be found.

[screenshot: data]

thinkoco commented 6 years ago

@Fnajjar data_alex is also needed. See the host code and the paths:

const char *weight_file_path = "./data/data_alex/weights.dat";
const char *input_file_path = "./data/data_alex/image.dat";
const char *ref_file_path = "./data/data_alex/fc8.dat";
const char *dump_file_path = "./result_dump.txt";
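Because the paths above are relative, they are resolved against the directory the host is started from, not the directory where run.exe is stored. A minimal sketch of a pre-flight check follows; the function name `check_pipecnn_data` is hypothetical and the file list is copied from the host code above.

```shell
# Check the input files the host opens, relative to the current
# working directory. Prints each missing or unreadable file and
# returns non-zero if any is missing.
check_pipecnn_data() {
    missing=0
    for f in ./data/data_alex/weights.dat \
             ./data/data_alex/image.dat \
             ./data/data_alex/fc8.dat; do
        if [ ! -r "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it from the directory in which `./run.exe conv.aocx` will be started. If it reports missing files, either cd to the folder that contains data/ or move run.exe and conv.aocx there.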

ghost commented 6 years ago

Dear @thinkoco,

How do I check this on the FPGA:

aocl binedit conv.aocx list
aocl binedit conv.aocx get .acl.board_spec.xml board.xml && cat board.xml

as you said?

Fnajjar commented 6 years ago

@thinkoco thank you very much for your help, it actually works! I had already uploaded data_alex. I moved the files "conv.aocx" and "run.exe" to the following path, "/home/najjar/Documents/PipeCNN-master", and it worked! [screenshots: 1, 2]

ArjunaDeSoysa commented 6 years ago

@Fnajjar if you are able to complete the HW emulation, you can also help @DasunBhanuka get it working, since you both have the same problem, right? Thank you.

ghost commented 6 years ago

@Fnajjar Can you please explain how you made it work after that error?

Fnajjar commented 6 years ago

@DasunBhanuka, well, when I had the "ERROR: CL_INVALID_BINARY" I just ran "export CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA". After that, it still couldn't find the weights even though I had the data in the right directories. So I moved the two files, "conv.aocx" and "run.exe", into "/home/user/Documents/PipeCNN-master" and it worked!

[screenshot: final]

ghost commented 6 years ago

@Fnajjar Will you also try this on FPGA hardware?

Fnajjar commented 6 years ago

Well, now that it's working on the emulator, I am going to implement it on an Attila Arria 10 GX, but I am still trying to find out how to do that. Do you have any idea?

ghost commented 6 years ago

@Fnajjar I am currently trying to implement it on Altera DE1SoC. If it works, I'll tell you..

thinkoco commented 6 years ago

@DasunBhanuka Sorry for not indicating that aocl diagnose should be run on the DE1-SoC and aocl binedit on the PC.

ghost commented 6 years ago

Dear @thinkoco, can you please explain how to run aocl diagnose on the DE1-SoC and aocl binedit? It would be a great help.

thinkoco commented 6 years ago

@DasunBhanuka

  1. Run "aocl diagnose" on the DE1-SoC after sourcing the OpenCL init shell file.
  2. Then run "aocl binedit conv.aocx list" and "aocl binedit conv.aocx get .acl.board board.txt && cat board.txt" on the PC.
  3. The device names from step 1 and step 2 should be the same.

ghost commented 6 years ago

Dear @thinkoco, I did as you said. Here's the result: [screenshot from 2018-04-24 22-17-10]

ghost commented 6 years ago

Dear @thinkoco, why do you think I get the error "No such file or directory"? Am I missing something here?

thinkoco commented 6 years ago

@DasunBhanuka Hi, first cd to the folder that contains conv.aocx. Also check the read permissions of this file.
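Both checks can be sketched in one small helper that reports which of the two conditions fails. The function name `explain_access` is hypothetical.

```shell
# Hypothetical helper: report why a file such as conv.aocx cannot be
# opened from the current directory.
explain_access() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "not found from $(pwd): $f"
    elif [ ! -r "$f" ]; then
        echo "no read permission: $f"
    else
        echo "ok: $f"
    fi
}
```

Calling `explain_access conv.aocx` before the aocl binedit command shows whether the shell is in the wrong directory or the file permissions need changing.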

ghost commented 6 years ago

@thinkoco Can you send me the files you created for the DE1-SoC, conv.aocx and run.exe? Will they be compatible with my board?

doonny commented 6 years ago

I suggest first trying to run the official examples from Terasic. If the examples are ok, PipeCNN should be the same.