doonny / PipeCNN

An OpenCL-based FPGA Accelerator for Convolutional Neural Networks
Apache License 2.0

DE1-SoC parameters #95

Closed: 0x47 closed this issue 6 years ago

0x47 commented 6 years ago

Hello,

The README.md file states that the execution time for AlexNet on the DE1-SoC is 150 ms. Can you provide the hw_param.cl configuration that achieves 150 ms? I tried VEC_SIZE = 4, LANE_NUM = 4, CONV_GP_SIZE_X = 7 and PIPE_DEPTH = 6, but the best I can achieve is 351.607 ms. Increasing LANE_NUM further (e.g. to 16, as in the default setting) is not possible because of the limited hardware resources of the DE1-SoC.

aazz44ss commented 6 years ago

It also depends on the Fmax and the batch size you use. What is your Fmax?

0x47 commented 6 years ago

The batch size is 1 and the Fmax is 139.56 MHz, taken from the quartus_sh_compile.log file.

EDIT: I also found this old reply: https://github.com/doonny/PipeCNN/issues/37#issuecomment-355522000 which states:

VEC_SIZE = 8
LANE_NUM = 8
CONV_GP_SIZE_X = 7

However, with these values the estimated resource usage is already well above the device limits:

+--------------------------------------------------------------------+
; Estimated Resource Usage Summary                                   ;
+----------------------------------------+---------------------------+
; Resource                               + Usage                     ;
+----------------------------------------+---------------------------+
; Logic utilization                      ;  148%                     ;
; ALUTs                                  ;   98%                     ;
; Dedicated logic registers              ;   60%                     ;
; Memory blocks                          ;   12%                     ;
; DSP blocks                             ;   40%                     ;
+----------------------------------------+---------------------------;
aoc: First stage compilation completed successfully.
Compiling for FPGA. This process may take a long time, please be patient.
Error (170012): Fitter requires 3679 LABs to implement the design, but the device contains only 3207 LABs

0x47 commented 6 years ago

It turns out that removing the --profile flag from the Makefile solves this issue: the design then fits with VEC_SIZE = 8 and LANE_NUM = 8. Thanks and have a nice day!
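
As a rough sanity check on the numbers in this thread, the sketch below estimates a compute-bound lower limit on inference time from VEC_SIZE, LANE_NUM and Fmax. It rests on two assumptions that are not stated in the thread: that the convolution kernel completes VEC_SIZE * LANE_NUM multiply-accumulates (MACs) per clock cycle, and that one AlexNet inference needs roughly 724 million MACs (a commonly cited approximate figure). The function name `ideal_latency_ms` is hypothetical and is not part of PipeCNN.

```python
# Rough compute-bound estimate of inference latency from hw_param.cl settings.
# Assumptions (NOT from the thread):
#   - the kernel completes VEC_SIZE * LANE_NUM MACs per clock cycle
#   - AlexNet needs about 724 million MACs per image (approximate figure)
# Memory stalls, pooling/LRN layers and host overhead are ignored.

ALEXNET_MACS = 724e6  # assumed approximate MAC count for one inference


def ideal_latency_ms(vec_size, lane_num, fmax_mhz, macs=ALEXNET_MACS):
    """Return the idealized latency in milliseconds for one image."""
    macs_per_cycle = vec_size * lane_num
    macs_per_second = macs_per_cycle * fmax_mhz * 1e6
    return macs / macs_per_second * 1e3


if __name__ == "__main__":
    # 139.56 MHz is the Fmax reported for the 4x4 build; reusing it for the
    # 8x8 build is an assumption, since that build's Fmax is not given.
    fmax = 139.56
    for vec, lane in [(4, 4), (8, 8)]:
        estimate = ideal_latency_ms(vec, lane, fmax)
        print(f"VEC_SIZE={vec} LANE_NUM={lane}: {estimate:.1f} ms")
```

Under these assumptions the 4x4 configuration comes out at about 324 ms, which is in the same range as the measured 351.607 ms, and the 8x8 configuration at about 81 ms if the same Fmax were reached. This suggests that the 150 ms figure in the README is plausible only with the larger VEC_SIZE and LANE_NUM values, though the real Fmax and memory behavior of that build would shift the result.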