doonny / PipeCNN

An OpenCL-based FPGA Accelerator for Convolutional Neural Networks
Apache License 2.0

The configuration of DE5a in the paper and RAM needed for synthesis #66

Closed (myih closed 5 years ago)

myih commented 6 years ago

Hi professor @doonny

I'm curious about the configuration (vector, lane) you used on the DE5a to achieve 5 ms for AlexNet in the paper. I tried to test it myself on my Arria10GX board, but with (vector=16, lane=64 or 48) synthesis failed because my 32 GB of RAM was not enough. Also, why didn't you use more of the DE5a's resources when there were still resources left?

Thank you!

doonny commented 6 years ago

For an Arria 10 device, your PC needs at least 64 GB of RAM. Also, try compiling the design in flat mode, which is what Intel recommends.

myih commented 6 years ago

Thanks @doonny. I compiled the aocx on another machine with more RAM. With (vector=16, lane=64), the runtime for AlexNet is 11.1 ms on my Arria10GX. I used flat mode with no profiling. Could you tell me which configuration achieves 5 ms for AlexNet? Thank you!

doonny commented 6 years ago

@myih Please use multiple batches, for example, batch-16.
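To make the batching suggestion concrete: with batching, the figure usually compared is the average time per image, that is, the total time for the batch divided by the batch size. The sketch below is only arithmetic on numbers already in this thread (the 5 ms target and batch-16). The function names are hypothetical, and it assumes the paper's 5 ms is a per-image average over a batch, which the thread suggests but does not state outright.

```python
def per_image_ms(total_batch_ms: float, batch_size: int) -> float:
    """Average time per image when the whole batch takes total_batch_ms."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return total_batch_ms / batch_size


def batch_budget_ms(target_per_image_ms: float, batch_size: int) -> float:
    """Total batch time allowed in order to hit a per-image average."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return target_per_image_ms * batch_size


# Numbers from the thread: a 5 ms per-image target and batch-16.
budget = batch_budget_ms(5.0, 16)
print(budget)                    # 80.0
print(per_image_ms(budget, 16))  # 5.0
```

In other words, under this assumption a batch of 16 images would need to finish in about 80 ms in total to average 5 ms per image.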

laski007 commented 6 years ago

Dear @myih, would you mind telling me which command I should use to specify "flat mode" when I compile? Thank you so much!

myih commented 6 years ago

@laski007 Use the flag --bsp-flow=flat
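As a rough illustration of where the flag goes, here is a minimal sketch that only assembles a command line for the aoc offline compiler; it does not run anything. The flag spelling is copied from the comment above and may differ between SDK versions. The helper name `aoc_command` and the kernel file name are hypothetical placeholders, and a real build will likely need further options (such as the board) that this thread does not give.

```python
import shlex
from typing import List


def aoc_command(kernel_file: str, flat: bool = True) -> List[str]:
    """Build an argument list for the aoc offline compiler (not executed here)."""
    cmd = ["aoc"]
    if flat:
        # Flag spelling taken from this thread; check your SDK version's docs.
        cmd.append("--bsp-flow=flat")
    cmd.append(kernel_file)
    return cmd


# "conv_pipe.cl" is a placeholder kernel file name used for illustration.
cmd = aoc_command("conv_pipe.cl")
print(shlex.join(cmd))  # aoc --bsp-flow=flat conv_pipe.cl
```

The resulting list could be passed to `subprocess.run` once the remaining options for your board are filled in.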

laski007 commented 6 years ago

Dear @myih, thank you so much! By the way, would you mind explaining what "flat mode" is and what its benefits are, or pointing me to some links about it?

laski007 commented 6 years ago

Dear Prof. Wang @doonny, would you mind telling me what flat mode is and what its benefits are? Many thanks.

CrazyBingo commented 6 years ago

@laski007 Did you ever resolve it? With Altera, if I set "VENDOR = amd; PLATFORM = x86", the compile result looks like this: [image]

But if I set "VENDOR = altera; PLATFORM = arm32", the compile result looks like this: [image]

Either way, it always says that the *.lib cannot be found, even though the file is clearly right here: [image]

laski007 commented 6 years ago

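@CrazyBingo Failing to find the library is an environment variable configuration problem. I think you probably need to add that path to your system's environment variables.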
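One way to check this kind of setup is to test whether the directory holding the library actually appears in the relevant environment variable. The sketch below is a generic check under stated assumptions: the thread does not say which variable the toolchain reads, so `LIB` and the directory paths are hypothetical placeholders, as is the helper name `dir_listed_in`.

```python
import os
from typing import Mapping, Optional


def dir_listed_in(var_name: str, directory: str,
                  env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if directory is one of the entries in a path-list variable."""
    env = os.environ if env is None else env
    wanted = os.path.normcase(os.path.normpath(directory))
    # Path-list variables separate entries with os.pathsep (":" or ";").
    for entry in env.get(var_name, "").split(os.pathsep):
        if entry and os.path.normcase(os.path.normpath(entry)) == wanted:
            return True
    return False


# Hypothetical variable name and directories, for illustration only.
fake_env = {"LIB": os.pathsep.join(["/opt/sdk/lib", "/usr/local/lib"])}
print(dir_listed_in("LIB", "/opt/sdk/lib/", fake_env))  # True
print(dir_listed_in("LIB", "/missing/lib", fake_env))   # False
```

Calling `dir_listed_in` without the `env` argument checks the real environment of the current process.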