nvdla / hw

RTL, Cmodel, and testbench for NVDLA

How to combine FPGA and nvdla_SW #246

Open xiaoguoer opened 5 years ago

xiaoguoer commented 5 years ago

If I integrate NVDLA into an FPGA successfully, can I just run "nvdla_runtime" to start inference with the FPGA connected to a host PC? Is this feasible? Or is nvdla_sw designed only for simulation? Thank you ~

ghost commented 5 years ago

Running inference on an FPGA is perfectly feasible. See this long thread https://github.com/nvdla/hw/issues/110 as an example. So far, SoCs (like Zynq UltraScale+) and mixed solutions like Virtex+Zynq have been tested. I think there were also attempts to run an FPGA with communication over PCIe. I tested Zynq, and the software integration is relatively straightforward (once you fix various minor issues).

Still, sadly, the main problem is the lack of neural-network compilation tools that target the smaller NVDLA configurations (nv_small or even nv_large). Some teams try to address this on their own, like iCubeWork: https://github.com/icubecorp/nvdla_compiler, which so far supports nv_full only anyway. Without a compiler, you have only a small set of sanity tests that you can run with nvdla_runtime.

nvdla/sw of course also works in a simulation environment, like the QEMU-based Virtual Platform (nvdla/vp).

xiaoguoer commented 5 years ago

@mmaciag thank you for your answer. If a Zynq is used, the steps I have in mind to run inference are:

  1. build kmd/umd against the Linux kernel;
  2. copy the files built in step 1 to the OS running on the Zynq;
  3. run nvdla_runtime.

Am I right?

ghost commented 5 years ago

Well, in general yes, but it may not be as simple as you describe. There are minor things to polish. You definitely want to read about what people have already struggled with at nvdla/sw, for example https://github.com/nvdla/sw/issues/46, https://github.com/nvdla/sw/issues/95 and https://github.com/nvdla/sw/issues/70

xiaoguoer commented 5 years ago

@mmaciag that's definitely what I need, thank you soooo much! The steps in #110 say we should "Add DW02_tree DW_lsb DW_minmax files" and "global define".

The official doc, Designware Components, says "DW02_tree, DW_lsd and DW_minmax can be obtained directly from the EDA vendors". Does Xilinx provide these components?

And how do I do the "global define"?

ghost commented 5 years ago

You should check the Vivado user manual (if you work on Zynq) to see how to set global defines. The DW_xxx components are not needed when the proper macros are defined. See the CMAC implementation for reference: if you define the FPGA macro, the ordinary '*' operator is used instead of the Wallace tree (DW02_tree).
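
To illustrate why swapping the Wallace tree for '*' does not change the result, here is a small behavioural sketch in Python. It is not the NVDLA RTL and does not model the real DW02_tree interface; the function names, the 8-bit default width, and the unsigned operands are assumptions made only for this example. It reduces the partial products with 3:2 carry-save compressors, which is the idea behind a Wallace tree, and compares the outcome with an ordinary multiply.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: turn three addends into a (sum, carry) pair.

    Invariant: a + b + c == sum_bits + carry_bits.
    """
    sum_bits = a ^ b ^ c
    carry_bits = ((a & b) | (a & c) | (b & c)) << 1
    return sum_bits, carry_bits


def partial_products(x, y, width):
    """One shifted copy of x for every set bit of the unsigned multiplier y."""
    return [(x << i) if (y >> i) & 1 else 0 for i in range(width)]


def tree_multiply(x, y, width=8):
    """Unsigned multiply via carry-save reduction of the partial products.

    Hypothetical helper for illustration; width is the bit width of y.
    """
    rows = partial_products(x, y, width)
    while len(rows) > 2:
        reduced = []
        # Compress rows three at a time into two rows each.
        for i in range(0, len(rows) - 2, 3):
            reduced.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fit into a group of three pass through unchanged.
        reduced.extend(rows[len(rows) - len(rows) % 3:])
        rows = reduced
    # Final carry-propagate addition of the remaining (at most two) rows.
    return sum(rows)


if __name__ == "__main__":
    x, y = 173, 94
    print(tree_multiply(x, y) == x * y)
```

In hardware the tree form is a timing and area choice; on an FPGA the synthesis tool usually maps '*' onto its own multiplier resources, which is presumably why the macro takes that path.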

xiaoguoer commented 5 years ago

@mmaciag okay, thanks a lot ~