Xilinx / CHaiDNN

HLS based Deep Neural Network Accelerator Library for Xilinx Ultrascale+ MPSoCs

Build hardware for zc702 #93

Open saraballeri opened 5 years ago

saraballeri commented 5 years ago

Hi, I am trying to build the hardware for the zc702 platform with DIET_CHAI_Z, so that only the convolution accelerator is used. I would like to use CHaiDNN on a Pynq-Z1. However, I get this error:

===>The following messages were generated while processing /home/user/CHaiDNN3/design/build/_sds/p0/vivado/prj/prj.runs/impl_1 :
ERROR: [VPL 30-640] Place Check : This design requires more Slice LUTs cells than are available in the target device. This design requires 62741 of such cell types but only 53200 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device. Please set tcl parameter "drc.disableLUTOverUtilError" to 1 to change this error to warning.
ERROR: [VPL 30-640] Place Check : This design requires more LUT as Logic cells than are available in the target device. This design requires 54951 of such cell types but only 53200 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device. Please set tcl parameter "drc.disableLUTOverUtilError" to 1 to change this error to warning.
ERROR: [VPL 30-640] Place Check : This design requires more RAMB36/FIFO cells than are available in the target device. This design requires 162 of such cell types but only 140 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [VPL 30-640] Place Check : This design requires more RAMB18 and RAMB36/FIFO cells than are available in the target device. This design requires 350 of such cell types but only 280 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [VPL 30-640] Place Check : This design requires more RAMB36E1 cells than are available in the target device. This design requires 162 of such cell types but only 140 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [VPL 30-99] Placer failed with error: 'Implementation Feasibility check failed, Please see the previously displayed individual error or warning messages for more details.' Please review all ERROR and WARNING messages during placement to understand the cause for failure.
ERROR: [VPL 17-69] Command failed: Placer could not place all instances
ERROR: [VPL 60-704] Integration error, problem implementing dynamic region, place_design ERROR
ERROR: [VPL 60-806] Failed to finish platform linker
ERROR: [SdsCompiler 83-5019] Exiting sds++ : Error when calling '/opt/Xilinx/SDx/2018.2/bin/vpl --iprepo /home/user/CHaiDNN3/design/build/_sds/iprepo/repo --iprepo /opt/Xilinx/SDx/2018.2/data/ip/xilinx --platform /opt/Xilinx/SDx/2018.2/platforms/zc702/zc702.xpfm --temp_dir /home/user/CHaiDNN3/design/build/_sds/p0 --output_dir /home/user/CHaiDNN3/design/build/_sds/p0/vpl --input_file /home/user/CHaiDNN3/design/build/_sds/p0/.xsd/top.bd.tcl --target hw --save_temps --kernels XiConvolutionTop:adapter --webtalk_flag SDSoC --xp "param:compiler.skipTimingCheckAndFrequencyScaling=1" --xp "vivado_prop:run.impl_1.{STEPS.OPT_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.impl_1.{STEPS.PLACE_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.impl_1.STEPS.PHYS_OPT_DESIGN.IS_ENABLED=1" --xp "vivado_prop:run.impl_1.{STEPS.PHYS_OPT_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.impl_1.{STEPS.ROUTE_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.synth_1.{STEPS.SYNTH_DESIGN.TCL.PRE}={/home/user/CHaiDNN3/design/build/../conv/scripts/mcps.tcl}" --xp "vivado_prop:run.impl_1.{STEPS.PLACE_DESIGN.TCL.PRE}={/home/user/CHaiDNN3/design/build/../conv/scripts/mcps.tcl}" --xp "param:compiler.deleteDefaultReportConfigs=false" '
sds++ log file saved as /home/user/CHaiDNN3/design/build/_sds/reports/sds.log
ERROR: [SdsCompiler 83-5004] Build failed

Makefile:147: recipe for target 'libxlnxdnn.so' failed
make: *** [libxlnxdnn.so] Error 1
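To see at a glance how far the design exceeds the device, the Place Check messages above can be reduced to a small table. The following is only a sketch in Python: the function name `summarize_overuse` and the regular expression are illustrative assumptions, written against the wording of the `[VPL 30-640]` messages as they appear in this log, not against any documented log format.

```python
import re

# Matches the wording of the "[VPL 30-640] Place Check" messages quoted above.
# The pattern is an assumption based on this log only.
PLACE_CHECK = re.compile(
    r"requires more (?P<cell>.+?) cells than are available.*?"
    r"requires (?P<need>\d+) of such cell types but only (?P<have>\d+) compatible sites",
    re.DOTALL,
)


def summarize_overuse(log_text):
    """Return (cell type, required, available, percent over) per Place Check error."""
    rows = []
    for m in PLACE_CHECK.finditer(log_text):
        need, have = int(m["need"]), int(m["have"])
        percent_over = round(100.0 * (need - have) / have, 1)
        rows.append((m["cell"], need, have, percent_over))
    return rows


# Two of the messages from the log above, shortened to the relevant sentences.
sample = (
    "ERROR: [VPL 30-640] Place Check : This design requires more Slice LUTs cells "
    "than are available in the target device. This design requires 62741 of such "
    "cell types but only 53200 compatible sites are available in the target device. "
    "ERROR: [VPL 30-640] Place Check : This design requires more RAMB36/FIFO cells "
    "than are available in the target device. This design requires 162 of such "
    "cell types but only 140 compatible sites are available in the target device."
)

for cell, need, have, over in summarize_overuse(sample):
    print(f"{cell}: {need} required, {have} available, {over}% over")
```

Feeding the whole sds.log text to `summarize_overuse` instead of `sample` should list every over-used resource type, which helps when judging how much a configuration change would have to save.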

Is there a particular configuration, of weights or other, that allows me to synthesize and implement at least the convolution accelerator on this Zynq device?

Thanks, Sara

saraballeri commented 5 years ago

Hi, can anyone help me?

Thanks, Sara

VishalX commented 5 years ago

@saraballeri,

The current version of CHaiDNN does not support the zc702 device because it has fewer resources.

We are going to make a release soon with support for the Zynq702 device. I can't give you an exact date, but it will probably come out around mid-December.

Thanks!

saraballeri commented 5 years ago

@VishalX, OK, I'll wait then. In the meantime, is there any documentation, paper, or other material that explains how the framework works in more detail? I've seen the opcodes, the XGraph class, the kernels, the scheduler, and more, but it is very difficult to understand how CHaiDNN works from the code alone.

Thanks, Sara

VishalX commented 5 years ago

@saraballeri,

Unfortunately, there is no documentation on the CHaiDNN framework right now. But if you have specific questions, please post them and we'll try to address them as best we can.

saraballeri commented 5 years ago

Hi @VishalX, I'm going to open a separate issue to ask a question about the CHaiDNN workflow. Thanks.

saraballeri commented 5 years ago

Hi @VishalX, do you have any update on the release of CHaiDNN with support for Zynq702 devices? Thank you.

Sara

tuanho27 commented 5 years ago

Hi, while waiting for the new release with support for smaller devices, I tried to rebuild the project for the ZC706, since its resources are not very different from those of the ZCU102, except for DSPs. I finished building "libxlnxdnn.so" with the new DSA platform for the zc706 (the Build CHaiDNN Hardware phase), but when I build an example (AlexNet, VGG), I get errors even though the example includes sds_lib.h and xi_interface.hpp.

Compiling AlexNet

arm-linux-gnueabihf-g++ -std=c++11 -DSDSOC=1 -Wno-write-strings -DAPIMODE=1 -DXI_DIET_CHAI_Z=0 -DXI_DIET_CHAI_ZUPLUS=0 -DPOOL_ENABLE=1 -DDECONV_ENABLE=1 -D HLS_NO_XIL_FPO_LIB -mfpu=neon -L/home/tuanho/Work/references/CHaiDNN/SD_Card/protobuf/arm32/lib -I/home/tuanho/Work/references/CHaiDNN/SD_Card/protobuf/arm32/include -L/home/tuanho/Work/references/CHaiDNN/SD_Card/opencv/arm32/lib -I/home/tuanho/Work/references/CHaiDNN/SD_Card/opencv/arm32/include -I/home/tuanho/Work/references/CHaiDNN/SD_Card/opencv/arm32/include -I/home/tuanho/Xilinx_SDSoC/SDx/2018.2/target/aarch32-linux/include -I/home/tuanho/Xilinx_SDSoC/SDx/2018.2/../../Vivado/2018.2/include -L/home/tuanho/Work/references/CHaiDNN/SD_Card/lib -L/home/tuanho/Work/references/CHaiDNN/SD_Card/cblas/arm32/lib -lopencv_core -lopencv_imgproc -lopencv_imgcodecs -ldl -lrt -lpthread ../example/alexnet_ex.cpp -o alexnet.elf
/tmp/ccBGY8Jp.o: In function `execRoutine(void*)':
resnet50_ex.cpp:(.text+0x60): undefined reference to `xiExec(void, std::vector<void, std::allocator<void> >, std::vector<void, std::allocator<void> >)'
/tmp/ccBGY8Jp.o: In function `main':
resnet50_ex.cpp:(.text+0x580): undefined reference to `xiInit(char, char, char, _io_layer_info, int, bool, std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::cxx11::basic_string<char, std::char_traits, std::allocator >)'
resnet50_ex.cpp:(.text+0x604): undefined reference to `xiInit(char, char, char, _io_layer_info, int, bool, std::cxx11::basic_string<char, std::char_traits, std::allocator >, std::cxx11::basic_string<char, std::char_traits, std::allocator >)'
resnet50_ex.cpp:(.text+0x888): undefined reference to `inputNormalization(std::vector<void, std::allocator<void> >, int, int, char, char, bool, float, float, int, _io_layer_info)'
resnet50_ex.cpp:(.text+0x96c): undefined reference to `sds_alloc_non_cacheable'
resnet50_ex.cpp:(.text+0xa4c): undefined reference to `xiInputRead(std::vector<void, std::allocator<void> >, std::vector<void, std::allocator<void> >, int, _io_layer_info)'
resnet50_ex.cpp:(.text+0xb18): undefined reference to `sds_alloc_non_cacheable'
resnet50_ex.cpp:(.text+0xb2c): undefined reference to `sds_alloc_non_cacheable'
resnet50_ex.cpp:(.text+0xc84): undefined reference to `sds_alloc_non_cacheable'
resnet50_ex.cpp:(.text+0xc98): undefined reference to `sds_alloc_non_cacheable'
resnet50_ex.cpp:(.text+0xf54): undefined reference to `sds_clock_counter'
resnet50_ex.cpp:(.text+0x1074): undefined reference to `sds_clock_counter'
resnet50_ex.cpp:(.text+0x1094): undefined reference to `sds_clock_frequency'
resnet50_ex.cpp:(.text+0x12c0): undefined reference to `xiUnpackOutput(std::vector<void, std::allocator<void> >, std::vector<void, std::allocator<void> >, _kernel_type, int, int)'
resnet50_ex.cpp:(.text+0x132c): undefined reference to `xiUnpackOutput(std::vector<void, std::allocator<void> >, std::vector<void, std::allocator<void> >, _kernel_type, int, int)'
resnet50_ex.cpp:(.text+0x13a0): undefined reference to `outputWrite(char, char, std::vector<void, std::allocator<void> >, int, _io_layer_info, int)'
resnet50_ex.cpp:(.text+0x1414): undefined reference to `outputWrite(char, char, std::vector<void, std::allocator<void> >, int, _io_layer_info, int)'
resnet50_ex.cpp:(.text+0x1434): undefined reference to `xiRelease(void)'
resnet50_ex.cpp:(.text+0x143c): undefined reference to `xiRelease(void*)'
resnet50_ex.cpp:(.text+0x14a4): undefined reference to `sds_free'
resnet50_ex.cpp:(.text+0x1524): undefined reference to `sds_free'
resnet50_ex.cpp:(.text+0x159c): undefined reference to `sds_free'
resnet50_ex.cpp:(.text+0x1614): undefined reference to `sds_free'
resnet50_ex.cpp:(.text+0x168c): undefined reference to `sds_free'
collect2: error: ld returned 1 exit status

Btw, in my case I needed to add -lpthread and -mfpu=neon to avoid other errors. Have you tested with these flags?
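One thing that stands out in the command as posted: it passes several -L search paths, but the only -l flags are for OpenCV, dl, rt and pthread, while the undefined symbols belong to the CHaiDNN API (xiInit, xiExec, ...) and to the sds_* runtime calls. A quick way to check a link command for this is sketched below in Python. The library names in `EXPECTED` are assumptions: `xlnxdnn` is taken from the libxlnxdnn.so target mentioned in this thread, and `sds_lib` is assumed to be the name of the SDSoC runtime library. The `-L/path/to/SD_Card/lib` entry is a placeholder.

```python
import shlex


def linked_libraries(command):
    """Return the library names passed with -l in a compiler command line."""
    return [
        token[2:]
        for token in shlex.split(command)
        if token.startswith("-l") and len(token) > 2
    ]


def missing_libraries(command, expected):
    """Return the expected library names that have no -l flag in the command."""
    present = set(linked_libraries(command))
    return [lib for lib in expected if lib not in present]


# Shortened version of the g++ command from the log above (paths are placeholders).
command = (
    "arm-linux-gnueabihf-g++ -std=c++11 -L/path/to/SD_Card/lib "
    "-lopencv_core -lopencv_imgproc -lopencv_imgcodecs -ldl -lrt -lpthread "
    "../example/alexnet_ex.cpp -o alexnet.elf"
)

# Assumed library names, see the note above.
EXPECTED = ["xlnxdnn", "sds_lib"]

print("linked:", linked_libraries(command))
print("missing:", missing_libraries(command, EXPECTED))
```

If the expected libraries turn out to be missing, the linker flags in the example Makefile are the first place to look.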

Best Regards, TuanH

VishalX commented 5 years ago

Hi @tuanho27

For the sds_* calls, you have to include the correct sds_lib.h. The same goes for linking the sds calls. In the Makefile, update the variables for aarch32. For example, change

ARM_INC := $(SDx_BUILD_PATH)/target/aarch64-linux/include
to
ARM_INC := $(SDx_BUILD_PATH)/target/aarch32-linux/include

I'm not sure of the exact directory name; please check the SDx installation folders.
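Since the directory name can differ between installations, one way to check is to search the SDx tree for sds_lib.h. This is a minimal Python sketch, assuming the header sits under `target/<arch>/include` as in the Makefile lines above. The `SDX_ROOT` environment variable is a hypothetical name used only for this example, and the default path is the install location that appears in the logs earlier in this thread.

```python
import os
from pathlib import Path


def find_sds_headers(sdx_root):
    """Return include directories under <sdx_root>/target/*/include that hold sds_lib.h."""
    root = Path(sdx_root)
    return sorted(p.parent for p in root.glob("target/*/include/sds_lib.h"))


# SDX_ROOT is a hypothetical variable for this sketch; adjust the path to your install.
sdx_root = os.environ.get("SDX_ROOT", "/opt/Xilinx/SDx/2018.2")

for include_dir in find_sds_headers(sdx_root):
    # Print candidate values for the Makefile variable shown above.
    print(f"ARM_INC := {include_dir}")
```

Each printed line is a candidate value for ARM_INC; for a 32-bit Zynq target, pick the one whose architecture directory matches the ARM toolchain in use.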

tuanho27 commented 5 years ago

Hi @VishalX ,

Thank you for your reply. I had already adjusted the SDx include path for ARM-32 before running make; you can see it in my log above (-I/home/tuanho/Xilinx_SDSoC/SDx/2018.2/target/aarch32-linux/include). I ran into this error before in the SDSoC GUI environment with another project and fixed it there.

That is why I'm confused. I expected this step to run smoothly and any error to appear only in the implementation step, due to resource over-utilization (if at all). I still don't know the cause; besides, all the API functions produce errors too, even though the interface header is included. I will check whether the Makefile has already been verified for ARM-32 on your side.

Regards,

tuanho27 commented 5 years ago

Hi,

Well, after trying various things to fix the errors, I checked out the repo again and rebuilt everything from scratch, including the new platform (*.dsa). This time, fortunately, the example build somehow works, which was a surprise. I can now test some pre-trained models successfully :) Also, when I tried to put all layers (conv, deconv, pool) in hardware, the design exceeded the available resources, as I had guessed. I then used the option DIET_CHAI_Z = 1, and everything works now. I have attached some images:

Over resource on ZC706: selection_146

Build successful with DIET_CHAI_Z = 1 selection_147

Testing googlenet: selection_148

Thanks to your team for the great work. Regards,

averr5 commented 5 years ago

@VishalX I am trying to build the hardware for DIET_CHAI_Z targeting the ZedBoard. I get the following errors:

===>The following messages were generated while processing /home/ave/Desktop/CHaiDNN/design/build/_sds/p0/vivado/prj/prj.runs/impl_1 :
ERROR: [VPL 30-640] Place Check : This design requires more RAMB36/FIFO cells than are available in the target device. This design requires 162 of such cell types but only 140 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [VPL 30-640] Place Check : This design requires more RAMB18 and RAMB36/FIFO cells than are available in the target device. This design requires 350 of such cell types but only 280 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [VPL 30-640] Place Check : This design requires more RAMB36E1 cells than are available in the target device. This design requires 162 of such cell types but only 140 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.
ERROR: [VPL 30-99] Placer failed with error: 'Implementation Feasibility check failed, Please see the previously displayed individual error or warning messages for more details.' Please review all ERROR and WARNING messages during placement to understand the cause for failure.
ERROR: [VPL 17-69] Command failed: Placer could not place all instances
ERROR: [VPL 60-704] Integration error, problem implementing dynamic region, place_design ERROR, please look at the run log file '/home/ave/Desktop/CHaiDNN/design/build/_sds/p0/vivado/prj/prj.runs/impl_1/runme.log' for more information
ERROR: [VPL 60-806] Failed to finish platform linker
ERROR: [SdsCompiler 83-5019] Exiting sds++ : Error when calling '/opt/Xilinx/SDx/2018.3/bin/vpl --iprepo /home/ave/Desktop/CHaiDNN/design/build/_sds/iprepo/repo --iprepo /opt/Xilinx/SDx/2018.3/data/ip/xilinx --platform /home/ave/Desktop/CHaiDNN/zed/zed.xpfm --temp_dir /home/ave/Desktop/CHaiDNN/design/build/_sds/p0 --output_dir /home/ave/Desktop/CHaiDNN/design/build/_sds/p0/vpl --input_file /home/ave/Desktop/CHaiDNN/design/build/_sds/p0/.xsd/top.bd.tcl --target hw --save_temps --kernels XiConvolutionTop:adapter --webtalk_flag SDSoC --xp "param:compiler.skipTimingCheckAndFrequencyScaling=1" --xp "vivado_prop:run.impl_1.{STEPS.OPT_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.impl_1.{STEPS.PLACE_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.impl_1.STEPS.PHYS_OPT_DESIGN.IS_ENABLED=1" --xp "vivado_prop:run.impl_1.{STEPS.PHYS_OPT_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.impl_1.{STEPS.ROUTE_DESIGN.ARGS.MORE OPTIONS}={-directive Explore}" --xp "vivado_prop:run.synth_1.{STEPS.SYNTH_DESIGN.TCL.PRE}={/home/ave/Desktop/CHaiDNN/design/conv/scripts/mcps.tcl}" --xp "vivado_prop:run.impl_1.{STEPS.PLACE_DESIGN.TCL.PRE}={/home/ave/Desktop/CHaiDNN/design/conv/scripts/mcps.tcl}" --xp "param:compiler.deleteDefaultReportConfigs=false" '
sds++ log file saved as /home/ave/Desktop/CHaiDNN/design/build/_sds/reports/sds.log
ERROR: [SdsCompiler 83-5004] Build failed

Makefile:147: recipe for target 'libxlnxdnn.so' failed
make: *** [libxlnxdnn.so] Error 1

I understand that the resources on the ZedBoard are not sufficient. Is there any way around this?

xiaobeiyan commented 2 years ago

Hi @tuanho27, thanks for the comment. I am also using the zc706 to build the network in SDx. I am facing errors such as:
ERROR: [DMAnalysis 83-4416] Specified sys_port: ps_e_S_AXI_HP0_FPD for in1 cannot be found in the platform!
ERROR: [DMAnalysis 83-4416] Specified sys_port: ps_e_S_AXI_HP1_FPD for in2 cannot be found in the platform!
ERROR: [DMAnalysis 83-4416] Specified sys_port: ps_e_S_AXI_HP2_FPD for out1 cannot be found in the platform!
I was wondering whether you faced the same errors and whether you could give me some help. Thanks!
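These DMAnalysis messages say that the named sys_port values do not exist in the selected platform. In SDSoC, such ports are typically assigned in the source with `#pragma SDS data sys_port(...)`, and the port names are platform-specific, so a first step is to list where they are set. The Python sketch below does that for a piece of source text. It assumes the ports are assigned with this pragma somewhere in the CHaiDNN sources, and the sample pragmas are hypothetical lines built from the names in the error messages, not copied from the repository.

```python
import re

# Matches '#pragma SDS data sys_port(arg:port, ...)' and captures the argument list.
SYS_PORT = re.compile(r"#pragma\s+SDS\s+data\s+sys_port\s*\((?P<body>[^)]*)\)")


def sys_port_assignments(source_text):
    """Return (argument, port) pairs found in sys_port pragmas of the given source."""
    pairs = []
    for m in SYS_PORT.finditer(source_text):
        for item in m["body"].split(","):
            arg, sep, port = item.partition(":")
            if sep:  # skip entries without an 'arg:port' form
                pairs.append((arg.strip(), port.strip()))
    return pairs


# Hypothetical pragmas using the argument and port names from the errors above.
sample = """
#pragma SDS data sys_port(in1:ps_e_S_AXI_HP0_FPD)
#pragma SDS data sys_port(in2:ps_e_S_AXI_HP1_FPD, out1:ps_e_S_AXI_HP2_FPD)
"""

for arg, port in sys_port_assignments(sample):
    print(f"{arg} -> {port}")
```

Running the same function over the accelerator headers shows which arguments are tied to which port names, so they can be compared against the ports that the zc706 platform actually provides.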