Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

DPU IP run failed - Vivado flow 2021.2 #1377

Open hudini87 opened 11 months ago

hudini87 commented 11 months ago

Hi all,

Please help with an error that occurs while running the xdputil benchmark on resnet50.xmodel:

root@petalinux:~# xdputil benchmark /usr/sbin/models/resnet50.xmodel 1

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0309 12:37:45.220671 1130 dpu_controller_dnndk.cpp:190] Check failed: retval == 0 (-1 vs. 0) run dpu failed.
Check failure stack trace:
/usr/bin/xdputil: line 20: 1129 Aborted /usr/bin/python3 -m xdputil $*
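The fatal line follows the usual glog prefix layout (severity letter, month and day, time, thread id, source file and line, then the message). A minimal sketch for pulling those fields out of such a line follows; the function name `parse_glog_line` is hypothetical and is not part of xdputil or VART.

```python
import re

# glog prefix: <severity><mmdd> <hh:mm:ss.uuuuuu> <thread id> <file>:<line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_glog_line(line):
    """Return the prefix fields of one glog line, or None if it does not match."""
    match = GLOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["thread"] = int(fields["thread"])
    fields["line"] = int(fields["line"])
    return fields


# The fatal line quoted in this post.
sample = ("F0309 12:37:45.220671 1130 dpu_controller_dnndk.cpp:190] "
          "Check failed: retval == 0 (-1 vs. 0) run dpu failed.")
record = parse_glog_line(sample)
print(record["severity"], record["file"], record["line"], record["message"])
```

The severity `F` marks a fatal check, which is why the Python wrapper process is aborted right afterwards.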

Environment: Vivado + PetaLinux 2021.2, DPU IP Vivado flow with Vitis AI 2.0, kernel 5.10.

The target is a Zynq UltraScale+ MPSoC implementing DPUCZDX8G 3.4, as required by the DPU compatibility table.

I am trying, so far unsuccessfully, to run the benchmark on resnet50.xmodel, which has the same DPU fingerprint.

The Vivado design implements a single DPUCZDX8G instance with 1 core and meets timing, as required and explained at https://github.com/Xilinx/Vitis-AI/tree/2.0/dsa/DPU-TRD/prj/Vivado and https://docs.xilinx.com/r/3.4-English/pg338-dpu/Introduction?tocId=~72l0MosWV8p9MbkDlnw8Q .

[Image: DPU_IP_1_CORE]

My pl.dtsi file, as produced by the Vitis device tree generator:

DPU_dpuczdx8g_0: dpuczdx8g@8f000000 {
    /* This is a place holder node for a custom IP, user may need to update the entries */
    clock-names = "s_axi_aclk", "dpu_2x_clk", "m_axi_dpu_aclk";
    clocks = <&zynqmp_clk 71>, <&misc_clk_6>, <&misc_clk_7>;
    compatible = "xlnx,dpuczdx8g-3.4";
    interrupt-names = "dpu0_interrupt";
    interrupt-parent = <&gic>;
    interrupts = <0 95 4>;
    reg = <0x0 0x8f000000 0x0 0x1000000>;
};
misc_clk_6: misc_clk_6 {
    #clock-cells = <0>;
    clock-frequency = <499995000>;
    compatible = "fixed-clock";
};
misc_clk_7: misc_clk_7 {
    #clock-cells = <0>;
    clock-frequency = <249997500>;
    compatible = "fixed-clock";
};
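To make the node easier to check against the Vivado block design, here is a minimal sketch that pairs each clock-names entry with its clocks entry and decodes the three-cell GIC interrupt specifier. The values are copied from the node above; the helper names `pair_clocks` and `decode_gic_interrupt` are hypothetical and exist only for illustration.

```python
# Values copied from the dpuczdx8g@8f000000 node in pl.dtsi.
CLOCK_NAMES = ["s_axi_aclk", "dpu_2x_clk", "m_axi_dpu_aclk"]
CLOCKS = ["zynqmp_clk 71", "misc_clk_6", "misc_clk_7"]
INTERRUPTS = (0, 95, 4)

# Trigger encodings from the ARM GIC device tree binding (low four flag bits).
TRIGGERS = {1: "rising edge", 2: "falling edge", 4: "level high", 8: "level low"}


def pair_clocks(names, providers):
    """clock-names[i] labels clocks[i], so the pairing is positional."""
    if len(names) != len(providers):
        raise ValueError("clock-names and clocks differ in length")
    return dict(zip(names, providers))


def decode_gic_interrupt(cells):
    """Decode a three-cell GIC specifier: <type number flags>."""
    kind, number, flags = cells
    if kind == 0:
        label, gic_id = "SPI", number + 32  # SPIs start at interrupt ID 32
    elif kind == 1:
        label, gic_id = "PPI", number + 16  # PPIs start at interrupt ID 16
    else:
        raise ValueError(f"unknown interrupt type {kind}")
    return {"type": label, "gic_id": gic_id,
            "trigger": TRIGGERS.get(flags & 0xF, "unknown")}


clock_map = pair_clocks(CLOCK_NAMES, CLOCKS)
irq = decode_gic_interrupt(INTERRUPTS)
print(clock_map)
print(irq)
```

Under these rules `<0 95 4>` is a level-high SPI with GIC interrupt ID 127. On Zynq UltraScale+ that ID should correspond to pl_ps_irq0[6], assuming the PL-to-PS group 0 range of IDs 121 to 128; this mapping is an assumption worth checking against the device documentation and the pin the DPU interrupt is actually wired to in the block design.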

The DPU IP probes as expected:

root@petalinux:~# dmesg | grep dpu
[ 25.904014] dpu: loading out-of-tree module taints kernel.
[ 25.941028] xlnx-dpu: Xilinx Deep Learning Processing Unit driver
[ 25.957157] xlnx-dpu 8f000000.dpuczdx8g: Freq: axilite: 99 MHz, dpu: 249 MHz, dsp: 499 MHz
[ 25.958075] xlnx-dpu 8f000000.dpuczdx8g: found 1 dpu @250MHz and 0 softmax, dpu registered as /dev/dpu successfully
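The frequencies in the probe message can be cross-checked against the fixed-clock values in pl.dtsi. The sketch below assumes that "dsp" in the driver message refers to dpu_2x_clk (misc_clk_6), that "dpu" refers to m_axi_dpu_aclk (misc_clk_7), and that the driver truncates Hz to whole MHz; none of these assumptions is confirmed in this thread.

```python
import re

# clock-frequency values from the fixed-clock nodes in pl.dtsi (Hz).
DTSI_HZ = {"misc_clk_6": 499_995_000, "misc_clk_7": 249_997_500}

# Probe message quoted from dmesg.
PROBE_LINE = ("xlnx-dpu 8f000000.dpuczdx8g: Freq: axilite: 99 MHz, "
              "dpu: 249 MHz, dsp: 499 MHz")

# Assumed mapping from driver labels to device tree clocks.
ASSUMED_MAPPING = {"dsp": "misc_clk_6", "dpu": "misc_clk_7"}


def parse_reported_mhz(line):
    """Collect every '<label>: <n> MHz' pair from the driver message."""
    return {label: int(mhz) for label, mhz in re.findall(r"(\w+): (\d+) MHz", line)}


reported = parse_reported_mhz(PROBE_LINE)
for label, clock in ASSUMED_MAPPING.items():
    expected = DTSI_HZ[clock] // 1_000_000  # truncate Hz to whole MHz
    status = "ok" if reported[label] == expected else "MISMATCH"
    print(f"{label}: reported {reported[label]} MHz, dtsi {expected} MHz -> {status}")

# The clock named dpu_2x_clk should run at twice the DPU clock.
print("2x ratio holds:", DTSI_HZ["misc_clk_6"] == 2 * DTSI_HZ["misc_clk_7"])
```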

root@petalinux:~# show_dpu

device_core_id=0 device= 0 core = 0 fingerprint = 0x1000020f6014407 batch = 1 full_cu_name=unknown:dpu0

root@petalinux:~# xdputil query
{
    "DPU IP Spec":{},
    "VAI Version":{
        "libvart-runner.so":"Xilinx vart-runner Version: 2.0.0-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2023-12-05-15:57:46 ",
        "libvitis_ai_library-dpu_task.so":"Xilinx vitis_ai_library dpu_task Version: 2.0.0-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2022-01-20 07:11:10 [UTC] ",
        "libxir.so":"Xilinx xir Version: xir-d02dcb6041663dbc7ecbc0c6af9fafa087a789de 2023-09-26-17:06:07",
        "target_factory":"target-factory.2.0.0 d02dcb6041663dbc7ecbc0c6af9fafa087a789de"
    },
    "kernels":[
        {
            "DPU Arch":"DPUCZDX8G_ISA1_B4096_0101001FF6014407",
            "DPU Frequency (MHz)":250,
            "cu_addr":"0x8f000000",
            "cu_idx":0,
            "fingerprint":"0x101001ff6014407",
            "is_vivado_flow":true,
            "name":"DPU Core 0"
        }
    ]
}
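The outputs quoted in this post contain more than one fingerprint string: the value printed by show_dpu, the TARGET_ID in the driver's register dump, and the fingerprint reported by xdputil query. A minimal sketch for comparing them numerically follows. Whether a difference between them is related to the failure is not established in this thread, so the result is only a prompt for further checking.

```python
def fingerprint(text):
    """Parse a hex fingerprint, with or without 0x prefix or leading zeros."""
    return int(text, 16)


# Values copied verbatim from the outputs in this post.
show_dpu = fingerprint("0x1000020f6014407")
dmesg_target_id = fingerprint("01000020f6014407")
xdputil_query = fingerprint("0x101001ff6014407")

print("show_dpu == TARGET_ID:     ", show_dpu == dmesg_target_id)
print("show_dpu == xdputil query: ", show_dpu == xdputil_query)

# XOR highlights which bits differ between the two values.
difference = show_dpu ^ xdputil_query
print("differing bits: 0x%016x" % difference)
```

Comparing as integers avoids false mismatches caused by a missing `0x` prefix or a leading zero.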

root@petalinux:~# xdputil status
{
    "kernels":[
        {
            "addrs_registers":{
                "dpu0_base_addr_0":"0x0",
                "dpu0_base_addr_1":"0x0",
                "dpu0_base_addr_2":"0x0",
                "dpu0_base_addr_3":"0x0",
                "dpu0_base_addr_4":"0x0",
                "dpu0_base_addr_5":"0x0",
                "dpu0_base_addr_6":"0x0",
                "dpu0_base_addr_7":"0x0"
            },
            "common_registers":{
                "ADDR_CODE":"0x0",
                "CONV END":0,
                "CONV START":0,
                "HP_ARCOUNT_MAX":7,
                "HP_ARLEN":15,
                "HP_AWCOUNT_MAX":7,
                "HP_AWLEN":15,
                "LOAD END":0,
                "LOAD START":0,
                "MISC END":0,
                "MISC START":0,
                "PROF_NUM":0,
                "PROF_VALUE":0,
                "SAVE END":0,
                "SAVE START":0
            },
            "name":"DPU Registers Core 0"
        }
    ]
}

Running the following command leads to the results described below:

root@petalinux:~# xdputil benchmark /usr/sbin/models/resnet50.xmodel 1

We get this error:

WARNING: Logging before InitGoogleLogging() is written to STDERR
F0309 12:37:45.220671 1130 dpu_controller_dnndk.cpp:190] Check failed: retval == 0 (-1 vs. 0) run dpu failed.
Check failure stack trace:
/usr/bin/xdputil: line 20: 1129 Aborted /usr/bin/python3 -m xdputil $*

The xdputil status fields are different now:

root@petalinux:~# xdputil status
{
    "kernels":[
        {
            "addrs_registers":{
                "dpu0_base_addr_0":"0x10300000",
                "dpu0_base_addr_1":"0x10c00000",
                "dpu0_base_addr_2":"0x12300000",
                "dpu0_base_addr_3":"0x10120000",
                "dpu0_base_addr_4":"0x0",
                "dpu0_base_addr_5":"0x0",
                "dpu0_base_addr_6":"0x0",
                "dpu0_base_addr_7":"0x0"
            },
            "common_registers":{
                "ADDR_CODE":"0x10200",
                "CONV END":2,
                "CONV START":2,
                "HP_ARCOUNT_MAX":7,
                "HP_ARLEN":15,
                "HP_AWCOUNT_MAX":7,
                "HP_AWLEN":15,
                "LOAD END":33,
                "LOAD START":33,
                "MISC END":0,
                "MISC START":0,
                "PROF_NUM":0,
                "PROF_VALUE":0,
                "SAVE END":0,
                "SAVE START":0
            },
            "name":"DPU Registers Core 0"
        }
    ]
}
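To see exactly which common registers moved between the two xdputil status snapshots, a small diff helps. This is a minimal sketch using the common_registers values copied from the two outputs in this post; `changed_registers` is a hypothetical helper, not an xdputil feature.

```python
# common_registers before running the benchmark.
before = {
    "ADDR_CODE": "0x0", "CONV END": 0, "CONV START": 0,
    "HP_ARCOUNT_MAX": 7, "HP_ARLEN": 15, "HP_AWCOUNT_MAX": 7, "HP_AWLEN": 15,
    "LOAD END": 0, "LOAD START": 0, "MISC END": 0, "MISC START": 0,
    "PROF_NUM": 0, "PROF_VALUE": 0, "SAVE END": 0, "SAVE START": 0,
}

# common_registers after the failed benchmark run.
after = {
    "ADDR_CODE": "0x10200", "CONV END": 2, "CONV START": 2,
    "HP_ARCOUNT_MAX": 7, "HP_ARLEN": 15, "HP_AWCOUNT_MAX": 7, "HP_AWLEN": 15,
    "LOAD END": 33, "LOAD START": 33, "MISC END": 0, "MISC START": 0,
    "PROF_NUM": 0, "PROF_VALUE": 0, "SAVE END": 0, "SAVE START": 0,
}


def changed_registers(old, new):
    """Return {name: (old value, new value)} for registers present in both snapshots."""
    shared = sorted(set(old) & set(new))
    return {name: (old[name], new[name]) for name in shared if old[name] != new[name]}


changes = changed_registers(before, after)
for name, (old_value, new_value) in changes.items():
    print(f"{name}: {old_value} -> {new_value}")
```

Only the code address and the LOAD and CONV counters change, while the SAVE counters stay at zero. Reading the names literally, the run appears to have stopped before any save operation was issued, though the exact semantics of these counters are an assumption here.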

Running dmesg | grep DPU then shows a timeout:

[ 198.640345] xlnx-dpu 8f000000.dpuczdx8g: xlnx_dpu_ioctl PID=1130 DPU=0 CPU=3 Comm=python3 waiting
[ 201.770132] xlnx-dpu 8f000000.dpuczdx8g: cu[0] timeout
[ 201.770142] xlnx-dpu 8f000000.dpuczdx8g: ------------[ cut here ]------------
[ 201.770146] xlnx-dpu 8f000000.dpuczdx8g: Dump DPU Registers:
[ 201.770153] xlnx-dpu 8f000000.dpuczdx8g: TARGET_ID 01000020f6014407
[ 201.770158] xlnx-dpu 8f000000.dpuczdx8g: PMU_RST 000000ff
[ 201.770163] xlnx-dpu 8f000000.dpuczdx8g: IP_VER_INFO 34000001
[ 201.770168] xlnx-dpu 8f000000.dpuczdx8g: IP_FREQENCY 000640fa
[ 201.770172] xlnx-dpu 8f000000.dpuczdx8g: INT_STS 00000000
[ 201.770177] xlnx-dpu 8f000000.dpuczdx8g: INT_MSK 00000000
[ 201.770181] xlnx-dpu 8f000000.dpuczdx8g: INT_RAW 00000000
[ 201.770186] xlnx-dpu 8f000000.dpuczdx8g: INT_ICR 00000000
[ 201.770190] xlnx-dpu 8f000000.dpuczdx8g: [CU-0]
[ 201.770195] xlnx-dpu 8f000000.dpuczdx8g: HPBUS 07070f0f
[ 201.770199] xlnx-dpu 8f000000.dpuczdx8g: INSTR 00010200
[ 201.770204] xlnx-dpu 8f000000.dpuczdx8g: START 00000001
[ 201.770209] xlnx-dpu 8f000000.dpuczdx8g: ADDR0 0000000010300000
[ 201.770214] xlnx-dpu 8f000000.dpuczdx8g: ADDR1 0000000010c00000
[ 201.770218] xlnx-dpu 8f000000.dpuczdx8g: ADDR2 0000000012300000
[ 201.770223] xlnx-dpu 8f000000.dpuczdx8g: ADDR3 0000000010120000
[ 201.770228] xlnx-dpu 8f000000.dpuczdx8g: ADDR4 0000000000000000
[ 201.770233] xlnx-dpu 8f000000.dpuczdx8g: ADDR5 0000000000000000
[ 201.770238] xlnx-dpu 8f000000.dpuczdx8g: ADDR6 0000000000000000
[ 201.770243] xlnx-dpu 8f000000.dpuczdx8g: ADDR7 0000000000000000
[ 201.770247] xlnx-dpu 8f000000.dpuczdx8g: PSTART 00000000
[ 201.770252] xlnx-dpu 8f000000.dpuczdx8g: PEND 00000000
[ 201.770256] xlnx-dpu 8f000000.dpuczdx8g: CSTART 00000002
[ 201.770261] xlnx-dpu 8f000000.dpuczdx8g: CEND 00000002
[ 201.770265] xlnx-dpu 8f000000.dpuczdx8g: SSTART 00000000
[ 201.770269] xlnx-dpu 8f000000.dpuczdx8g: SEND 00000000
[ 201.770274] xlnx-dpu 8f000000.dpuczdx8g: LSTART 00000021
[ 201.770279] xlnx-dpu 8f000000.dpuczdx8g: LEND 00000021
[ 201.770284] xlnx-dpu 8f000000.dpuczdx8g: CYCLE 0000000012a7d61a
[ 201.770288] xlnx-dpu 8f000000.dpuczdx8g: AXI 00555454
[ 201.770292] xlnx-dpu 8f000000.dpuczdx8g: [SOFTMAX]
[ 201.770296] xlnx-dpu 8f000000.dpuczdx8g: ------------[ cut here ]------------
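The register dump is easier to reason about once it is parsed into numbers. The sketch below reads a few lines copied from the dump, converts the hex values, and checks two things: whether every start counter has a matching end counter, and whether any interrupt bit was raised. Interpreting the names this way (for example LSTART and LEND as load start and end counts) is an assumption based on the names alone.

```python
import re

# A subset of the dump lines quoted in this post.
DUMP = """\
[ 201.770172] xlnx-dpu 8f000000.dpuczdx8g: INT_STS 00000000
[ 201.770181] xlnx-dpu 8f000000.dpuczdx8g: INT_RAW 00000000
[ 201.770190] xlnx-dpu 8f000000.dpuczdx8g: [CU-0]
[ 201.770204] xlnx-dpu 8f000000.dpuczdx8g: START 00000001
[ 201.770247] xlnx-dpu 8f000000.dpuczdx8g: PSTART 00000000
[ 201.770252] xlnx-dpu 8f000000.dpuczdx8g: PEND 00000000
[ 201.770256] xlnx-dpu 8f000000.dpuczdx8g: CSTART 00000002
[ 201.770261] xlnx-dpu 8f000000.dpuczdx8g: CEND 00000002
[ 201.770265] xlnx-dpu 8f000000.dpuczdx8g: SSTART 00000000
[ 201.770269] xlnx-dpu 8f000000.dpuczdx8g: SEND 00000000
[ 201.770274] xlnx-dpu 8f000000.dpuczdx8g: LSTART 00000021
[ 201.770279] xlnx-dpu 8f000000.dpuczdx8g: LEND 00000021
"""

# Matches '<device>: NAME hexvalue' and skips section markers such as [CU-0].
REG_LINE = re.compile(r"dpuczdx8g: (?P<name>[A-Z_0-9]+)\s+(?P<value>[0-9a-fA-F]+)\s*$")


def parse_dump(text):
    """Map register names to integer values for every matching dump line."""
    regs = {}
    for line in text.splitlines():
        match = REG_LINE.search(line)
        if match:
            regs[match.group("name")] = int(match.group("value"), 16)
    return regs


regs = parse_dump(DUMP)
pairs = [("PSTART", "PEND"), ("CSTART", "CEND"), ("SSTART", "SEND"), ("LSTART", "LEND")]
balanced = {start: regs[start] == regs[end] for start, end in pairs}

print("start/end counters balanced:", balanced)
print("START set:", regs["START"] == 1)
print("interrupt raised:", regs["INT_RAW"] != 0 or regs["INT_STS"] != 0)
```

For the values in this post the counters are balanced, START is still set, and no interrupt bit is raised at the moment the driver times out.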

please help,

best regards

hudini87 commented 11 months ago

Hi @qianglin-xlnx, @quentonh, can you help?

yyx32 commented 7 months ago

Hello @hudini87, did your problem get resolved? Normally, the implementation of ResNet50 requires an SFM (softmax) module, and I see that the number of SFMs is 0 here. You can try changing the number of SFMs to 1. Also, can you explain in detail how your device tree is generated?
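The SFM count mentioned here comes from the driver's probe message quoted earlier in the thread. A minimal sketch for extracting the core counts from that message follows, so the value can be rechecked after rebuilding the hardware; the regular expression is written against the one message in this thread and may not match other driver versions.

```python
import re

# Probe message quoted from dmesg earlier in the thread.
PROBE_LINE = ("xlnx-dpu 8f000000.dpuczdx8g: found 1 dpu @250MHz and 0 softmax, "
              "dpu registered as /dev/dpu successfully")

PROBE_PATTERN = re.compile(r"found (\d+) dpu @(\d+)MHz and (\d+) softmax")


def parse_probe(line):
    """Return DPU core count, DPU frequency in MHz, and softmax core count."""
    match = PROBE_PATTERN.search(line)
    if match is None:
        return None
    dpu_cores, mhz, softmax_cores = (int(group) for group in match.groups())
    return {"dpu_cores": dpu_cores, "dpu_mhz": mhz, "softmax_cores": softmax_cores}


probe = parse_probe(PROBE_LINE)
print(probe)
if probe["softmax_cores"] == 0:
    print("No hardware softmax core was detected by the driver.")
```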