Open Ali-Flt opened 2 years ago
The AIE_METADATA err: -22 message can be safely ignored. It's an XRT code issue.
The wrong result can have many causes. I'm checking with the Vitis-AI experts.
I went through all 4 steps of the tutorial again without changing anything. The only change was that I used Vitis AI branch 1.3.2 instead of v1.3, and the results are still incorrect.
This is getting really frustrating for me because I'm doing exactly what the tutorial tells me to do. I don't even get any errors, but the model is not giving proper outputs.
I also installed Vitis 2021.2 and tried the whole flow with Vitis AI 1.4 (master). But the results were the same.
I forgot to mention that instead of writing the SD card from the sd_card.img file, I extracted the zcu104_custom_plnx/images/linux/rootfs.tar.gz file into the second partition and copied all the files in the dpu_trd_system/Hardware/package/sd_card/ folder into the first SD card partition (the boot partition). I assume this shouldn't change the application result, right?
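One way to rule out the manual copy as a cause is to confirm that the files on the boot partition are byte-for-byte identical to the ones in the package folder. The following is a minimal sketch of such a check, not part of the tutorial. The function names are made up for illustration, and the mount point of the boot partition is whatever your host uses.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_dirs(source_dir, target_dir):
    """Return (missing, changed) for files in source_dir versus target_dir.

    missing: files present in source_dir but absent from target_dir.
    changed: files present in both whose content differs.
    """
    src = file_digests(source_dir)
    dst = file_digests(target_dir)
    missing = sorted(name for name in src if name not in dst)
    changed = sorted(name for name in src if name in dst and src[name] != dst[name])
    return missing, changed
```

For example, `compare_dirs("dpu_trd_system/Hardware/package/sd_card", "/media/boot")` would compare the package folder against a boot partition mounted at the hypothetical path `/media/boot`. Two empty lists mean the copy matches the package output. This only covers the boot partition; it says nothing about ownership or permissions in the extracted rootfs.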
I suspect that, because I downloaded the Vitis AI Git repository separately and added its path to Vitis instead of letting Vitis download it, the DPU-TRD application template files are not loaded properly. I have done everything else exactly as in the tutorial. Could this be the cause of the issue? Here is the list of warnings I get after the Vitis application project has been built successfully:
Also here is the error I get when I try to download the Vitis AI library in the Vitis IDE:
I've tried downloading other Git repositories as libraries in the Vitis IDE without error, but there seems to be a problem with the Vitis AI repository. Please tell me how to fix this error so that I can see whether the issue is caused by the manual download.
OS: Ubuntu 18.04 LTS; Vitis 2021.2; Vitis AI 1.4 (master)
I also found another problem. When I create the DPU Kernel application project using the official Xilinx Vitis platform for ZCU104, the system builds without errors and the ResNet model works too. There are some differences in the application project depending on whether I create it on top of the platform provided by Xilinx or on top of my own custom platform. For example, in the section below, the hw_link configurations are loaded automatically in the first case but not in the second.
Why are these configurations not loaded in the application project on my custom platform? Can you tell me where the script that generates the application project files is located, and what could cause some files not to be loaded?
The v++ configuration settings are set by https://github.com/Xilinx/Vitis-AI/blob/v1.3/dsa/DPU-TRD/prj/Vitis/config_file/prj_config_gui and this file is associated with the application project by https://github.com/Xilinx/Vitis-AI/blob/v1.3/dsa/DPU-TRD/description.json
In description.json, the "ldclflags" : "--config PROJECT/src/prj/Vitis/config_file/prj_config_gui" is set under platform_properties -> zcu104_base.
To work around this issue, you can do any of the following, including changing description.json locally.
If you update description.json, it can look something like this:
"containers": [
  {
    "accelerators": [
      {
        "kernel_type": "user",
        "name": "DPUCZDX8G",
        "num_compute_units" : "2",
        "build_command" : "$(VIVADO) -mode batch -source PROJECT/src/prj/Vitis/scripts_gui/gen_dpu_xo.tcl -tclargs $(PROJECT) $@ $(KERNEL_NAME) $(TARGET) $(DEVICE) $(XSA)",
        "clean_command" : "rm -rf *.log *.jou *.xo packaged_* tmp_kernel_*",
        "dependencies" : [
          "src/prj/Vitis/kernel_xml/dpu/kernel.xml",
          "src/prj/Vitis/scripts_gui/package_dpu_kernel.tcl",
          "src/prj/Vitis/scripts_gui/gen_dpu_xo.tcl",
          "src/prj/Vitis/dpu_conf.vh",
          "src/dpu_ip/Vitis/dpu/hdl/DPUCZDX8G.v",
          "src/dpu_ip/Vitis/dpu/inc/arch_def.vh",
          "src/dpu_ip/Vitis/dpu/xdc/timing_clocks.xdc",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/ttcl/fingerprint_json.ttcl",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/hdl/DPUCZDX8G_v3_3_0_vl_dpu.sv",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/inc/function.vh",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/inc/arch_para.vh"
        ]
      },
      {
        "kernel_type": "user",
        "name": "sfm_xrt_top",
        "build_command" : "$(VIVADO) -mode batch -source PROJECT/src/prj/Vitis/scripts_gui/gen_sfm_xo.tcl -tclargs $(PROJECT) $@ $(KERNEL_NAME) $(TARGET) $(DEVICE) $(XSA)",
        "dependencies" : [
          "src/prj/Vitis/kernel_xml/sfm/kernel.xml",
          "src/prj/Vitis/scripts_gui/package_sfm_kernel.tcl",
          "src/prj/Vitis/scripts_gui/gen_sfm_xo.tcl",
          "src/dpu_ip/Vitis/sfm/hdl/sfm_xrt_top.v",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/hdl/DPUCZDX8G_v3_3_0_vl_sfm.sv",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/xci/sfm/fp_acc/fp_acc.xci",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/xci/sfm/fp_add/fp_add.xci",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/xci/sfm/fp_convert/fp_convert.xci",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/xci/sfm/fp_div/fp_div.xci",
          "src/dpu_ip/DPUCZDX8G_v3_3_0/xci/sfm/fp_exp/fp_exp.xci"
        ]
      }
    ],
    "name": "dpu",
    "ldclflags" : "--config PROJECT/src/prj/Vitis/config_file/prj_config_gui"
  }
],
I have reported this issue before but the fix hasn't been applied yet. Sorry for this gap in the tutorial.
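To see which platform names a local copy of description.json actually carries link settings for, a small script can list every entry under platform_properties together with the ldclflags it defines. This is only a sketch. It assumes platform_properties is a JSON object keyed by platform name, as the explanation above suggests, and it searches each entry recursively because the exact nesting is not shown in this thread. The helper names are hypothetical.

```python
import json


def load_description(path):
    """Read a description.json file into a Python dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_ldclflags(node):
    """Collect every "ldclflags" value found anywhere below node."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "ldclflags":
                found.append(value)
            else:
                found.extend(find_ldclflags(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_ldclflags(item))
    return found


def platforms_with_link_config(description):
    """Map each key under "platform_properties" to the ldclflags it defines."""
    properties = description.get("platform_properties", {})
    return {name: find_ldclflags(entry) for name, entry in properties.items()}


def has_link_config(description, platform_name):
    """True if platform_name has its own entry that sets ldclflags.

    Exact key comparison is an assumption: the thread does not say how
    Vitis matches the platform name against these keys.
    """
    return bool(platforms_with_link_config(description).get(platform_name))
```

If `has_link_config(load_description("description.json"), "zcu104_custom")` returns False for a custom platform name (here the hypothetical `zcu104_custom`), that is consistent with the hw_link configuration not being loaded for it.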
Hi @imrickysu,
Thanks for the answer, I really appreciate your quick replies to my comments. Yes, today after I posted the last comment, I searched the files and found the thing you mentioned in the .json file. I was really shocked by the fact that the project's behavior depends on the platform's name. Please at least mention this in the tutorial.
But even with zcu104_base in the platform's name, the ResNet application from the project on my custom platform is not working.
root@zcu104_custom_plnx:~# env LD_LIBRARY_PATH=samples/lib XLNX_VART_FIRMWARE=/media/sd-mmcblk0p1/dpu.xclbin ./dpu bellpeppe-994958.JPEG
score[37] = 0.0396331 text: box turtle, box tortoise,
score[117] = 0.0396331 text: chambered nautilus, pearly nautilus, nautilus,
score[121] = 0.0396331 text: king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica,
score[149] = 0.0396331 text: dugong, Dugong dugon,
score[85] = 0.0396331 text: quail,
I'm really frustrated at this point, after running this tutorial several times under many different conditions, so please tell me if you have any idea why the model may not work, or a way to debug the DPU cores' behavior. Should I change anything in the PetaLinux project or in Vitis? The worst part is that the Vitis project builds without any errors, giving no clues about where the issue may be.
Obviously this tutorial has not been tested on the current version of Vitis and Vitis AI, so please test it, find the issues and update the tutorial.
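The output above has a recognizable signature: all five reported classes share exactly the same score (0.0396331). A correct classification run would normally rank one class clearly above the others, so identical top scores are a useful symptom to test for when repeating the flow under different conditions. The sketch below parses the demo's printed lines and flags that pattern. The line format is taken from the output quoted above; the function names and the tolerance are assumptions, and the check says nothing about why the scores are flat.

```python
import re

# Matches lines such as: score[37] = 0.0396331 text: box turtle, box tortoise,
SCORE_LINE = re.compile(r"score\[(\d+)\]\s*=\s*([0-9.eE+-]+)\s+text:\s*(.*)")


def parse_scores(output):
    """Extract (class_index, score, label) tuples from the demo's printed output."""
    results = []
    for line in output.splitlines():
        match = SCORE_LINE.search(line)
        if match:
            index, score, label = match.groups()
            results.append((int(index), float(score), label.strip().rstrip(",")))
    return results


def looks_degenerate(results, tolerance=1e-9):
    """True when at least two scores were reported and they are all equal."""
    scores = [score for _, score, _ in results]
    return len(scores) > 1 and max(scores) - min(scores) <= tolerance
```

Feeding the captured console text to `parse_scores` and then `looks_degenerate` gives a quick pass/fail signal that can be compared between the GUI flow and the Makefile flow.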
Hi @Ali-Flt , I reran the VAI test for 2020.2. It worked well on my side.
Could you try to create the Vitis-AI application with the platform generated by the Makefile? You can run make all and generate the platform.
For the configuration setting issue we discussed above, Step 5 of the tutorial (Update system_hw_link for proper kernel instantiation) takes this into account and provides a method to overcome the description.json setting being specific to the platform name.
Hi @imrickysu , Thanks for going through the tutorial for verification. I used the Makefile as you explained on Vitis 2021.2 to run all steps and the resnet app is working successfully now:
So after that I went through the make scripts in each step and looked for differences from the tutorial. Please read the differences I found and update the tutorial, because one of them is probably the reason the platform was not working.
Step 1:
The clock IDs were not changed to start from 0 as mentioned in the tutorial:
In exporting the XSA, the platform name was in uppercase:
The locked signal of the clocking wizard was not connected to the processing system reset IP cores as mentioned in the tutorial.
I couldn't find any other differences but I may have missed something. I also didn't check the PS's configurations for any mismatch.
Step 2:
line 42: echo 'CONFIG_YOCTO_MACHINE_NAME="zcu104-zynqmp"' >> $(PETALINUX_CONFIG)
line 44: echo "CONFIG_YOCTO_BUILDTOOLS_EXTENDED=y" >> $(PETALINUX_CONFIG)
line 76: cd $(PETALINUX_DIR) && petalinux-package --boot --u-boot
line 80: cd $(PETALINUX_DIR) && petalinux-package --sysroot
(Note that the rootfs configs were different too but I believe the problem is not hidden in the rootfs because I ran the test with my own generated rootfs without issues so I didn't mention the rootfs differences.)
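To check whether a GUI-configured PetaLinux project ended up with the same values the Makefile appends, the project's config file can be compared against the expected settings. The sketch below is one way to do that. The two expected entries are copied from the Makefile lines quoted above, while the function names and the location of your config file are assumptions.

```python
def parse_config(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    Surrounding double quotes are stripped from values.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def config_mismatches(text, expected):
    """Return {key: (expected_value, actual_value_or_None)} for differing settings."""
    actual = parse_config(text)
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


# The two settings the Makefile appends, as quoted in the list above.
MAKEFILE_SETTINGS = {
    "CONFIG_YOCTO_MACHINE_NAME": "zcu104-zynqmp",
    "CONFIG_YOCTO_BUILDTOOLS_EXTENDED": "y",
}
```

Calling `config_mismatches(open(path).read(), MAKEFILE_SETTINGS)` on the project's config file returns an empty dictionary when both settings match. A value of `None` means the option is unset or commented out.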
Step 3: The platform is generated with this script so I don't exactly know the differences with the GUI Flow, but I think the main one is that the domain name in the script is set to "xrt" but in the GUI flow it is "linux on psu_cortexa53".
I did the last step (running the Vitis AI demo) in the GUI exactly as before, so either the error lies in the things I mentioned above, or the Vivado/Vitis GUI flow does not behave as expected and differs from the Vivado/xsct script flow.
Thanks again for solving the issue for me by your suggestion and I hope this info helps in finding and fixing the issue in the tutorial.
Hi, I've gone through this tutorial with Vitis 2020.2 and Vitis AI v1.3 : https://github.com/Xilinx/Vitis-Tutorials/tree/2020.2/Vitis_Platform_Creation/Introduction/02-Edge-AI-ZCU104
With some slight differences:
git clone https://github.com/Xilinx/Vitis-AI.git
git checkout v1.3
And added the repo to Vitis like this:
Every other step and instruction was followed without error.
But when I run the Vitis-AI demo on the bell pepper image, I get this for the first run:
Notice there are some errors whose cause I don't know:
And I get these results for the runs after the first one:
As you can see the app runs without errors but the predictions are not correct at all. Any ideas why this could happen or how I can debug this?
Thanks