Xilinx / Vitis-Tutorials

Vitis In-Depth Tutorials
https://Xilinx.github.io/Vitis-Tutorials/
MIT License

06-fft2d_AIEvsHLS/AIE - Failure to Meet Timing As Specified in Documentation #378

Open mengstro opened 1 year ago

mengstro commented 1 year ago

Path to the Tutorial: AI_Engine_Development/Design_Tutorials/06-fft2d_AIEvsHLS/AIE

Hi, everyone! I'm having trouble achieving the same timing figures for the "cint16" datatype configuration (table for reference here). I've had no issues reproducing the "cfloat" latencies, but for some reason the "cint16" results match the "cfloat" results. For example, we were able to show that the latency per 2D FFT operation for the "cfloat" design was around 138μs (the table shows ~142μs, which is close enough), but rerunning the makefile flow for the "cint16" design produced essentially the same latency of 138μs (the corresponding table shows that we should have expected ~50μs).

These designs were deployed to real hardware rather than run in hardware emulation. I did see the "Known Issues" passage addressing a "scaling issue" associated with the timestamp data, but assuming the scaling remains constant, we would still have expected a ~50% reduction in latency from the "cfloat" design to the "cint16" one.
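
As a rough sanity check of that reasoning, the sketch below uses only the latencies quoted above. The scale factor `k` is a made-up value for illustration. Any constant timestamp scaling multiplies both latencies and therefore cancels out of the relative reduction. With the table figures, the expected reduction works out to roughly 65%, while the measured figures give none.

```python
def latency_reduction(baseline_us: float, candidate_us: float) -> float:
    """Fractional latency reduction of the candidate relative to the baseline."""
    if baseline_us <= 0:
        raise ValueError("baseline latency must be positive")
    return 1.0 - candidate_us / baseline_us


# Figures quoted in this issue (microseconds per 2D FFT operation).
documented = {"cfloat": 142.0, "cint16": 50.0}
measured = {"cfloat": 138.0, "cint16": 138.0}

expected = latency_reduction(documented["cfloat"], documented["cint16"])
observed = latency_reduction(measured["cfloat"], measured["cint16"])

# Hypothetical constant timestamp scale factor: it multiplies both latencies,
# so it cancels in the ratio and leaves the reduction unchanged.
k = 3.0
scaled = latency_reduction(k * documented["cfloat"], k * documented["cint16"])

print(f"expected reduction from the table: {expected:.1%}")
print(f"observed reduction on hardware:    {observed:.1%}")
print(f"expected reduction with scale k:   {scaled:.1%}")
```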

Now, I did check that the design really was set to "cint16", and indeed it was: all design files were compiled with the datatype variable set to "cint16". I did notice that the datamover's source code had some if/else statements that assumed the datatype variable in the makefile was either a zero or a one, whereas all other source code files assumed it was either the word "cfloat" or "cint16". However, I don't think this is what caused our timing issue.
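
One way to audit that mismatch is sketched below: a small helper that reports whether a file tests `FFT_2D_DT` against 0/1 or against the names "cfloat"/"cint16". The two sample snippets are hypothetical stand-ins, not lines copied from the tutorial sources, and the matching rules are an assumption about how such checks might be written.

```python
import re


def dt_conventions(source_text: str) -> set:
    """Return which conventions ("numeric", "named") a text uses for FFT_2D_DT."""
    found = set()
    for line in source_text.splitlines():
        if "FFT_2D_DT" not in line:
            continue
        # Drop the variable name itself so its embedded "2" is never matched.
        rest = line.replace("FFT_2D_DT", " ")
        if re.search(r"\b[01]\b", rest):
            found.add("numeric")
        if re.search(r"\b(cfloat|cint16)\b", rest):
            found.add("named")
    return found


# Hypothetical examples of the two styles described above.
datamover_like = "#if FFT_2D_DT == 0\n// ...\n#elif FFT_2D_DT == 1\n// ...\n#endif"
other_file_like = "ifeq ($(FFT_2D_DT), cint16)\n# ...\nendif"

print(dt_conventions(datamover_like))
print(dt_conventions(other_file_like))
```

If the numeric checks turn out to be C preprocessor `#if` directives (an assumption), it is worth knowing that an identifier that is not a defined macro evaluates to 0 in such an expression, so a name like cint16 passed with `-D` would silently match the 0 branch.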

Just to go over how I produced my timing results, here is a step-wise set of procedures (assuming all required software was properly installed, like Vitis/Vivado, XRT, etc.):

  1. Set values for makefile recipe variables: TARGET=hw FFT_2D_DT=cint16 FFT_2D_PT=256 FFT_2D_INSTS=1 ITER_CNT=16 EN_TRACE=1 LAUNCH_HW_EMU_EXEC=1 (in case I wanted to run a hardware emulation)
  2. Changed PLATFORM to point to my ADM-PA100 platform (confirmed working) and manually set XLNX_VERSAL to point to my custom petalinux installation path
  3. Ran the makefile for "sd_card" recipe and waited for it to finish
  4. Burned the SD card image onto a card and put it in the ADM-PA100
  5. Rebooted the ADM-PA100 and connected via serial to view the boot process and autologin
  6. Executed the following in the command line:
       mkdir /mnt/sd_mmcblk0p1
       mount /dev/mmcblk0p1 /mnt/sd_mmcblk0p1
       cd /mnt/sd_mmcblk0p1
       ./run_script.sh
  7. Ejected the card and opened the resulting "xrt.run_summary" file in the Vitis Analyzer
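
For anyone trying to reproduce steps 1 and 3, the build invocation can be sketched as a single command line. This assumes the variables are passed on the make command line rather than edited in the Makefile. PLATFORM and XLNX_VERSAL are left out because their paths are specific to my setup. The helper name `build_make_command` is made up for illustration.

```python
import shlex


def build_make_command(recipe: str, variables: dict) -> list:
    """Compose a make invocation with VAR=value overrides on the command line."""
    return ["make", recipe] + [f"{name}={value}" for name, value in variables.items()]


# Values from step 1; PLATFORM and XLNX_VERSAL are site-specific and omitted.
variables = {
    "TARGET": "hw",
    "FFT_2D_DT": "cint16",
    "FFT_2D_PT": 256,
    "FFT_2D_INSTS": 1,
    "ITER_CNT": 16,
    "EN_TRACE": 1,
    "LAUNCH_HW_EMU_EXEC": 1,
}

cmd = build_make_command("sd_card", variables)

# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(cmd))
```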

After going through the above procedure, I got the following timeline trace from the run summary file: [Screenshot from 2023-03-30 10-55-57]

Here are my work environment OS/program/devices:

Any help would be greatly appreciated - please let me know if more info is needed!

imrickysu commented 1 year ago

Hi @SURUTHI1605, could you help with this question?

SURUTHI1605 commented 1 year ago

Hi @imrickysu, I'll check the files and come back.

SURUTHI1605 commented 1 year ago

Please share the following information:

  1. Which Vitis version are you using?
  2. Which GitHub branch have you cloned to reproduce the results?

mengstro commented 1 year ago

Hi @SURUTHI1605,

I've been using Vitis 2022.1. As for the branch, I've been using the 2022.1 version. I also tried the 2022.2 version with Vitis 2022.2 but got the same result. The DSP library that I used for building the applications was the 2022.1 branch.

All the best, Matt

SURUTHI1605 commented 1 year ago

Changes have been pushed in https://github.com/Xilinx/Vitis-Tutorials/commit/366ebec74365295b8dbdbc6a9ad595a1fa3e3077. Please take a look.