Xilinx / Vitis_Libraries

Vitis Libraries
https://docs.xilinx.com/r/en-US/Vitis_Libraries
Apache License 2.0

BLAS multikernel benchmark example doesn't work on U250 platform #193

Open fiannone opened 10 months ago

fiannone commented 10 months ago

I'd like to inform you that the Vitis Libraries GEMM multi-kernel example fails with both 4 and 2 kernels when targeting the Alveo U250 board, compiled with Vitis 2022.2 and 2023.1 (platform xilinx_u250_gen3x16_xdma_4_1_202210_1). The same example works with the hw_emu target, whereas the hw target fails with the following error:

[21:18:39] Run vpl: FINISHED. Run Status: impl ERROR

===>The following messages were generated while Compiling (synthesis checkpoint) kernel/IP: ulp_m01_regslice_3 Log file: /afs/enea.it/fra/user/iannone/.Xilinx/Vitis/2022.2/Vitis_Libraries/BLAS_bench/L3/benchmarks/gemm/memKernel/_x_temp.hw.xilinx_u250_gen3x16_xdma_4_1_202210_1/link/vivado/vpl/prj/prj.runs/ulp_m01_regslice_3_synth_1/runme.log :
ERROR: [VPL 17-356] Failed to install all user apps.

===>The following messages were generated while processing /afs/enea.it/fra/user/iannone/.Xilinx/Vitis/2022.2/Vitis_Libraries/BLAS_bench/L3/benchmarks/gemm/memKernel/_x_temp.hw.xilinx_u250_gen3x16_xdma_4_1_202210_1/link/vivado/vpl/prj/prj.runs/impl_1 :
ERROR: [VPL 30-487] The packing of instances into a set of CLBs defined by a pblock constraint could not be obeyed. There are a total of 25680 CLBs in the pblock, of which 20686 CLBs are available, however, the unplaced instances require 23494 CLBs. The unavailable CLBs are either taken by placed instances or are blocked due to exclude placement constraints. Please analyze your design to determine if the pblock can be resized or the number of LUTs, FFs, and/or control sets can be reduced.

Number of control sets and instances constrained to the pBlock
Control sets: 2885
Luts: 232127 (combined) 265463 (total), available capacity: 205440
Flip flops: 365122, available capacity: 410880
NOTE: each CLB can only accommodate up to 4 unique control sets so FFs cannot be packed to fully fill every CLB

To attempt placement at higher effort levels at the expense of runtime, please use the following tcl command, setting the value of limit to 2000 or more.
set_param place.sliceLegEffortLimit limit
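To put the numbers in the message above in perspective, here is a minimal Python sketch that recomputes the shortfall from the reported figures. The variable names and the `utilization` helper are my own, introduced only for illustration; the values are copied from the log and nothing here queries Vivado or Vitis.

```python
# Figures copied from the VPL 30-487 message in the log above.
clbs_available = 20686   # CLBs still free inside the pblock
clbs_required = 23494    # CLBs needed by the unplaced instances
luts_combined = 232127   # LUT count after LUT combining
lut_capacity = 205440    # LUT capacity reported for the pblock
ffs_used = 365122        # flip-flops constrained to the pblock
ff_capacity = 410880     # flip-flop capacity reported for the pblock


def utilization(used, capacity):
    """Return resource usage as a percentage of capacity (illustrative helper)."""
    return 100.0 * used / capacity


clb_shortfall = clbs_required - clbs_available
lut_overflow = luts_combined - lut_capacity

print(f"CLB shortfall: {clb_shortfall}")
print(f"LUT overflow:  {lut_overflow}")
print(f"LUT usage:     {utilization(luts_combined, lut_capacity):.1f}% of capacity")
print(f"FF usage:      {utilization(ffs_used, ff_capacity):.1f}% of capacity")
```

By these figures the combined LUT count alone exceeds the reported capacity while the flip-flops still fit, which suggests (though I have not verified it) that a higher placement effort limit by itself may not be enough and that the design or the pblock would also need to change.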

My feeling is that this is very serious, as an example developed by Xilinx developers doesn't work. It would be better to remove it from the GitHub repository to avoid wasting time. Furthermore, I'd like to use the Tcl command to set a new limit for place.sliceLegEffortLimit, but I don't know how to open a tclsh console from the makefile parameters of the GEMM Bias Vitis library example.

Finally, I'd like to inform you that my colleague Paolo and I are involved in the EuroHPC TEXTAROSSA project, and we are going to deliver a report in which we'll inform the EuroHPC developer community about the poor support from AMD Xilinx for an AMD-developed example that doesn't work.

afzalxo commented 8 months ago

I would like to agree with the gentleman above.