rapidstream-org / rapidstream-tapa

RapidStream TAPA compiles task-parallel HLS programs into high-frequency FPGA accelerators.
https://tapa.rtfd.io
MIT License

No such option: --run-floorplanning #121

Closed enes1994 closed 2 months ago

enes1994 commented 1 year ago

Environment

I wanted to use the floorplan option for my problem. I am using CMake as the build system, and there I have this code:

`cmake_minimum_required(VERSION 3.14)

project(GeMV)

set(PLATFORM xilinx_u280_xdma_201920_3 CACHE STRING "Target FPGA platform")

set(TOP GeMV)

set(CMAKE_CXX_FLAGS "${CMAKE_C_FLAGS} -Wno-write-strings")

find_package(gflags REQUIRED)

include(${CMAKE_CURRENT_SOURCE_DIR}/tapa/cmake/apps.cmake)

include(${CMAKE_CURRENT_SOURCE_DIR}/tapa/cmake/TAPACCConfig.cmake)
include(${CMAKE_CURRENT_SOURCE_DIR}/tapa/cmake/FindSphinx.cmake)

add_subdirectory(src)

find_package(TAPA REQUIRED)
find_package(SDx REQUIRED)

add_executable(GeMV)
target_sources(GeMV PRIVATE host.cpp vadd.cpp)

target_link_libraries(GeMV PRIVATE ${TAPA} gflags src)
target_link_libraries(GeMV PRIVATE tapa::tapa)

add_tapa_target(
  vadd-hw-xo
  --run-floorplanning
  --enable-hbm-binding-adjustment
  --floorplan-opt-priority SLR_CROSSING_PRIORITIZED
  INPUT vadd.cpp
  TOP Vadd
  CONNECTIVITY ${CMAKE_CURRENT_SOURCE_DIR}/connectivity.ini
  CONSTRAINT ${CMAKE_CURRENT_BINARY_DIR}/constraint.tcl
  --read-only-args A0 --read-only-args A1 --read-only-args A2 --read-only-args A3
  --read-only-args A4 --read-only-args A5 --read-only-args A6 --read-only-args A7
  --read-only-args A8 --read-only-args A9 --read-only-args A10 --read-only-args A11
  --read-only-args A12 --read-only-args A13 --read-only-args A14 --read-only-args A15
  --read-only-args A16 --read-only-args A17 --read-only-args A18 --read-only-args A19
  --read-only-args A20 --read-only-args A21 --read-only-args A22 --read-only-args A23
  --read-only-args A24 --read-only-args A25 --read-only-args A26 --read-only-args A27
  --read-only-args x
  --write-only-args y
  PLATFORM ${PLATFORM})

add_xocc_hw_link_targets(
  ${CMAKE_CURRENT_BINARY_DIR}
  --config=${CMAKE_CURRENT_SOURCE_DIR}/connectivity.ini
  --vivado.synth.jobs 8
  --vivado.prop=run.impl_1.STEPS.PHYS_OPT_DESIGN.IS_ENABLED=1
  --vivado.prop=run.impl_1.STEPS.OPT_DESIGN.ARGS.DIRECTIVE=Explore
  --vivado.prop run.impl_1.STEPS.PLACE_DESIGN.ARGS.DIRECTIVE=EarlyBlockPlacement
  --vivado.prop=run.impl_1.STEPS.PHYS_OPT_DESIGN.ARGS.DIRECTIVE=Explore
  --vivado.prop run.impl_1.STEPS.ROUTE_DESIGN.ARGS.DIRECTIVE=Explore
  --vivado.prop=run.impl_1.STEPS.OPT_DESIGN.TCL.PRE=${CMAKE_CURRENT_BINARY_DIR}/constraint.tcl
  INPUT vadd-hw-xo
  HW_EMU_XCLBIN hw_emu_xclbin
  HW_XCLBIN hw_xclbin)

add_custom_target(
  vadd-cosim
  COMMAND $ --bitstream=$<TARGET_PROPERTY:${hw_emu_xclbin},FILE_NAME>
  DEPENDS GeMV ${hw_emu_xclbin}
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

add_custom_target(
  vadd-hw
  COMMAND $ --bitstream=$<TARGET_PROPERTY:${hw_xclbin},FILE_NAME>
  DEPENDS GeMV ${hw_xclbin}
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})`

Since I couldn't find the add_tapa_target custom function, I followed the way it is used in Serpens and added "--run-floorplanning" the same way, but I get this error:

`make vadd-hw
Scanning dependencies of target vadd-hw-xo
[ 8%] Generating Vadd.xilinx_u280_xdma_201920_3.hw.xo
I1112 14:55:24.069 tapa.util:162] logging level set to INFO
I1112 14:55:24.070 tapa.tapa:54] tapa version: 0.0.20220807.1
I1112 14:55:24.070 tapa.tapa:58] Python recursion limit set to 3000
Usage: tapa pack [OPTIONS]
Try 'tapa pack --help' for help.

Error: No such option: --run-floorplanning
make[3]: [CMakeFiles/vadd-hw-xo.dir/build.make:61: Vadd.xilinx_u280_xdma_201920_3.hw.xo] Error 2
make[2]: [CMakeFiles/Makefile2:211: CMakeFiles/vadd-hw-xo.dir/all] Error 2
make[1]: [CMakeFiles/Makefile2:191: CMakeFiles/vadd-hw.dir/rule] Error 2
make: [Makefile:157: vadd-hw] Error 2`

This is actually a simple matrix-vector multiplication case. I split the matrix across 28 HBM channels and, as you can see, I want to make them read-only arguments.

Am I missing something?

Licheng-Guo commented 1 year ago

Can you try this example and see if it works? https://github.com/UCLA-VAST/tapa/blob/release/apps/bandwidth/run_tapa.sh

enes1994 commented 1 year ago

After I run the run_tapa.sh here is the output:

./run_tapa.sh
I1112 18:36:14.629 tapa.util:162] logging level set to INFO
I1112 18:36:14.630 tapa.tapac:482] tapa version: 0.0.20220807.1
I1112 18:36:14.630 tapa.tapac:486] Python recursion limit set to 3000
I1112 18:36:14.630 tapa.tapac:406] Executing all steps of tapac
W1112 18:36:14.630 tapa.tapac:428] The floorplan option is automatically enabled because a floorplan output file is provided
I1112 18:36:14.701 tapa.tapac:541] added vendor include path `/mnt/data/tools/Xilinx/Vitis_HLS/2021.1/include`
I1112 18:36:16.186 tapa.core:175] extracting HLS C++ files
I1112 18:36:16.188 tapa.core:197] running HLS
I1112 18:36:16.192 tapa.core:233] spawn 256 workers for parallel HLS synthesis of the tasks
I1112 18:36:31.239 tapa.core:248] extracting RTL files
I1112 18:36:31.246 tapa.core:279] parsing RTL files and populating tasks
I1112 18:36:32.334 tapa.core:296] instrumenting upper-level RTL
I1112 18:36:32.334 tapa.core:331] Running floorplanning
W1112 18:36:32.335 tapa.task_graph:75] chan_0 is assumed to be both read from and written to. If not, please use --read-only-args or --write-only-args for better optimization results.
W1112 18:36:32.335 tapa.task_graph:75] chan_1 is assumed to be both read from and written to. If not, please use --read-only-args or --write-only-args for better optimization results.
W1112 18:36:32.335 tapa.task_graph:75] chan_2 is assumed to be both read from and written to. If not, please use --read-only-args or --write-only-args for better optimization results.
W1112 18:36:32.335 tapa.task_graph:75] chan_3 is assumed to be both read from and written to. If not, please use --read-only-args or --write-only-args for better optimization results.

Starting AutoBridge

Version: 0.0.20220512.dev1

Running details logged to /mnt/data/work/sti/benchmark/xilinx/mv_pc28_u280/tapa/apps/bandwidth/run/autobridge/autobridge-Nov-12-2022-18:36.log

*** CRITICAL WARNING: Gurobi solver not detected. Floorplanning may take extra time. The Gurobi solver is much faster than the open-source solver, and it is free for academia.

Generate task graph visualization in graphviz format: /mnt/data/work/sti/benchmark/xilinx/mv_pc28_u280/tapa/apps/bandwidth/run/autobridge/task_graph.dot

Floorplan parameters:

floorplan_strategy: HALF_SLR_LEVEL_FLOORPLANNING
threshold for switching to iterative partitioning: 200
floorplan_opt_priority: AREA_PRIORITIZED
min_area_limit: 0.650000
max_area_limit: 0.850000
min_slr_width_limit: 10000
max_slr_width_limit: 15000
max_search_time per solving: 600

Start floorplanning, please check the log for the progress...

The total area of the design:
BRAM: 0 / 4032 = 0.0%
DSP: 0 / 10368 = 0.0%
FF: 4864 / 2730240 = 0.2%
LUT: 13340 / 1365120 = 1.0%
URAM: 0 / 1088 = 0.0%

total wire length: 0
SLR boundary 0 - 1 has 0 crossings
SLR boundary 1 - 2 has 0 crossings
SLR boundary 2 - 3 has 0 crossings
Orbit 0: 0 1 2 3
Orbit 1: 4 5 6 7
Orbit 2: 8 9 10 11
Floorplan finishes

+----------------------+----------+---------+--------+---------+----------+
| Slot Name            | BRAM (%) | DSP (%) | FF (%) | LUT (%) | URAM (%) |
+----------------------+----------+---------+--------+---------+----------+
| CR_X4Y0_To_CR_X7Y3   | 0.0      | 0.0     | 0.5    | 2.7     | 0.0      |
| CR_X4Y4_To_CR_X7Y7   | 0.0      | 0.0     | 0.5    | 2.7     | 0.0      |
| CR_X4Y8_To_CR_X7Y11  | 0.0      | 0.0     | 0.5    | 2.7     | 0.0      |
| CR_X4Y12_To_CR_X7Y15 | 0.0      | 0.0     | 0.5    | 2.7     | 0.0      |
+----------------------+----------+---------+--------+---------+----------+

The device could be partitioned into 4 slots.

The number of wires between slots are:

AutoBridge Finishes

I1112 18:36:33.563 tapa.floorplan:77] generate the floorplan constraint at run/Bandwidth_floorplan.tcl
I1112 18:36:33.563 tapa.core:422] top task register level set to 4
I1112 18:36:33.563 tapa.core:426] instrumenting top-level RTL
I1112 18:36:33.648 tapa.core:430] generating report
I1112 18:36:33.650 tapa.core:438] writing generated auxiliary RTL files
I1112 18:36:33.650 tapa.core:446] packaging RTL code
I1112 18:36:59.918 tapa.tapac:669] generate the v++ xo file at run/Bandwidth.xo
I1112 18:36:59.918 tapa.bitstream:73] use the original connectivity configuration at /mnt/data/work/sti/benchmark/xilinx/mv_pc28_u280/tapa/apps/bandwidth/link_config.ini in the v++ script
I1112 18:36:59.918 tapa.tapac:687] generate the v++ script at run/Bandwidth_generate_bitstream.sh

But when I tried to run it with cmake I got this error:

ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[2]
ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[3]
ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[2]
ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[3]
ERROR: [CFGEN 83-2297] Please consult platforminfo for sptag information
ERROR: [CFGEN 83-2298] Exiting due to previous error
ERROR: [SYSTEM_LINK 82-36] [18:33:18] cfgen failed

So should I not use CMake?

Licheng-Guo commented 1 year ago

If that example runs through, could you not use CMake and follow the example shell script? Thanks

enes1994 commented 1 year ago

If anything comes to mind that I should change in my CMake file for my problem, that would be great. It is a fairly big build system and I use many different packages.

enes1994 commented 1 year ago

> If that example runs through, could you not use CMake and follow the example shell script? Thanks

Btw, I was able to run the vadd example with the cmake command.

Blaok commented 1 year ago

> Since I couldn't find the add_tapa_target custom function, I followed the way it is used in Serpens and added "--run-floorplanning" the same way, but I get this error:

Old CMake files have been partially broken since 979e008b8d3ef220a7f3e00c9993e0b566123b1b due to the migration from the tapac CLI to the tapa CLI. For the new tapa CLI, the order of arguments is significant; any additional argument is applied to the tapa pack sub-command only.

A quick work-around is to move the list(APPEND tapa_cmd ${TAPA_UNPARSED_ARGUMENTS}) line right before list(APPEND tapa_cmd link) in /usr/lib/cmake/tapa/TAPACCConfig.cmake after installation.
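
To make the effect of that move concrete, here is a minimal sketch that can be run with `cmake -P`. Only `tapa_cmd`, `TAPA_UNPARSED_ARGUMENTS`, and the `link` and `pack` sub-commands come from this thread; the rest of the sub-command sequence and the sample flags are assumptions for illustration, not the actual contents of TAPACCConfig.cmake.

```cmake
# Sketch only. tapa_cmd and TAPA_UNPARSED_ARGUMENTS are the names mentioned in
# the comment above; the sub-command sequence below is an assumption.
cmake_minimum_required(VERSION 3.14)

# Stand-in for the extra flags a caller passes to add_tapa_target().
set(TAPA_UNPARSED_ARGUMENTS --read-only-args A0 --write-only-args y)

# Original order: extra flags are appended last, so they trail `pack`.
set(tapa_cmd tapa analyze synth optimize-floorplan link pack)
list(APPEND tapa_cmd ${TAPA_UNPARSED_ARGUMENTS})
list(JOIN tapa_cmd " " before)
message(STATUS "before: ${before}")

# Patched order: extra flags are appended right before `link`,
# so they attach to the sub-command that precedes it.
set(tapa_cmd tapa analyze synth optimize-floorplan)
list(APPEND tapa_cmd ${TAPA_UNPARSED_ARGUMENTS})
list(APPEND tapa_cmd link)
list(APPEND tapa_cmd pack)
list(JOIN tapa_cmd " " after)
message(STATUS "after:  ${after}")
```

Comparing the two printed command lines shows why the original order makes `tapa pack` reject flags that were meant for an earlier step.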

Longer term, we should add ANALYZE_ARGS, COMPILE_ARGS, DSE_FLOORPLAN_ARGS, LINK_ARGS, OPTIMIZE_FLOORPLAN_ARGS, PACK_ARGS, and SYNTH_ARGS to the add_tapa_target function.
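
A rough sketch of what such per-step keywords could look like, using CMake's built-in `cmake_parse_arguments`. The function name `sketch_tapa_target` is hypothetical, only a subset of the proposed keywords is shown, and the assumption that `--read-only-args` belongs to the optimize-floorplan step is mine; this is not the actual add_tapa_target implementation.

```cmake
# Sketch only: sketch_tapa_target is a hypothetical stand-in, not part of TAPA.
cmake_minimum_required(VERSION 3.14)

function(sketch_tapa_target target)
  # Each keyword collects the flags for one sub-command.
  set(multi_value_keywords
      ANALYZE_ARGS SYNTH_ARGS OPTIMIZE_FLOORPLAN_ARGS LINK_ARGS PACK_ARGS)
  cmake_parse_arguments(TAPA "" "" "${multi_value_keywords}" ${ARGN})

  # Place each group right after its sub-command so argument order is correct.
  # Groups that were not supplied expand to nothing.
  set(tapa_cmd tapa
      analyze ${TAPA_ANALYZE_ARGS}
      synth ${TAPA_SYNTH_ARGS}
      optimize-floorplan ${TAPA_OPTIMIZE_FLOORPLAN_ARGS}
      link ${TAPA_LINK_ARGS}
      pack ${TAPA_PACK_ARGS})

  list(JOIN tapa_cmd " " tapa_cmd_str)
  message(STATUS "${target}: ${tapa_cmd_str}")
endfunction()

# Assumption: these flags are consumed by the optimize-floorplan step.
sketch_tapa_target(vadd-hw-xo
  OPTIMIZE_FLOORPLAN_ARGS --read-only-args A0 --write-only-args y)
```

With explicit keywords the caller decides which sub-command receives each flag, instead of relying on where unparsed arguments happen to be appended.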

Blaok commented 1 year ago

> ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[2]
> ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[3]
> ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[2]
> ERROR: [CFGEN 83-2287] --sp tag applied with an invalid sp tag: DDR[3]
> ERROR: [CFGEN 83-2297] Please consult platforminfo for sptag information
> ERROR: [CFGEN 83-2298] Exiting due to previous error
> ERROR: [SYSTEM_LINK 82-36] [18:33:18] cfgen failed

This is because the CMakeLists.txt was written for U250, which has 4 DDR channels, but you are running it for U280, which only has 2 DDR channels. You can replace DDR with HBM and try again.

enes1994 commented 1 year ago

Hi @Blaok, thanks for the helpful answers. I replaced the DDRs with HBMs in the CMake file as follows:

--connectivity.sp=Bandwidth.m_axi_chan_0:HBM[0]
--connectivity.sp=Bandwidth.m_axi_chan_1:HBM[1]
--connectivity.sp=Bandwidth.m_axi_chan_2:HBM[2]
--connectivity.sp=Bandwidth.m_axi_chan_3:HBM[3]

but now I received an error saying:

ERROR: [VPL 101-2] ERROR: [Vivado 12-1041] No pblock specified to add instances to.

And for my own CMake file (identical to the one in my first comment), I still get the same error, and it still calls the tapa pack sub-command, even after I moved the list(APPEND tapa_cmd ${TAPA_UNPARSED_ARGUMENTS}) line right before list(APPEND tapa_cmd link) in /usr/lib/cmake/tapa/TAPACCConfig.cmake.

But when I was checking the TAPACCConfig.cmake file, I saw that it checks whether the TAPA_CONNECTIVITY and TAPA_CONSTRAINT arguments are specified, which I do in my CMake file by passing these arguments to the add_tapa_target custom function:

CONNECTIVITY ${CMAKE_CURRENT_SOURCE_DIR}/connectivity.ini
  CONSTRAINT ${CMAKE_CURRENT_BINARY_DIR}/constraint.tcl

So floorplanning will be automatically set and AutoBridge will run? So this means I don't have to pass the --run-floorplanning flag. Right?

Blaok commented 1 year ago

> now I received an error saying: ERROR: [VPL 101-2] ERROR: [Vivado 12-1041] No pblock specified to add instances to.

Looks like the CMakeLists.txt for apps/bandwidth is outdated. The connectivity is already specified in link_config.ini, and the --connectivity.sp arguments should be removed from CMakeLists.txt.

> So floorplanning will be automatically set and AutoBridge will run? So this means I don't have to pass the --run-floorplanning flag. Right?

That’s exactly right. Floorplanning is enabled if CONNECTIVITY and CONSTRAINT are supplied. This hasn’t changed since AutoBridge was first integrated.
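
As a rough illustration of that rule (not the actual TAPACCConfig.cmake code), the check can be sketched as follows. The function name `sketch_floorplan_check` and the printed messages are hypothetical; only the `CONNECTIVITY` and `CONSTRAINT` keywords and the resulting `TAPA_CONNECTIVITY` / `TAPA_CONSTRAINT` variables come from this thread.

```cmake
# Sketch only: sketch_floorplan_check is a hypothetical helper, not part of TAPA.
cmake_minimum_required(VERSION 3.14)

function(sketch_floorplan_check)
  # One-value keywords become TAPA_CONNECTIVITY and TAPA_CONSTRAINT;
  # a keyword that is not passed leaves its variable undefined.
  cmake_parse_arguments(TAPA "" "CONNECTIVITY;CONSTRAINT" "" ${ARGN})

  if(TAPA_CONNECTIVITY AND TAPA_CONSTRAINT)
    message(STATUS "floorplanning enabled, constraint output: ${TAPA_CONSTRAINT}")
  else()
    message(STATUS "floorplanning not enabled")
  endif()
endfunction()

# Both supplied: floorplanning would be enabled.
sketch_floorplan_check(CONNECTIVITY connectivity.ini CONSTRAINT constraint.tcl)

# CONSTRAINT missing: floorplanning would not be enabled.
sketch_floorplan_check(CONNECTIVITY connectivity.ini)
```

Running this with `cmake -P` shows the two branches; no explicit --run-floorplanning flag takes part in the decision.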

enes1994 commented 1 year ago

Thanks a lot for the answers. They are really helpful. Just out of curiosity, is there any reason to use --run-floorplanning in Serpens's CMake file?

Blaok commented 1 year ago

> Is there any reason to use --run-floorplanning in Serpens's CMake file?

I think it was required for some versions of TAPA, when CMake was still using the tapac CLI, where --run-floorplanning is required to include floorplanning as one of the steps TAPA runs. There was some debate regarding whether --run-floorplanning should be explicitly required if a floorplan output is supplied; the decision was to make it a required argument so users acknowledge that they are running floorplanning. This argument is not applicable in the new tapa CLI, where the optimize-floorplan sub-command is its equivalent.

@linghaosong please correct me if I'm wrong :)

linghaosong commented 1 year ago

> > Is there any reason to use --run-floorplanning in Serpens's CMake file?
>
> I think it was required for some versions of TAPA, when CMake was still using the tapac CLI, where --run-floorplanning is required to include floorplanning as one of the steps TAPA runs. There was some debate regarding whether --run-floorplanning should be explicitly required if a floorplan output is supplied; the decision was to make it a required argument so users acknowledge that they are running floorplanning. This argument is not applicable in the new tapa CLI, where the optimize-floorplan sub-command is its equivalent.
>
> @linghaosong please correct me if I'm wrong :)

Right. We used --run-floorplanning in the early version, but in the latest version we don't need it anymore.

@enes1994 Below you can find the current options we pass to TAPA: https://github.com/linghaosong/Serpens/blob/main/run_tapa.sh

enes1994 commented 1 year ago

Thank you very much to all of you. You are doing a great job here by helping the community with really helpful tools.