rachelselinar / DREAMPlaceFPGA

An Open-Source Analytical Placer for Large Scale Heterogeneous FPGAs using Deep-Learning Toolkit
BSD 3-Clause "New" or "Revised" License

Placement Runtime in DREAMPlaceFPGA #20

Closed Yk2Zcc closed 1 year ago

Yk2Zcc commented 1 year ago

I ran the ISPD'2016 FPGA01 benchmark on a Linux server with an Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz (8 cores). Why does GP take 17.94 seconds while L+D takes 52.141 seconds?

rachelselinar commented 1 year ago

Hi Kang, could you please share the run log? It is hard to tell which options you used to run the tool.

Yk2Zcc commented 1 year ago

Benchmark: ISPD'2016 FPGA03
CPU: Intel i9-10920X (24) @ 4.800GHz
GPU: NVIDIA TITAN Xp

Run with 24 threads:

[INFO   ] DREAMPlaceFPGA - Placement completed in 22.48 seconds
[INFO   ] DREAMPlaceFPGA - write placement solution to results/design/design.gp.pl took 0.567 seconds
[INFO   ] DREAMPlaceFPGA - Legalization and Detailed Placement run using elfPlace (CPU): ./thirdparty/elfPlace_LG_DP --aux benchmarks/ispd2016/FPGA03/design.aux --numThreads 24 --pl results/design/design_final.pl
[INF 2023-09-20 17:10:21    0.00 sec]  ----- Command-Line Options -----
[INF 2023-09-20 17:10:21    0.00 sec]  numThreads = 24
[INF 2023-09-20 17:10:21    0.00 sec]  --------------------------------
[INF 2023-09-20 17:10:21    0.00 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.aux
[INF 2023-09-20 17:10:21    0.00 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.lib
[INF 2023-09-20 17:10:21    0.00 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.scl
[INF 2023-09-20 17:10:21    0.01 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.nodes
[INF 2023-09-20 17:10:22    0.32 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.pl
[INF 2023-09-20 17:10:22    0.32 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.nets
[INF 2023-09-20 17:10:24    3.00 sec]  GP instance stddev = 2.05, trunc = 2.50
[INF 2023-09-20 17:10:24    3.00 sec]  Import placement from file gp.pl
[INF 2023-09-20 17:14:41  259.53 sec]  Export solution to file results/design/design_final.pl
[INFO   ] DREAMPlaceFPGA - Legalization and detailed placement completed in 260.486 seconds
[INFO   ] DREAMPlaceFPGA - Completed Placement in 283.556 seconds

Run with 12 threads:

[INFO   ] DREAMPlaceFPGA - Placement completed in 22.26 seconds
[INFO   ] DREAMPlaceFPGA - write placement solution to results/design/design.gp.pl took 0.595 seconds
[INFO   ] DREAMPlaceFPGA - Legalization and Detailed Placement run using elfPlace (CPU): ./thirdparty/elfPlace_LG_DP --aux benchmarks/ispd2016/FPGA03/design.aux --numThreads 12 --pl results/design/design_final.pl
[INF 2023-09-20 17:17:34    0.00 sec]  ----- Command-Line Options -----
[INF 2023-09-20 17:17:34    0.00 sec]  numThreads = 12
[INF 2023-09-20 17:17:34    0.00 sec]  --------------------------------
[INF 2023-09-20 17:17:34    0.00 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.aux
[INF 2023-09-20 17:17:34    0.00 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.lib
[INF 2023-09-20 17:17:34    0.00 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.scl
[INF 2023-09-20 17:17:35    0.02 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.nodes
[INF 2023-09-20 17:17:35    0.32 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.pl
[INF 2023-09-20 17:17:35    0.32 sec]  Parsing file benchmarks/ispd2016/FPGA03/design.nets
[INF 2023-09-20 17:17:37    3.00 sec]  GP instance stddev = 2.05, trunc = 2.50
[INF 2023-09-20 17:17:37    3.00 sec]  Import placement from file gp.pl
[INF 2023-09-20 17:22:01  266.03 sec]  Export solution to file results/design/design_final.pl
[INFO   ] DREAMPlaceFPGA - Legalization and detailed placement completed in 267.003 seconds
[INFO   ] DREAMPlaceFPGA - Completed Placement in 289.904 seconds
Yk2Zcc commented 1 year ago

Here is my FPGA03 JSON:

{
    "aux_input" : "benchmarks/ispd2016/FPGA03/design.aux",
    "gpu" : 1,
    "num_threads" : 12,
    "num_bins_x" : 512,
    "num_bins_y" : 512,
    "global_place_stages" : [
        {"num_bins_x" : 512, "num_bins_y" : 512, "iteration" : 2000, "learning_rate" : 0.01, "wirelength" : "weighted_average", "optimizer" : "nesterov"}
    ],
    "target_density" : 1.0,
    "density_weight" : 8e-5,
    "random_seed" : 1000,
    "scale_factor" : 1.0,
    "global_place_flag" : 1,
    "legalize_flag" : 0,
    "detailed_place_flag" : 0,
    "dtype" : "float32",
    "deterministic_flag" : 0
}
rachelselinar commented 1 year ago

When 'global_place_flag' is set to 1 and 'legalize_flag' is set to 0, global placement (GP) runs in DREAMPlaceFPGA, and the remaining stages, legalization (LG) and detailed placement (DP), run using the elfPlace binary included in the thirdparty folder. When LG and DP run using elfPlace, the runtime depends on the number of threads available on the machine, irrespective of the thread count set in the JSON file. Any other jobs running on the same machine will also affect the CPU runtime. Please try a controlled experiment with no other jobs running on the CPU to see the impact of the number of threads.
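
To quantify the effect of the thread count, the stage runtimes can be pulled out of the two logs posted above. The following is a minimal sketch and not part of DREAMPlaceFPGA: the helper name `stage_runtimes` is hypothetical, and the regular expressions assume the exact log wording shown in this thread.

```python
import re

# Patterns assume the log wording shown in this thread (case-sensitive on purpose,
# so the GP line and the LG+DP line are not confused with each other).
GP_RE = re.compile(r"Placement completed in ([\d.]+) seconds")
LGDP_RE = re.compile(
    r"Legalization and detailed placement completed in ([\d.]+) seconds"
)


def stage_runtimes(log_text):
    """Return (gp_seconds, lg_dp_seconds) parsed from a run log (hypothetical helper)."""
    gp = GP_RE.search(log_text)
    lg_dp = LGDP_RE.search(log_text)
    if gp is None or lg_dp is None:
        raise ValueError("expected runtime lines not found in log")
    return float(gp.group(1)), float(lg_dp.group(1))


# Runtime lines copied from the 24-thread and 12-thread logs above.
log_24 = """[INFO   ] DREAMPlaceFPGA - Placement completed in 22.48 seconds
[INFO   ] DREAMPlaceFPGA - Legalization and detailed placement completed in 260.486 seconds"""
log_12 = """[INFO   ] DREAMPlaceFPGA - Placement completed in 22.26 seconds
[INFO   ] DREAMPlaceFPGA - Legalization and detailed placement completed in 267.003 seconds"""

gp24, lgdp24 = stage_runtimes(log_24)
gp12, lgdp12 = stage_runtimes(log_12)

# Relative change in LG+DP runtime when going from 24 to 12 threads.
change = (lgdp12 - lgdp24) / lgdp24 * 100
print(f"LG+DP: {lgdp24:.1f} s (24 threads) vs {lgdp12:.1f} s (12 threads), {change:+.1f}%")
```

On the two logs above, the LG + DP runtimes differ by only about 2.5% between the 24-thread and 12-thread runs.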

As GP is accelerated on the GPU and LG + DP run on the CPU, the runtime for LG + DP is expected to be larger than that of GP. To run LG on the GPU, set 'legalize_flag' to 1.
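
As an illustration of that change, here is a minimal Python sketch that flips the flag in a config dictionary. The helper name `enable_gpu_legalization` is hypothetical and not part of DREAMPlaceFPGA, and the dictionary is only a subset of the FPGA03 settings posted above. 'detailed_place_flag' is left unchanged because this thread does not say how it should be set.

```python
import json


def enable_gpu_legalization(config):
    """Return a copy of a config dict with 'legalize_flag' set to 1 (hypothetical helper)."""
    updated = dict(config)  # shallow copy so the original settings stay untouched
    updated["legalize_flag"] = 1
    return updated


# Subset of the FPGA03 settings posted earlier in this thread.
config = {
    "aux_input": "benchmarks/ispd2016/FPGA03/design.aux",
    "gpu": 1,
    "global_place_flag": 1,
    "legalize_flag": 0,
    "detailed_place_flag": 0,
}

# Print the updated settings in the same JSON form as the posted config.
print(json.dumps(enable_gpu_legalization(config), indent=4))
```

The printed JSON can be merged back into the full config file before rerunning the placer.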

Yk2Zcc commented 1 year ago

Thank you very much!
