f4pga / f4pga-arch-defs

FOSS architecture definitions of FPGA hardware useful for doing PnR device generation.
https://f4pga.org
ISC License

VPR sensitivity to input changes #1823

Open acomodi opened 3 years ago

acomodi commented 3 years ago

This issue tracks the problem reported in https://github.com/SymbiFlow/symbiflow-arch-defs/issues/1776.

This issue has been split from https://github.com/SymbiFlow/symbiflow-arch-defs/issues/1788, as it tackles a different problem.

Problem statement

The auto-generated designs, such as litex, present small variations in the memory initialization of the BRAMs. This slightly perturbs the initial conditions of a test, which leads to major changes in the output results.

This problem might not be related only to the BRAM initialization: any small change applied to a specific design could trigger it. For the sake of this issue description, however, we will keep the small BRAM initialization changes as the triggering factor.

This issue covers the sensitivity of the synthesis step which produces two very different eblifs given a small change in the memory initialization values.

Packer

The perturbation in the net ordering of the circuit might affect how the packer behaves. For instance, two different runs of the same test produced the following packing results:

Test 1:

Resource usage...
        Netlist
                1099    blocks of type: BLK-TL-SLICEL
        Architecture
                2150    blocks of type: BLK-TL-CLBLL_L
                1200    blocks of type: BLK-TL-CLBLL_R
                1800    blocks of type: BLK-TL-CLBLM_L
                3000    blocks of type: BLK-TL-CLBLM_R

Test 2:

Resource usage...
        Netlist
                1114    blocks of type: BLK-TL-SLICEL
        Architecture
                2150    blocks of type: BLK-TL-CLBLL_L
                1200    blocks of type: BLK-TL-CLBLL_R
                1800    blocks of type: BLK-TL-CLBLM_L
                3000    blocks of type: BLK-TL-CLBLM_R

The two Verilog descriptions of the design are identical, except for the memory initialization.

A possible solution to this behaviour is to make the packer algorithms more robust to changes in the ordering of the input circuit, so that they are less sensitive to changes in input conditions.
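To make the packing difference above easier to spot across many runs, the "Resource usage..." report can be compared programmatically. The following is only a minimal sketch: it assumes the report layout shown in the two excerpts above, and the helper names (`netlist_block_counts`, `diff_counts`) are hypothetical, not part of VPR or symbiflow-arch-defs.

```python
import re

# Matches lines such as "    1099    blocks of type: BLK-TL-SLICEL".
BLOCK_LINE = re.compile(r"^\s*(\d+)\s+blocks of type:\s*(\S+)\s*$")


def netlist_block_counts(log_text):
    """Return {block_type: count} for the 'Netlist' parts of a resource usage report."""
    counts = {}
    section = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped in ("Netlist", "Architecture"):
            # Remember which part of the report the following lines belong to.
            section = stripped
            continue
        match = BLOCK_LINE.match(line)
        if match and section == "Netlist":
            counts[match.group(2)] = int(match.group(1))
    return counts


def diff_counts(first, second):
    """Return {block_type: (first_count, second_count)} for types whose counts differ."""
    return {
        name: (first.get(name, 0), second.get(name, 0))
        for name in sorted(set(first) | set(second))
        if first.get(name, 0) != second.get(name, 0)
    }


# Excerpts copied from the two runs reported in this issue.
TEST_1 = """Resource usage...
        Netlist
                1099    blocks of type: BLK-TL-SLICEL
        Architecture
                2150    blocks of type: BLK-TL-CLBLL_L
"""
TEST_2 = """Resource usage...
        Netlist
                1114    blocks of type: BLK-TL-SLICEL
        Architecture
                2150    blocks of type: BLK-TL-CLBLL_L
"""

print(diff_counts(netlist_block_counts(TEST_1), netlist_block_counts(TEST_2)))
# → {'BLK-TL-SLICEL': (1099, 1114)}
```

Only the "Netlist" counts are compared, since the "Architecture" counts describe the device and are the same in both runs.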

Placer

Initial placement is currently very sensitive to the seed. In fact, changing the seed can generate very different outcomes, in terms of CPD as well as router run-time.

This is shown by the following test: starting from two identical packed netlists and using two different seeds, the router behaviour changed drastically:

default seed:

## Initializing router criticalities took 0.03 seconds (max_rss 3353.3 MiB, delta_rss +0.0 MiB)
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Iter   Time    pres  BBs    Heap  Re-Rtd  Re-Rtd Overused RR Nodes      Wirelength      CPD       sTNS       sWNS       hTNS       hWNS Est Succ
      (sec)     fac Updt    push    Nets   Conns                                       (ns)       (ns)       (ns)       (ns)       (ns)     Iter
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Warning 108: 6 timing startpoints were not constrained during timing analysis
Warning 109: 1521 timing endpoints were not constrained during timing analysis
   1   17.0     0.0    0 2.2e+08    7670   25926   12853 ( 0.439%)  332480 ( 5.2%)   18.939     -70.37     -2.273      0.000      0.000      N/A
   2    5.0     2.8    2 4.9e+07    5694   18419    6520 ( 0.223%)  363815 ( 5.7%)   18.424     -52.74     -1.758      0.000      0.000      N/A
   3    4.1     3.4    4 3.7e+07    4113   13278    4585 ( 0.157%)  382077 ( 5.9%)   18.440     -63.06     -1.774      0.000      0.000      N/A
   4    3.6     4.1    4 3.2e+07    3018   10305    2985 ( 0.102%)  396871 ( 6.2%)   18.349     -49.51     -1.683      0.000      0.000      N/A
   5    3.3     4.9    9 2.8e+07    2131    7832    1730 ( 0.059%)  410556 ( 6.4%)   18.377     -53.28     -1.711      0.000      0.000      N/A
   6    3.1     5.9    4 2.4e+07    1396    5555     947 ( 0.032%)  421353 ( 6.6%)   18.409     -52.52     -1.743      0.000      0.000      N/A
   7    1.9     7.0    6 1.5e+07     819    3461     437 ( 0.015%)  428970 ( 6.7%)   18.409     -54.91     -1.743      0.000      0.000      N/A
   8    1.6     8.4    5 1.1e+07     437    1848     190 ( 0.006%)  433440 ( 6.7%)   18.406     -58.14     -1.740      0.000      0.000      N/A
   9    0.5    10.1    8 3655223     202     735      65 ( 0.002%)  435479 ( 6.8%)   18.389     -57.54     -1.723      0.000      0.000      N/A
  10    0.8    12.2    3 4537900      76     281      19 ( 0.001%)  436226 ( 6.8%)   18.389     -58.77     -1.723      0.000      0.000       14
  11    0.1    14.6    0  577913      23      78       4 ( 0.000%)  436748 ( 6.8%)   18.389     -59.10     -1.723      0.000      0.000       13
  12    0.0    17.5    0  154050       5      11       1 ( 0.000%)  436723 ( 6.8%)   18.389     -59.10     -1.723      0.000      0.000       13
  13    0.0    21.0    0   19600       1       5       0 ( 0.000%)  436780 ( 6.8%)   18.389     -59.85     -1.723      0.000      0.000       12

custom seed (1000):

## Initializing router criticalities took 0.03 seconds (max_rss 3353.0 MiB, delta_rss +0.0 MiB)
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Iter   Time    pres  BBs    Heap  Re-Rtd  Re-Rtd Overused RR Nodes      Wirelength      CPD       sTNS       sWNS       hTNS       hWNS Est Succ
      (sec)     fac Updt    push    Nets   Conns                                       (ns)       (ns)       (ns)       (ns)       (ns)     Iter
---- ------ ------- ---- ------- ------- ------- ----------------- --------------- -------- ---------- ---------- ---------- ---------- --------
Warning 108: 6 timing startpoints were not constrained during timing analysis
Warning 109: 1521 timing endpoints were not constrained during timing analysis
   1   19.2     0.0    0 2.3e+08    7670   25926   13086 ( 0.447%)  340954 ( 5.3%)   19.869     -132.1     -3.203      0.000      0.000      N/A
   2    5.3     2.8    4 4.5e+07    5679   18278    6717 ( 0.229%)  371999 ( 5.8%)   19.854     -130.3     -3.188      0.000      0.000      N/A
   3    4.3     3.4    7 3.4e+07    4149   13441    4758 ( 0.162%)  391802 ( 6.1%)   19.774     -138.9     -3.108      0.000      0.000      N/A
   4    4.1     4.1    4 3.2e+07    3005   10744    3210 ( 0.110%)  407592 ( 6.3%)   19.761     -137.1     -3.095      0.000      0.000      N/A
   5    3.8     4.9    2 2.8e+07    2148    8274    1896 ( 0.065%)  423111 ( 6.6%)   19.839     -143.1     -3.173      0.000      0.000      N/A
   6    2.7     5.9    7 2.0e+07    1420    5994    1071 ( 0.037%)  432306 ( 6.7%)   19.858     -154.0     -3.192      0.000      0.000      N/A
   7    1.8     7.0   11 1.3e+07     890    3846     509 ( 0.017%)  440328 ( 6.9%)   19.879     -157.9     -3.213      0.000      0.000      N/A
   8    1.2     8.4    8 8812194     450    1885     224 ( 0.008%)  445947 ( 6.9%)   19.923     -163.2     -3.257      0.000      0.000      N/A
   9    1.0    10.1    4 6420327     212     870      78 ( 0.003%)  448470 ( 7.0%)   19.911     -162.1     -3.245      0.000      0.000      N/A
  10    0.3    12.2    4 2119048      82     279      31 ( 0.001%)  449305 ( 7.0%)   19.897     -161.4     -3.231      0.000      0.000       15
  11    0.2    14.6    2 1247043      40     131      12 ( 0.000%)  449878 ( 7.0%)   19.911     -163.3     -3.245      0.000      0.000       14
  12    0.1    17.5    1  798032      17      39       6 ( 0.000%)  450097 ( 7.0%)   19.911     -163.3     -3.245      0.000      0.000       14
  13    0.2    21.0    1  962631      11      23       4 ( 0.000%)  450117 ( 7.0%)   19.911     -163.3     -3.245      0.000      0.000       14
  14    0.2    25.2    0  744835       5       5       2 ( 0.000%)  450255 ( 7.0%)   19.911     -163.3     -3.245      0.000      0.000       14
  15    0.0    30.3    0  104536       2       4       0 ( 0.000%)  450311 ( 7.0%)   19.911     -163.3     -3.245      0.000      0.000       15

In a separate test, I verified that using exactly the same input conditions (seed, packed netlist, etc.) produces the same outputs.
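The two router tables above can be summarized side by side with a small parser. This is a rough sketch that assumes the column layout shown in those tables (15 fields per row once the parenthesized percentages are dropped); the function names are hypothetical, and only the first and last rows of each table are reproduced here as sample data.

```python
import re


def parse_router_rows(log_text):
    """Extract router iteration rows from a VPR log, skipping headers and warnings."""
    rows = []
    for line in log_text.splitlines():
        # Drop the "( 0.439%)" style columns so the row splits into plain fields.
        fields = re.sub(r"\([^)]*\)", " ", line).split()
        if len(fields) != 15 or not fields[0].isdigit():
            continue
        rows.append({
            "iter": int(fields[0]),
            "overused": int(fields[7]),
            "wirelength": int(fields[8]),
            "cpd_ns": float(fields[9]),
            "stns_ns": float(fields[10]),
            "swns_ns": float(fields[11]),
        })
    return rows


def summarize(rows):
    """Report the values of the last routing iteration."""
    last = rows[-1]
    return {
        "iterations": last["iter"],
        "final_cpd_ns": last["cpd_ns"],
        "final_stns_ns": last["stns_ns"],
        "final_wirelength": last["wirelength"],
    }


# First and last rows of each table in this issue.
DEFAULT_SEED = """
Warning 108: 6 timing startpoints were not constrained during timing analysis
   1   17.0     0.0    0 2.2e+08    7670   25926   12853 ( 0.439%)  332480 ( 5.2%)   18.939     -70.37     -2.273      0.000      0.000      N/A
  13    0.0    21.0    0   19600       1       5       0 ( 0.000%)  436780 ( 6.8%)   18.389     -59.85     -1.723      0.000      0.000       12
"""
SEED_1000 = """
Warning 108: 6 timing startpoints were not constrained during timing analysis
   1   19.2     0.0    0 2.3e+08    7670   25926   13086 ( 0.447%)  340954 ( 5.3%)   19.869     -132.1     -3.203      0.000      0.000      N/A
  15    0.0    30.3    0  104536       2       4       0 ( 0.000%)  450311 ( 7.0%)   19.911     -163.3     -3.245      0.000      0.000       15
"""

for label, text in (("default seed", DEFAULT_SEED), ("seed 1000", SEED_1000)):
    print(label, summarize(parse_router_rows(text)))
```

With the rows above, the summary reports 13 iterations and a final CPD of 18.389 ns for the default seed, against 15 iterations and 19.911 ns for seed 1000.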

Steps to reproduce

  1. Get the latest version of symbiflow-arch-defs (https://github.com/SymbiFlow/symbiflow-arch-defs/commit/991c74f0e1d579f143878b22d72ab65235f3c18c at the time of writing)
  2. Run one of the litex tests as follows:
    cd symbiflow-arch-defs
    make env
    cd build
    make minilitex_arty_bit
  3. Save all the run results somewhere
  4. Go to the minilitex_arty build directory and delete the arty_soc directory.
    cd build/xc/xc7/tests/soc/litex/mini && rm -r arty_soc
  5. Re-run the minilitex test, which will trigger a new litex-design generation:
    make minilitex_arty_bit
  6. Compare the newly generated results with the ones saved from the previous run.
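For the comparison in step 6, one possible approach is to hash every file in the saved results and in the new build directory, then list the files that differ. This is a minimal sketch, not part of the symbiflow-arch-defs flow; the function names and the two directory arguments are placeholders for wherever the results were saved.

```python
import hashlib
import sys
from pathlib import Path


def file_digests(root):
    """Map each file under root (by relative path) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_runs(first_dir, second_dir):
    """Classify files as present in only one run or changed between the runs."""
    first = file_digests(first_dir)
    second = file_digests(second_dir)
    common = set(first) & set(second)
    return {
        "only_in_first": sorted(set(first) - set(second)),
        "only_in_second": sorted(set(second) - set(first)),
        "changed": sorted(name for name in common if first[name] != second[name]),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders): python compare_runs.py <saved_run_dir> <new_run_dir>
    for kind, names in compare_runs(sys.argv[1], sys.argv[2]).items():
        print(f"{kind}: {len(names)}")
        for name in names:
            print(f"  {name}")
```

Log files will likely show up as changed regardless, since they include run times and memory figures, so the interesting entries are the generated design files and the packing, placement and routing outputs.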

tcal-x commented 2 years ago

Related: https://github.com/enjoy-digital/litex/issues/1108