verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research
https://verilogtorouting.org

Generalize sub_tile in the grid so sub_tiles can have different width & height #2197

Closed: vaughnbetz closed this issue 1 year ago

vaughnbetz commented 1 year ago

Proposed Behaviour

We'd like to model multi-die stack FPGAs. This is needed for the Crossroads project, and will likely be useful to others.

Current Behaviour

The sub_tile feature almost has what we need in order to model one block under another, but it has the restriction that all sub_tiles in a tile must be the same size. We'd like to model a large router (or large accelerator block) that is underneath multiple LABs (or maybe RAMs, etc.). Right now we can't do this -- if we make the router large (e.g. width = 3, height = 3), then any LABs in the architecture become wide and tall.

Possible Solution

I think if we make the width, height, x_offset and y_offset in the grid a function of the sub_tile instead of the tile, we can have this feature. The capacity of grid[i][j] would then give the number of sub_tiles overlapping a certain (x,y) location, and you'd go through all those sub_tiles to see where their anchor points are, what their widths and heights are, etc.

We should also use this change as an opportunity to wrap some member functions around access to the sub_tiles to ask questions about them and anchor points etc. (I'm assuming we have code directly accessing this data now).
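
A minimal Python sketch of this idea, assuming hypothetical names (`SubTile`, `Grid`, `place`, `capacity`, `sub_tiles_at`) that are not taken from the VPR code base. Each sub_tile carries its own anchor point and size, and the grid is queried only through member functions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTile:
    """Hypothetical sub_tile that owns its footprint (not VPR's real type)."""
    name: str
    anchor_x: int
    anchor_y: int
    width: int = 1
    height: int = 1

    def offset_at(self, x, y):
        """Return (x_offset, y_offset) of location (x, y) from the anchor."""
        return (x - self.anchor_x, y - self.anchor_y)


class Grid:
    """Hypothetical grid whose cells list every sub_tile overlapping them."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._cells = [[[] for _ in range(height)] for _ in range(width)]

    def place(self, sub_tile):
        # Register the sub_tile at every (x, y) location its footprint covers.
        for x in range(sub_tile.anchor_x, sub_tile.anchor_x + sub_tile.width):
            for y in range(sub_tile.anchor_y, sub_tile.anchor_y + sub_tile.height):
                self._cells[x][y].append(sub_tile)

    def capacity(self, x, y):
        """Number of sub_tiles overlapping location (x, y)."""
        return len(self._cells[x][y])

    def sub_tiles_at(self, x, y):
        return tuple(self._cells[x][y])


# A 3x3 router underneath nine 1x1 LABs: the LABs keep their own size.
grid = Grid(3, 3)
router = SubTile("router", 0, 0, width=3, height=3)
grid.place(router)
for x in range(3):
    for y in range(3):
        grid.place(SubTile("lab", x, y))
```

In this sketch `grid.capacity(1, 1)` counts both the router and the LAB at that location, and `router.offset_at(2, 1)` recovers the offset from the router's anchor point.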

Context

This impacts the exploration of routers with logic over them for the Crossroads project.

Alternative proposal

We could add a z dimension to the grid, making the layers explicit. But I believe this would involve more code updates, and probably a higher burden on more code to think about / set the z-coordinate, which would always be 0 for conventional planar architectures.

vaughnbetz commented 1 year ago

@kmurray @tangxifan : FYI. Sara is going to dig into this, with an eye to discussing in a future vtr meeting.

tangxifan commented 1 year ago

@vaughnbetz Thanks for the info. Will do.

vaughnbetz commented 1 year ago

@saaramahmoudi : can you add a link to your diagrams on this? Adding @MohamedElgammal . The two main options seem to be:

  1. Add an explicit z dimension. There would be a grid[i][j].layer_cap that says how many physical tiles are stacked at that point. All current grid data would then move to grid[i][j][z_layer].

    • negative: lots of code to change. All code has to work with multiple layers; may be slightly slower, etc.
    • positive: existing code should work, as we don't really change the meaning of a physical tile, etc.
  2. Move and redefine the existing capacity of sub_tiles. grid[i][j].capacity would now exist. Each subtile would know more about itself (width, height, etc.).

    • positive: probably fewer code changes. We would have one capacity that says what is stacked on top of each other instead of two.
    • negative: physical_tile currently groups multiple subtiles and allows their pins to be grouped and located around the composite physical_tile in various ways, in order to build an rr_graph. It may be hard to keep that behaviour unchanged while re-imagining subtile this way.
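
Option 1 can be illustrated with a small Python sketch. The class and method names (`LayeredGrid`, `set_tile`, `tile`, `layer_cap`) are assumptions made for illustration only, not VPR's API. The layer argument defaults to 0, so code written for a planar device does not need to mention it:

```python
class LayeredGrid:
    """Hypothetical grid indexed as [x][y][layer]; not VPR's actual grid."""

    def __init__(self, width, height, num_layers=1):
        self.width = width
        self.height = height
        self.num_layers = num_layers
        # None marks a location with no tile on that layer.
        self._tiles = [
            [[None] * num_layers for _ in range(height)] for _ in range(width)
        ]

    def set_tile(self, x, y, tile_type, layer=0):
        self._tiles[x][y][layer] = tile_type

    def tile(self, x, y, layer=0):
        return self._tiles[x][y][layer]

    def layer_cap(self, x, y):
        """How many physical tiles are stacked at location (x, y)."""
        return sum(t is not None for t in self._tiles[x][y])


# Planar use: the layer never has to be mentioned.
planar = LayeredGrid(2, 2)
planar.set_tile(0, 0, "clb")

# Two-die use: the same location holds a different tile type per layer.
stacked = LayeredGrid(2, 2, num_layers=2)
stacked.set_tile(0, 0, "clb_die0")
stacked.set_tile(0, 0, "clb_die1", layer=1)
```

The cost noted in the list above still applies: every caller that iterates over the grid has to decide whether it loops over layers as well.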

saaramahmoudi commented 1 year ago

Diagrams for our proposed solutions to this problem are shown below. Blocks shown in green are the ones that we would add to the existing code.

  1. The first diagram is a straightforward approach in which we don't need to change the sub-tile and tile structures, or even how the grid exists. However, we need to store the sub-tile attributes multiple times (once for each LAB that sits on top of the NoC).

[Image: Screenshot from 2022-11-21 11-01-26]

  2. The second diagram is the option Vaughn mentioned in the previous comment: adding a new explicit z dimension.

[Image: Screenshot from 2022-11-21 11-03-27]

  3. The third diagram shows how we would change the sub_tile and tile definitions and add a capacity to the grid itself.

[Image: Screenshot from 2022-11-21 11-05-23]

  4. Moving the sub_tiles array from the physical_tile structure to the grid might also solve the problem. The following diagram shows how this solution differs from number 3.

[Image: Screenshot from 2022-11-21 14-37-56]

saaramahmoudi commented 1 year ago

@vaughnbetz Updating the issue after our meeting with Kevin. The presentation is attached for @tangxifan to review and discuss later: Multi-die stack FPGAs.pdf

tangxifan commented 1 year ago

@vaughnbetz @saaramahmoudi Thanks for the input and the detailed explanation.

I personally prefer the solution of adding an explicit z dimension, i.e., grid[i][j][z_layer]. Here are my reasons:

Actually, I am not 100% clear on which level of 3D-stacked FPGAs VPR would like to support. My understanding is that, as an architecture exploration tool, VPR aims to support very flexible FPGA architectures. I have a few questions:

These are my views based on my current knowledge. I am definitely interested in this technical feature, and OpenFPGA will support it. I would like to discuss more details and am open to altering my views.

vaughnbetz commented 1 year ago

Thanks @tangxifan . Your opinion matches the emerging consensus -- add an explicit z for clarity. So we'll go with that.

tangxifan commented 1 year ago

Thanks @vaughnbetz for the details. Now it is clear to me. Do we expect any changes to the arch XML when supporting 3D-stacked FPGAs? I am thinking about where we should define the z for the grids. For example,

    <fixed_layout name="2x2" width="4" height="4" num_layers="2">
      <!--Perimeter of 'io' blocks with 'EMPTY' blocks at corners-->
      <perimeter type="io" priority="100"/>
      <corners type="EMPTY" priority="101"/>
      <!--Fill with 'clb'-->
      <fill type="clb_die0" priority="10" layer="0"/>
      <fill type="clb_die1" priority="10" layer="1"/>
    </fixed_layout>

saaramahmoudi commented 1 year ago

@tangxifan This is how we discussed implementing it in the architecture file. The layer tag is going to be optional, so we don't need to update existing architecture files, and the die number is taken to be 0 if unspecified. The number of available layers can also be calculated from the die attribute on the layer tags (it will be 1 if left unspecified). Does this sound like the right way to do it to you?

    <fixed_layout name="first_layer">
        <layer die="0">
          <!--Perimeter of 'io' blocks with 'EMPTY' blocks at corners-->
          <perimeter type="io" priority="100"/>
          <corners type="EMPTY" priority="101"/>
          <!--Fill with 'clb'-->
          <fill type="clb_die0" priority="10"/>
        </layer>
    </fixed_layout>
    <fixed_layout name="second_layer">
        <layer die="1">
          <!--Perimeter of 'io' blocks with 'EMPTY' blocks at corners-->
          <perimeter type="io" priority="100"/>
          <corners type="EMPTY" priority="101"/>
          <!--Fill with 'clb'-->
          <fill type="clb_die1" priority="10"/>
        </layer>
    </fixed_layout>

tangxifan commented 1 year ago

@saaramahmoudi Yes. This is a good one to me. Thanks!

vaughnbetz commented 1 year ago

We could make either of these work (they're pretty close in meaning). Sara, I suggest going through the proposed arch syntax (or both alternatives) in a future vtr meeting (this Thursday if you're ready already). Could just show the alternatives from this issue.

ganeshgore commented 1 year ago

Hello, based on the above discussion, is this the correct way to interpret how pins on different Z levels will be flattened in the 2D RRGraph?

[Image]

vaughnbetz commented 1 year ago

Discussed in the meeting today. Consensus: go with a z coordinate, go with Xifan's proposed syntax, and make the layer attribute optional (defaulting to 0 so existing archs work). For the rr-graph: the current plan is not to change the rr-graph much; there is still one layer of programmable routing. Devices on the z=1 layer may connect to the programmable routing via connection boxes though (or they may connect via a NoC). If connecting to the programmable routing, their pins should have z=1 so we can tell what layer they are on. So likely we'll wind up putting a z-coordinate (default 0) on the rr-graph just for annotation of the pins, drawing, etc. To minimize memory bloat, we may be able to put such a z-coordinate in the flyweight (rr-indexed data) where the cost_index points to it (since we wouldn't have many z coordinates at all).
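
The flyweight idea can be sketched in Python under one possible reading of the comment above: each node stores only a small index into a table of shared records, and the z-coordinate lives in those records. The names `SharedNodeData`, `FlyweightTable` and `intern` are hypothetical, and this is not how VPR's rr-graph storage is actually laid out:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedNodeData:
    """Hypothetical record shared by many rr-graph nodes."""
    cost_index: int
    layer: int = 0  # z-coordinate; 0 for conventional planar devices


class FlyweightTable:
    """Stores each distinct record once and hands out small integer indices."""

    def __init__(self):
        self._records = []
        self._index_of = {}

    def intern(self, record):
        idx = self._index_of.get(record)
        if idx is None:
            idx = len(self._records)
            self._records.append(record)
            self._index_of[record] = idx
        return idx

    def __getitem__(self, idx):
        return self._records[idx]

    def __len__(self):
        return len(self._records)


table = FlyweightTable()
# 1000 nodes on layer 0 and 10 pins on layer 1, using 4 made-up cost indices.
node_shared_idx = [
    table.intern(SharedNodeData(cost_index=n % 4)) for n in range(1000)
]
node_shared_idx += [
    table.intern(SharedNodeData(cost_index=n % 4, layer=1)) for n in range(10)
]
```

Because there are only a few distinct (cost_index, layer) pairs, the table stays tiny no matter how many nodes reference it, which is the memory argument made above.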

tangxifan commented 1 year ago

@vaughnbetz I thought about this again. The shortcoming of my syntax is that it may not be easy for engineers to spot the grids on a specific layer once the <layout> block grows. Imagine there are 100 lines of grid definitions under a layout.

I suggest combining mine and @saaramahmoudi's, as follows.

    <fixed_layout name="2-layer-stacked-fpga" height="4" width="4">
        <layer die="0">
          <!--Perimeter of 'io' blocks with 'EMPTY' blocks at corners-->
          <perimeter type="io" priority="100"/>
          <corners type="EMPTY" priority="101"/>
          <!--Fill with 'clb'-->
          <fill type="clb_die0" priority="10"/>
        </layer>
        <layer die="1">
          <!--Perimeter of 'io' blocks with 'EMPTY' blocks at corners-->
          <perimeter type="io" priority="100"/>
          <corners type="EMPTY" priority="101"/>
          <!--Fill with 'clb'-->
          <fill type="clb_die1" priority="10"/>
        </layer>
    </fixed_layout>

This will allow us to have a clear view of what each layer looks like. Engineers/developers can remove or rework a layer in a straightforward way.

Existing architecture files do not need to be updated. If no <layer> is defined, we create a default layer when parsing the architecture XML.
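
A minimal sketch of that parsing rule, written in Python with the standard `xml.etree.ElementTree` module rather than VPR's actual C++ parser. The helper name `parse_layers` and the sample architecture strings are assumptions for illustration:

```python
import xml.etree.ElementTree as ET


def parse_layers(fixed_layout):
    """Map die number -> list of grid-spec elements for one <fixed_layout>.

    Hypothetical helper: a layout without any <layer> child is treated as a
    single default layer on die 0, so existing files keep working.
    """
    layer_elems = fixed_layout.findall("layer")
    if not layer_elems:
        return {0: list(fixed_layout)}

    layers = {}
    for layer in layer_elems:
        die = int(layer.get("die", "0"))  # die defaults to 0 if unspecified
        if die in layers:
            raise ValueError(f"die {die} is defined more than once")
        layers[die] = list(layer)
    return layers


LEGACY = """
<fixed_layout name="planar" width="4" height="4">
  <perimeter type="io" priority="100"/>
  <fill type="clb" priority="10"/>
</fixed_layout>
"""

STACKED = """
<fixed_layout name="2-layer-stacked-fpga" height="4" width="4">
  <layer die="0">
    <perimeter type="io" priority="100"/>
    <fill type="clb_die0" priority="10"/>
  </layer>
  <layer die="1">
    <perimeter type="io" priority="100"/>
    <fill type="clb_die1" priority="10"/>
  </layer>
</fixed_layout>
"""

legacy_layers = parse_layers(ET.fromstring(LEGACY))
stacked_layers = parse_layers(ET.fromstring(STACKED))
```

Both inputs produce the same kind of result, so the rest of a flow built on this sketch would not need to know whether the file used `<layer>` tags.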

Let me know what you think. I can help @saaramahmoudi if she needs advice on developing the parser.

vaughnbetz commented 1 year ago

Sure, that is fine with me.

I think with your earlier syntax people could still make it readable by grouping all the layer 0 grid entries together, then all the layer 1 entries, with a comment line in between if they liked. The syntax wouldn't force them to do that, but they could if they wished.

Forcing such an organization may be cleaner though, as you suggest. I'm OK with either.

tangxifan commented 1 year ago

@vaughnbetz Yes. I see that people can always find a way to keep their architecture XML clean. But most of the time they eventually fail to do so, for various reasons (for instance, very tight deadlines and too many revisions). Once we have many lines in a <layout> block, it is difficult for developers to spot the layer id in each line and ensure that they are all correct. Therefore, I believe we should provide developers with a syntax that forces them to follow a clean structure.

In addition, it is also straightforward to implement the parser for <layer>. You can pre-allocate memory by counting the number of <layer> blocks under a <fixed_layout> or an <auto_layout>. Otherwise, you may have to parse all the lines to find out the number of layers, and only then allocate memory.
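
The difference between the two syntaxes can be sketched in Python with the standard `xml.etree.ElementTree` module. The function names are hypothetical, and the flat variant assumes the layout carries no explicit `num_layers` attribute, so the layer count has to be recovered from the individual entries:

```python
import xml.etree.ElementTree as ET


def count_layers_nested(layout):
    """Nested syntax: count the <layer> children (at least one layer)."""
    return max(1, len(layout.findall("layer")))


def count_layers_flat(layout):
    """Flat syntax: scan every entry and take the highest layer index + 1."""
    highest = max((int(child.get("layer", "0")) for child in layout), default=0)
    return highest + 1


NESTED = """
<fixed_layout name="nested" width="4" height="4">
  <layer die="0"><fill type="clb_die0" priority="10"/></layer>
  <layer die="1"><fill type="clb_die1" priority="10"/></layer>
</fixed_layout>
"""

FLAT = """
<fixed_layout name="flat" width="4" height="4">
  <perimeter type="io" priority="100"/>
  <fill type="clb_die0" priority="10" layer="0"/>
  <fill type="clb_die1" priority="10" layer="1"/>
</fixed_layout>
"""

nested_count = count_layers_nested(ET.fromstring(NESTED))
flat_count = count_layers_flat(ET.fromstring(FLAT))

# With the count known up front, per-layer storage can be sized once.
per_layer_specs = [[] for _ in range(nested_count)]
```

Both functions return the same count here; the nested form gets it from the structure alone, while the flat form has to read an attribute on every line.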

vaughnbetz commented 1 year ago

Good points; I agree.