verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research
https://verilogtorouting.org
Other
1k stars 385 forks source link

Passing only Fc max to pin to track map function #2641

Open saaramahmoudi opened 2 months ago

saaramahmoudi commented 2 months ago

Current Behaviour

While we want to create connection blocks (pin->track and track->pin connections), we first loop over segments and calculate the maximum Fc over all pins of one block and this segment type (rr_graph.cpp, line 3253):

/* determine the maximum Fc to this segment type across all pins */
  int max_Fc = 0;
  for (int pin_index = 0; pin_index < Type->num_pins; ++pin_index) {
      int pin_class = Type->pin_class[pin_index];
      if (Fc[pin_index][seg_inf[iseg].seg_index] > max_Fc && Type->class_inf[pin_class].type == pin_type) {
          max_Fc = Fc[pin_index][seg_inf[iseg].seg_index];
      }
  }

Afterward, we use max_Fc calculated to pass through the pin_to_seg_type_map to create connections based on max_Fc:

/* get pin connections to tracks of the current segment type */
auto pin_to_seg_type_map = alloc_and_load_pin_to_seg_type(pin_type, num_seg_type_tracks, max_Fc, Type, type_layer, perturb_switch_pattern[seg_inf[iseg].seg_index], directionality);

Although we use the maximum Fc value, we can override some pins (such as the carry chain) with Fc 0. However, assuming one pin has 4 connections to this segment and the other has 6 connections to this segment, we connect both pins 6 times since the max_Fc would be 6. Here is the comment for the mentioned function (rr_graph.cpp, line 3298):

/* Note: currently a single value of Fc is used across each pin. In the future
     * the looping below will have to be modified if we want to account for pin-based
     * Fc values 
*/

@vaughnbetz If you agree that it is better to account for pin-based Fc values, I can implement and keep this issue to track the progress. If there is a specific reason for this implementation, I can add more comments to the function to avoid confusion.

vaughnbetz commented 2 months ago

Thanks for the detailed analysis Sara. I agree it would be better to change this; I can't really see what the reason for it is.

WhiteNinjaZ commented 1 month ago

@saaramahmoudi I have actually been working on this issue for a while now and didn't see that there was already someone working on it till recently. I have applied a temporary fix that seems to work well. Here is what I did: alloc_and_load_pin_to_seg_type() should only be called once per segment type since it keeps track of which pins connect to which segments. What I ended up doing was just making a very simple change to the nested for loop as follows:

for (auto type_layer_index : type_layer) {
            for (int ipin = 0; ipin < Type->num_pins; ipin++) {
                int cur_Fc = Fc[ipin][seg_inf[iseg].seg_index]; // Changed this 
                for (int iwidth = 0; iwidth < Type->width; iwidth++) {
                    for (int iheight = 0; iheight < Type->height; iheight++) {
                        for (int iside = 0; iside < 4; iside++) {
                            for (int iconn = 0; iconn < cur_Fc; iconn++) { // And this
                                for (auto connected_layer : get_layers_pin_is_connected_to(Type, type_layer_index, ipin)) {
                                    int relative_track_ind = pin_to_seg_type_map[ipin][iwidth][iheight][connected_layer][iside][iconn];
                                    if (relative_track_ind != OPEN) {
                                        VTR_ASSERT(relative_track_ind <= num_seg_type_tracks);
                                        int absolute_track_ind = relative_track_ind + seg_type_start_track;

                                        VTR_ASSERT(absolute_track_ind >= 0);
                                        result[ipin][iwidth][iheight][connected_layer][iside].push_back(
                                            absolute_track_ind);
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }

Even though some pins technically have more connections in the pin_to_seg_type_map[] the extra connections are never reached because we terminate after making only cur_Fc connections which is on a per pin basis. Note: this does have some extra overhead since we are filling out some parts of the pin_to_seg_type_map[] matrix that are never used, but it was the quickest way to apply a fix for this problem and the Xilinx arch is heavily dependent upon this feature working properly. This fix might be worth considering until a more robust refactor of the code is done.

WhiteNinjaZ commented 1 month ago

While debugging I did notice a few more problems related to specifying specific parameters for specific pins using the fc_override feature. It appears that the XML parser has a few issues with processing the overrides: 1) There does not seem to be any kind of parsing on the fc_override port_overide specification. Whatever is in the string after port_name= is just taken in its raw form. For example, if the same format (i.e. multiple ports specified in a single line by separating them with spaces "clb.I clb.O clb.cin") is used in this child tag then the full list is taken to be a single port name (i.e. "clb.I clb.O clb.cin" is passed in) instead of a vector of different port names. 2) If you look at read_xml_arch_file.cpp lines 1958-1962 we see that the following if statement determines when an override should be applied:

if (fc_override.port_name == port.name && fc_override.seg_name == segments[iseg].name) {
                            apply_override = true;
                        }

The problem is that the port names must match exactly and there is no error checking in this if/else statement. So, if someone were to use the pretty standard way of specifying the override port as <name of tile>.<name of port> (i.e. "clb.I") the override is just ignored--no error, no warning, just flat out ignored. This also happens if a port that does not exist is specified in the fc_overide or if specific subsets of a certain port are used (i.e. clb.I[23:0]). I think this is mostly caused by the upstream parser just not doing proper error checking/parsing on the fc_override port parameter. As mentioned in the last point, whatever is in the string immediately following port_name= is just taken in its raw form.

@vaughnbetz not sure if this warrants a completely separate issue or is close enough to this to be in the same vein.

vaughnbetz commented 1 month ago

Thanks @WhiteNinjaZ . We should definitely fix the weak parsing / lack of error checking too. I don't have a strong preference of whether it is in a separate issue or this one, but we should have an issue open until that is fixed. If you know how to fix it (and it seems you're close) I suggest making a PR while it is fresh in your mind.

saaramahmoudi commented 1 month ago

@WhiteNinjaZ I tried to fix the issue by passing the Fc matrix to all of those underlying functions, and I also included your change to avoid an extra loop through the 6D array for connections. I would appreciate it if you could check my branch to see if everything works properly and as you expected. The branch name is: pass_actual_fc_to_connect_to_pins

WhiteNinjaZ commented 1 month ago

@saaramahmoudi looks good, thanks for doing that. I ran the vtr_reg_strong QOR on your changes and got good results. Also, ran the changes on a few benchmarks and verified that the specified fc segment types were connected properly in the RR using the GUI.

WhiteNinjaZ commented 1 month ago

The problems stated in issue #2658 remain to be rectified, however.

saaramahmoudi commented 1 month ago

@WhiteNinjaZ Thanks for checking the branch. Yes, I think that is a separate issue. I will try to address that in a near future.