lnis-uofu / OpenFPGA

An Open-source FPGA IP Generator
https://openfpga.readthedocs.io/en/master/
MIT License

some question about synthesis #473

Closed kangliyu1 closed 2 years ago

kangliyu1 commented 2 years ago

Hello! I ran into a very strange problem. I synthesized the code below (code1) with the Yosys bundled in OpenFPGA. The flow reported an error during synthesis, as shown in Figure 1, but yosys_output.log does not point to any error. When I ran Yosys on its own locally, it was killed and exited at the corresponding point, so I am a little confused. I would appreciate any advice. Thank you very much! Note: The architecture uses k6_frac_N10_adder_chain_dpram8K_dsp36_fracff_40nm_openfpga.xml, k6_frac_N10_tileable_adder_chain_dpram8K_dsp36_fracff_40nm.xml, and iwls_benchmark_example_script.openfpga.

fig1: image

fig2: image

code1:

module Conv_2#(
        parameter padding       =      0                                ,
        parameter weight_bit    =      10                               ,// signed bit width
        parameter conv_bit      =      10                               ,// unsigned bit width
        parameter conv_wid      =      13                               ,
        parameter conv_heig     =      13                               ,
        parameter stride        =      1                                ,
        parameter result_len    =      11*11                            ,
        parameter in_layer      =      6                                ,
        parameter out_layer     =      16                                
)(
        input                                                                           clk             ,
        input                                                                           rst             ,
        input           [in_layer*conv_bit-1:0]                                         conv_din        ,
        input                                                                           din_vld         ,
        input                                                                           din_last        ,
        input           [0:1*in_layer*9*weight_bit-1]                                   kernel          ,
        output          [in_layer*out_layer*(4+weight_bit+conv_bit)/1-1:0]              conv_dout       ,
        output                                                                          busy            ,
        output          [clogb2(out_layer)-1:0]                                         kernel_addr     ,
        output  reg                                                                     kernel_vld      ,
        output  reg                                                                     dout_vld        ,
        output  reg                                                                     dout_last       

    );
        localparam rd_init1              =   3'b000       ,
                   rd_init2              =   3'b001       ,
                   rd_init3              =   3'b010       ,
                   rd_conv               =   3'b011       ; 
        reg     signed  [4+weight_bit+conv_bit-1:0]   add_res   [0:in_layer/1-1][0:out_layer-1]                 ;
        reg     signed  [weight_bit+conv_bit-1:0]     mul_res   [0:in_layer/1-1][0:out_layer-1][0:9-1]          ;  
        wire    signed  [weight_bit-1:0]              kernel_val[0:in_layer/1*1*9-1]                            ;
        reg     signed  [conv_bit:0]                  conv_buf  [0:1-1][0:in_layer/1-1][0:9-1]                  ;        
        wire            [in_layer*conv_bit-1:0]       row_out   [0:3-1]                                           ;             
        wire            [in_layer*conv_bit-1:0]       row_in    [0:3-1]                                           ;      
        reg     [9:0]                         cnt_row0                          ;       
        reg     [9:0]                         cnt_row1                          ;       
        reg     [9:0]                         cnt_row2                          ;           
        wire    [2:0]                         row_wr                            ;
        wire    [2:0]                         row_rd                            ;
        reg     [1:0]                         cnt_stride_w                      ;
        reg     [1:0]                         cnt_stride_h                      ;
        reg                                   row0_rd                           ;
        reg                                   row1_rd                           ;
        reg                                   row2_rd                           ;
        reg                                   conv_idle                         ;
        reg                                   conv_vld                          ;
        reg                                   last_vld                          ;
        reg     [clogb2(conv_wid)-1:0]        cnt_conv                          ;
        reg     [clogb2(out_layer)-1:0]       cnt_kernel                        ;
        reg     [2-1:0]                       cnt_paral                         ;
        wire    [2:0]                         row_full                          ;
        wire    [2:0]                         row_empty                         ;
        reg     [19:0]                        cnt_din                           ;
        reg     [2:0]                         state_rd                          ;
        reg     [0:0]                         row1_add_flag                     ;
        reg     [0:0]                         row0_add_flag                     ;
        reg     [clogb2(result_len)-1:0]      cnt_out                           ;
        genvar                  a, b, c, i, j, k, l, m, n, o                    ;
        assign conv_rst         =   din_last && din_vld && !end_din || rst      ;
        assign row_wr[2]        =   din_vld && !row_full[2]                     ;
        assign row_rd[2]        =   !row_empty[2] && row2_rd                    ;
        assign row_in[2]        =   conv_din                                    ;
        assign row_wr[1]        =   row_rd[2] && !row_full[1]                   ;
        assign row_in[1]        =   row_out[2]                                  ;
        assign row_wr[0]        =   row_rd[1] && !row_full[0]                   ;
        assign row_in[0]        =   row_out[1]                                  ;
        assign row_rd[0]        =   !row_empty[0] && row0_rd                    ;
        assign row_rd[1]        =   !row_empty[1] && row1_rd                    ;
        assign busy             =   row_full[2]                                 ;
        assign kernel_addr      =   cnt_kernel                                  ;
        always@(posedge clk)begin
            if(conv_rst) 
                state_rd <= rd_init1;
            else
            case(state_rd)
            rd_init1:
                if(init1_init2)
                    state_rd <= rd_init2;
            rd_init2:
                if(init2_init3)
                    state_rd <= rd_init3;
            rd_init3:
                if(init3_conv)
                    state_rd <= rd_conv;
            rd_conv:
                if(conv_init1)
                    state_rd <= rd_init1;
            default:state_rd <= rd_init1;
            endcase
        end
        assign init1_init2      =   state_rd == rd_init1 && row_empty[1]                                    ;
        assign init2_init3      =   state_rd == rd_init2 && row_empty[0]                                    ;
        assign init3_conv       =   state_rd == rd_init3 && cnt_row2 == conv_wid*2 && cnt_row1 == conv_wid  ;
        assign conv_init1       =   state_rd == rd_conv &&  end_row2                                        ;
       generate 
            for(o=0;o<1;o=o+1) begin
                for(a=0;a<in_layer/1;a=a+1) begin 
                    for(i=0;i<3;i=i+1) begin 
                      for (j=0;j<3;j=j+1) begin
                        if(j == 2)begin:assignment
                          always@(posedge clk)begin
                            if(conv_rst) 
                              conv_buf[o][a][i*3+2] <= 'd0;
                            else
                            if(add_conv) 
                              conv_buf[o][a][i*3+2] <= {1'b0, row_out[i][(o*in_layer/1+a)*conv_bit+:conv_bit]}; 
                          end 
                        end
                        else begin:shift    
                          always@(posedge clk)begin
                            if(conv_rst) 
                              conv_buf[o][a][i*3+j] <= 'd0;
                            else
                            if(add_conv) 
                              conv_buf[o][a][i*3+j] <= conv_buf[o][a][i*3+j+1];  
                          end 
                        end
                      end
                    end
                end
            end
        endgenerate
        integer n1,b1,k1;
        generate 
         always@(posedge clk)begin 
            for(n1=0;n1<1;n1=n1+1)begin 
                for(b1=0;b1<in_layer/1;b1=b1+1)begin 
                    for(k1=0;k1<9;k1=k1+1)begin  
                        //kernel_val[n1*in_layer*9+b1*9+k1] = kernel[(cnt_paral*in_layer/1*9+n1*in_layer*9+b1*9+k1)*weight_bit+:weight_bit];
                        if(conv_bit != 1'b1)
                          begin:mul

                                mul_res[b1][cnt_kernel+out_layer/1*n1][k1] <= kernel_val[n1*in_layer*9+b1*9+k1] * conv_buf[cnt_paral][b1][k1]; // kernel 10 bits, conv 8

                        end
                    end
                end
            end
          end
        endgenerate

        genvar n2,b2,k2;
   generate

         for(n2=0;n2<2;n2=n2+1)begin 
           for(b2=0;b2<in_layer/1;b2=b2+1)begin 
                for(k2=0;k2<9;k2=k2+1)begin 
                   if((n2*in_layer*9+b2*9+k2)<=(in_layer/1*1*9-1)) begin 
                      assign kernel_val[n2*in_layer*9+b2*9+k2] = kernel[(cnt_paral*in_layer/1*9+n2*in_layer*9+b2*9+k2)*weight_bit+:weight_bit];end

                end
            end
          end
    endgenerate             

        generate         
            for(c=0;c<in_layer/1;c=c+1)begin    
                for(l=0;l<out_layer;l=l+1) begin 
                  always@(posedge clk)begin
                    if(conv_rst) 
                      add_res[c][l] <= 'd0;
                    else
                      add_res[c][l] <= mul_res[c][l][0]+mul_res[c][l][1]+mul_res[c][l][2]+mul_res[c][l][3]+mul_res[c][l][4]+mul_res[c][l][5]+mul_res[c][l][6]+mul_res[c][l][7]+mul_res[c][l][8];
                  end  
                  assign conv_dout[(c*out_layer+l)*(4+weight_bit+conv_bit)+:(4+weight_bit+conv_bit)] = add_res[c][l]        ;
                end
            end
        endgenerate
        always@(*)begin
            if(state_rd == rd_init1) begin
                row0_rd = 1'b1;
                row1_rd = 1'b1;
                row2_rd = 1'b0;                
            end
            else
            if(state_rd == rd_init2) begin
                row0_rd = 1'b1;
                row1_rd = 1'b0;
                row2_rd = cnt_row2 < conv_wid;                
            end
            else
            if(state_rd == rd_init3) begin
                row0_rd = 1'b0;
                row1_rd = cnt_row1 < conv_wid; 
                row2_rd = cnt_row2 < conv_wid * 2;                
            end
            else 
            if(state_rd == rd_conv && !row_empty[2] && conv_idle) begin
                row0_rd = 1'b1;
                row1_rd = 1'b1;
                row2_rd = 1'b1;                 
            end
            else begin
                row0_rd = 1'b0;
                row1_rd = 1'b0;
                row2_rd = 1'b0;                 
            end
        end

        always@(posedge clk)begin
            if(conv_rst) 
                conv_idle <= 1'b1;
            else
            if(state_rd == rd_conv)begin
                if(cnt_conv < 2 && !kernel_vld)
                    conv_idle <= 1'b1;
                else
                if(row_rd[2] && cnt_stride_w == 0 && cnt_stride_h == 0)
                    conv_idle <= 1'b0; 
                else
                if(end_paral)
                    conv_idle <= 1'b1;
            end
        end
        always@(posedge clk)begin
          if(conv_rst)
            cnt_din <= 20'd0;
          else
          if(end_din || din_last)
            cnt_din <= 10'd0;
          else
          if(add_din)
            cnt_din <= cnt_din + 1'b1;
        end
        assign  add_din =   row_wr[2]                                           ;
        assign  end_din =   add_din && cnt_din >= conv_wid * conv_heig - 1      ;
        always @(posedge clk ) begin
            if(conv_rst)
                row1_add_flag <= 1'b0;
            else
            if(state_rd == rd_init1 && row_empty[1])
                row1_add_flag <= 1'b1;
            else
            if(end_row2)
                row1_add_flag <= 1'b0;
        end
        always @(posedge clk ) begin
            if(conv_rst)
                row0_add_flag <= 1'b0;
            else
            if(state_rd == rd_init2 && row_empty[0])
                row0_add_flag <= 1'b1;
            else
            if(end_row2)
                row0_add_flag <= 1'b0;
        end

        always@(posedge clk)begin
            if(conv_rst)
                cnt_kernel <= 'd0;
            else
           if(end_kernel)
                cnt_kernel <= 'd0;
           else
           if(add_kernel)
                cnt_kernel <= cnt_kernel + 1'b1;
        end
        assign  add_kernel  =   kernel_vld                                                  ;
        assign  end_kernel  =   add_kernel && cnt_kernel >= out_layer / 1 - 1'b1    ;
        always@(posedge clk)begin
            if(conv_rst)
                cnt_paral <= 'd0;
            else
            if(end_paral)
                cnt_paral <= 'd0;
            else
            if(add_paral)
                cnt_paral <= cnt_paral + 1'b1;
        end
        assign  add_paral   =   end_kernel                                              ;
        assign  end_paral   =   add_paral && cnt_paral >= 1 - 1             ;
        always@(posedge clk)begin
            if(conv_rst)
                cnt_conv <= 'd0;
            else
            if(end_conv)
                cnt_conv <= 'd0;
            else
            if(add_conv)
                cnt_conv <= cnt_conv + 1'b1;
        end
        assign  add_conv    =   state_rd == rd_conv && row_rd[2]                        ;
        assign  end_conv    =   add_conv && cnt_conv >= conv_wid - 1'b1                 ;
        always@(posedge clk)begin
            if(conv_rst)
                cnt_stride_h <= 'd0;
            else
            if(end_stride_h || end_row2)
                cnt_stride_h <= 'd0;
            else
            if(add_stride_h)
                cnt_stride_h <= cnt_stride_h + 1'b1;
        end
        assign  add_stride_h    =   end_conv                                             ;
        assign  end_stride_h    =   add_stride_h && cnt_stride_h >= stride - 1'b1        ;
        always@(posedge clk)begin
            if(conv_rst)
                cnt_stride_w <= 'd0;
            else
            if(end_stride_w || (cnt_conv < 2))
                cnt_stride_w <= 'd0;
            else
            if(add_stride_w)
                cnt_stride_w <= cnt_stride_w + 1'b1;
        end
        assign  add_stride_w    =   add_conv && cnt_conv >= 2                             ;
        assign  end_stride_w    =   add_stride_w && cnt_stride_w >= stride - 1            ;

        always @(posedge clk ) begin
            if(conv_rst)begin
                conv_vld <= 1'b0;
                last_vld <= 1'b0;                
                dout_vld  <= 1'b0;
                dout_last <= 1'b0;
            end
            else begin
                conv_vld  <= end_kernel;
                last_vld <= end_out;
                dout_vld <= conv_vld;
                dout_last <= last_vld;
            end
        end
        always@(posedge clk)begin
            if(conv_rst)
                cnt_out <= 'd0;
            else
           if(end_out)
                cnt_out <= 'd0;
           else
           if(add_out)
                cnt_out <= cnt_out + 1'b1;
        end
        assign  add_out =   end_paral                           ;
        assign  end_out =   add_out && cnt_out >= result_len - 1;
        always@(posedge clk)begin
            if(conv_rst) 
                kernel_vld <= 1'b0;
            else
            if(add_stride_w && cnt_stride_w == 0 && cnt_stride_h == 0)
                kernel_vld <= 1'b1;    
            else
            if(end_paral)
                kernel_vld <= 1'b0;
        end
        always@(posedge clk)begin
            if(conv_rst)
                cnt_row2 <= 10'd0;
            else
            if(end_row2)
                cnt_row2 <= 10'd0;
            else
            if(add_row2)
                cnt_row2 <= cnt_row2 + 1'b1;
        end
        assign  add_row2    =   row_rd[2]                                                       ;
        assign  end_row2    =   add_row2 && cnt_row2 >= conv_wid * conv_heig - 1                ;
        always@(posedge clk)begin
            if(conv_rst)
                cnt_row1 <= 10'd0;
            else
            if(end_row1 || end_row2)
                cnt_row1 <= 10'd0;
            else
            if(add_row1)
                cnt_row1 <= cnt_row1 + 1'b1;
        end
        assign  add_row1    =   row_rd[1] && row1_add_flag                                      ;
        assign  end_row1    =   add_row1 && cnt_row1 >= conv_wid * conv_heig - 1 - conv_wid     ;
        always@(posedge clk)begin
            if(conv_rst)
                cnt_row0 <= 10'd0;
            else
            if(end_row0 || end_row2)
                cnt_row0 <= 10'd0;
            else
            if(add_row0)
                cnt_row0 <= cnt_row0 + 1'b1;
        end
        assign  add_row0    =   row_rd[0] && row0_add_flag                                      ;
        assign  end_row0    =   add_row0 && cnt_row0 >= conv_wid * conv_heig - 1 - conv_wid*2   ;       
        generate 
            for(m=0;m<3;m=m+1)begin
            fifo row_inst(
                .clk        ( clk           ),
                .rst_n       ( ~conv_rst      ),
                .din        ( row_in[m]     ),      // input wire [10 : 0] din
                .we      ( row_wr[m]     ),  // input wire wr_en
                .re      ( row_rd[m]     ),  // input wire rd_en
                .dout       ( row_out[m]    ),    // output wire [10 : 0] dout
                .full       ( row_full[m]   ),    // output wire full
                .empty      ( row_empty[m]  )  // output wire empty
              );
            end
        endgenerate
        function integer clogb2;
          input integer depth;
            for (clogb2=0; depth>0; clogb2=clogb2+1)
              depth = depth >> 1;
        endfunction
endmodule
module fifo
#(parameter DW = 8,AW = 4)// default data width 8, FIFO depth 16
(
    input clk,
    input rst_n,
    input we,
    input re,
    input [DW-1:0]din,
    output reg [DW-1:0]dout,
    output empty,
    output full
    );
// internal signal
parameter Depth = 1 << AW;//depth of FIFO 
reg [DW-1:0]ram[0:Depth-1];
reg [AW:0]cnt;
reg [AW-1:0]wp;
reg [AW-1:0]rp;
// FIFO declaration
// empty/full detection
assign empty = (cnt==0)?1'b1:1'b0;
assign full = (cnt==Depth)?1'b1:1'b0;
// cnt counter
always@(posedge clk or negedge rst_n)
begin
    if(!rst_n)
        cnt <= 1'd0;
    else if(!empty & re & !full & we)// simultaneous read and write
        cnt <= cnt;
    else if(!full & we)// write
        cnt <= cnt+1;
    else if(!empty & re)// read
        cnt <= cnt-1;
    else 
        cnt <= cnt;
end
// read pointer
always@(posedge clk or negedge rst_n)
begin
    if(!rst_n)
        rp <= 1'b0;
    else if(!empty & re)
        rp <= rp+1'b1;
    else
        rp <= rp;
end
// write pointer
always@(posedge clk or negedge rst_n)
begin
    if(!rst_n)
        wp <= 1'b0;
    else if(!full & we)
        wp <= wp+1'b1;
    else
        wp <= wp;
end
// read operation
always@(posedge clk or negedge rst_n)
begin
    if(!rst_n)
        dout <= {DW{1'bz}};
    else if(!empty & re)
        dout <= ram[rp];
    else
        dout <= dout;
end
// write operation
always@(posedge clk)
begin
    if(!full & we)
        ram[wp] <= din;
    else
        ram[wp] <= ram[wp];
end
endmodule

yosys scripts:

# Yosys synthesis script for Conv_2
# Read verilog files
read_verilog -nolatches ./benchmark/Conv_2.v

# Technology mapping
hierarchy -top Conv_2
proc
techmap -D NO_LUT -map +/adff2dff.v

# Synthesis
synth -top Conv_2 -flatten
clean

# LUT mapping
abc -lut 6

# Check
synth -run check

# Clean and output blif
opt_clean -purge
write_blif Conv_2_yosys_out.blif

tangxifan commented 2 years ago

@kangliyu1

https://github.com/lnis-uofu/OpenFPGA/blob/1d4fa96d547d39f0aa27b4be0de1635e2277c80d/openfpga_flow/tasks/benchmark_sweep/iwls2005/config/task.conf#L70

kangliyu1 commented 2 years ago

@kangliyu1

  • Can you try removing the -nolatches option?
  • Can you tell which Yosys command causes the problem? According to the log file, my guess is synth -top Conv_2 -flatten.
  • To determine whether this is a bug in Yosys or in our tech lib, I suggest trying other synthesis scripts from other test cases, e.g.,

https://github.com/lnis-uofu/OpenFPGA/blob/1d4fa96d547d39f0aa27b4be0de1635e2277c80d/openfpga_flow/tasks/benchmark_sweep/iwls2005/config/task.conf#L70

  • If other scripts/tech libs pass, the problem is in the Yosys script.
  • If other scripts have the same problem, it could be a problem in Yosys. The bug should then be reported to Yosys HQ: https://github.com/YosysHQ/yosys
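
One way to act on the second bullet above (finding which Yosys command causes the problem) is to run growing prefixes of the synthesis script until one of them fails. The following is a minimal sketch that only builds those prefixes from the script text. The helper name `script_prefixes` is hypothetical and is not part of OpenFPGA or Yosys; running each prefix is left to the reader.

```python
from typing import List


def script_prefixes(script_text: str) -> List[str]:
    """Return growing prefixes of a Yosys script, one per command line.

    Blank lines and comment lines (starting with '#') are skipped.
    Hypothetical helper for illustration only.
    """
    commands = [
        line.strip()
        for line in script_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    # Prefix i contains commands 0..i, so each one adds exactly one command.
    return ["\n".join(commands[: i + 1]) + "\n" for i in range(len(commands))]


example = """\
# Read verilog files
read_verilog ./benchmark/Conv_2.v
hierarchy -top Conv_2
proc
synth -top Conv_2 -flatten
"""

for n, prefix in enumerate(script_prefixes(example), start=1):
    last_command = prefix.strip().splitlines()[-1]
    print(f"prefix {n} ends with: {last_command}")
```

Each prefix can be written to a file and run with `yosys -s <file>`; the first prefix that fails identifies the offending command. Because `synth` is itself a sequence of passes, the last pass header that Yosys printed in its log narrows the location further.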

Thank you very much for your reply! Unfortunately, I still had not solved the problem as of yesterday. Here is my feedback on the points you raised on GitHub:

  1. Yosys is still killed after deleting -nolatches.
  2. The command synth -top Conv_2 -flatten is indeed where the problem occurs, but I have looked at it for a long time and still don't know why. Within synth, the kill sometimes happens in memory_map and sometimes in opt_expr.
  3. Using the ys_tmpl_yosys_vpr_bram_dsp_dff_flow.ys script, the flow fails with "Run yosys run failed with returncode -9" in the TECHMAP section, i.e. the process is killed there. This feels like a Yosys problem, but other designs pass the synthesis stage. The same "Run yosys run failed with returncode -9" error also appears later: for example, a "-9" abort (after Yosys synthesis) suddenly shows up in some strange places, as shown in the following pictures. There is no detailed debug information, so I don't know how to debug it.

I am really confused about this returncode -9 problem. I hope you can reply to me when you have time. Thank you very much!

By the way, does your OpenFPGA architecture library have an architecture that supports BLIF files synthesized by Vivado? And for which circuits does the Yosys shipped with OpenFPGA produce poor synthesis results?

fig1: image
fig2: image
fig3: image
fig4: image

and so on......
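
For context on the "returncode -9" message: on POSIX systems, Python's `subprocess` module reports a child that was terminated by a signal as a negative return code, so -9 corresponds to signal 9 (SIGKILL). A process that receives SIGKILL cannot write anything further to its log, which would be consistent with yosys_output.log showing no error. On Linux, one common source of SIGKILL is the kernel's out-of-memory killer, although nothing in this thread confirms that as the cause here. A minimal sketch, assuming the flow launches Yosys through `subprocess` (the helper name `describe_returncode` is hypothetical):

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a subprocess-style return code (hypothetical helper)."""
    if returncode == 0:
        return "exited normally"
    if returncode > 0:
        return f"exited with error status {returncode}"
    # On POSIX, a negative value -N means "terminated by signal N".
    signum = -returncode
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # The signal number is not defined on this platform.
        name = f"signal {signum}"
    return f"killed by {name}"


print(describe_returncode(-9))
```

If the out-of-memory killer is suspected, the kernel log (for example via `dmesg`) is where such kills are normally recorded.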

tangxifan commented 2 years ago

@kangliyu1 Can you provide the OpenFPGA version? (You can see it by running OpenFPGA/openfpga/openfpga -v)

kangliyu1 commented 2 years ago

Can you provide the OpenFPGA version? (You can see it by running OpenFPGA/openfpga/openfpga -v)

This is my OpenFPGA version: image

tangxifan commented 2 years ago

@kangliyu1 We have improved many features in OpenFPGA since 2021-09-22, including a Yosys upgrade. Can you give the latest master a try?

kangliyu1 commented 2 years ago

We have improved many features in OpenFPGA since 2021-09-22, including a Yosys upgrade. Can you give the latest master a try?

Okay, thank you for your reply. I will try the upgraded version today. Does the upgraded OpenFPGA have other new functions? (For example, are PCF files supported for pin constraints?) Also, Vivado can generate EDIF files; can the VPR in OpenFPGA run on them?

tangxifan commented 2 years ago

@kangliyu1 The major update bumps the Yosys version from 0.9 to the latest 0.10 release. We have also introduced new configuration protocols.

PCF and EDIF support is scheduled for this year. It will be available online.