YosysHQ / nextpnr

nextpnr portable FPGA place and route tool
ISC License

Problem with ALU54D, "Unable to find legal placement" #1326

Status: Closed (bjoernskau closed 4 months ago)

bjoernskau commented 4 months ago

Hello nextpnr team, we are currently working on a big DSP project (as mentioned a few weeks ago in another issue).

At the moment we are using:

Info: Device utilisation:
Info:                    VCC:     1/    1   100%
Info:                    IOB:    25/  274     9%
Info:                   LUT4:  2564/ 8640    29%
Info:                 OSER16:     0/   80     0%
Info:                 IDES16:     0/   80     0%
Info:               IOLOGICI:     0/  276     0%
Info:               IOLOGICO:     0/  276     0%
Info:              MUX2_LUT5:   391/ 4320     9%
Info:              MUX2_LUT6:    99/ 2160     4%
Info:              MUX2_LUT7:    33/ 1080     3%
Info:              MUX2_LUT8:    12/ 1080     1%
Info:                    ALU:   722/ 6480    11%
Info:                    GND:     1/    1   100%
Info:                    DFF:  1963/ 6480    30%
Info:              RAM16SDP4:     0/  270     0%
Info:                  BSRAM:     7/   26    26%
Info:                 ALU54D:     4/   10    40%
Info:        MULTADDALU18X18:     0/   10     0%
Info:           MULTALU18X18:     0/   10     0%
Info:           MULTALU36X18:     0/   10     0%
Info:              MULT36X36:     5/    5   100%
Info:              MULT18X18:     0/   20     0%
Info:                MULT9X9:     0/   40     0%
Info:                 PADD18:     0/   20     0%
Info:                  PADD9:     0/   40     0%
Info:                    GSR:     1/    1   100%
Info:                    OSC:     0/    1     0%
Info:                   rPLL:     1/    2    50%
Info:                   BUFG:     0/   22     0%

BOARD=tangnano9k FAMILY=GW1N-9C DEVICE=GW1NR-LV9QN88PC6/I5

And we get the error:

ERROR: Unable to find legal placement for cell 'ecgfilt_inst.fbcomb_cas_inst.alu54_sum_inst_1.alu54_1' after 3955079 attempts, check constraints and utilisation. Use `--placer-heap-cell-placement-timeout` to change the number of attempts.
0 warnings, 1 error
make: *** [makefile:101: /home/samuel/PARAL_Opentools/ws/paral/obj/paral_top/paral_top_pnr.json] Error 255

If we remove 'ecgfilt_inst.fbcomb_cas_inst.alu54_sum_inst_1.alu54_1', we just get the same error for another of the 4 ALU54s used.

Example of an ALU54 setup from the project:

reg [53:0] acc_1a = 54'h0;
reg [53:0] acc_1b = 54'h0;
wire [53:0] acc_1c; 
reg acc_reset = 0;

alu54 alu54_bp_inst ( 
    .clk                (clk),
    .reset              (resetb),
    .a1a                (acc_1a), 
    .a1b                (acc_1b),
    .a1c                (acc_1c),
    .acc_reset          (acc_reset)
    );

And inside the alu54.v file:

`default_nettype none
module alu54 (
    input wire clk,
    input wire reset,
    input wire [53:0] a1a,
    input wire [53:0] a1b,
    output wire [53:0] a1c,
    input wire acc_reset
);

wire [54:0] caso;

ALU54D alu54_bp_inst1(
    .A      (a1a),
    .B      (a1b),
    .DOUT   (a1c), 
    .CASI   (55'b0),
    .ASIGN  (1'b1),
    .BSIGN  (1'b1),
    .ACCLOAD(acc_reset),
    .CE     (1'b1),
    .CLK    (clk),
    .RESET  (reset), 
    .CASO   (caso)
);

defparam alu54_bp_inst1.AREG = 1'b0; //1'b0:bypass mode; 1'b1: register mode
defparam alu54_bp_inst1.BREG = 1'b0;
defparam alu54_bp_inst1.ASIGN_REG = 1'b0;
defparam alu54_bp_inst1.BSIGN_REG = 1'b0;
defparam alu54_bp_inst1.ACCLOAD_REG = 1'b1;
defparam alu54_bp_inst1.OUT_REG = 1'b1;
defparam alu54_bp_inst1.B_ADD_SUB = 1'b0; //1'b0: add; 1'b1:sub;
defparam alu54_bp_inst1.C_ADD_SUB = 1'b0;
defparam alu54_bp_inst1.ALUD_MODE = 0;//0:ACC/0 +/- B +/- A; 1:ACC/0 +/- B + CASI; 2:A +/- B + CASI;
defparam alu54_bp_inst1.ALU_RESET_MODE = "SYNC";//SYNC, ASYNC
endmodule

Are we doing something wrong, or might there be a problem with using all 5 DSP-primitives at the same time?

Please let me know if I should send more code. Thank you in advance!

yrabbit commented 4 months ago

Well, you're right: this chip has 5 DSP blocks with two macros in each, and five mult36x36 seem to occupy them all.

A mult36x36 is made up of 4 mult18x18, whose results are compressed and added together using the alu54d units located in these same DSP blocks, so the usage report is somewhat misleading.

[image: shot-0]

bjoernskau commented 4 months ago

I find it a bit hard to understand. Does it mean we only have 5x ALU54 and 5x MULT36x36? So if we use all 5x MULT36x36, they use the 5x ALU54 to add their results?

yrabbit commented 4 months ago

Here is one macro: two mult18x18 and one alu54d.

[image: dsp-macro]

Two macros form one DSP block.

Mult36x36 splits the input operands into 18-bit chunks and multiplies every chunk of one operand by every chunk of the other. The results of the multiplications are then shifted and summed to produce a 72-bit result. Four 18x18 multipliers are used for multiplying the chunks, and two ALU54Ds are used for the shifting and subsequent summing.

Thus, by using one Mult36x36 you occupy four mult18x18 and two ALU54D.

If necessary, I can draw a picture of multiplication and shifts along each wire, although it will be a large picture.
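
As a rough stand-in for that picture, the decomposition described above can be sketched in a few lines of Python. This is only an arithmetic illustration under simplifying assumptions: it handles unsigned operands only, the function name `mult36x36_sketch` is made up for this example, and it does not model how the real hardware distributes the shifting and summing across the two ALU54Ds.

```python
MASK18 = (1 << 18) - 1  # selects the low 18 bits of an operand


def mult36x36_sketch(a: int, b: int) -> int:
    """Unsigned 36x36 multiply built from four 18x18 partial products.

    Hypothetical illustration only; not taken from nextpnr or Gowin sources.
    """
    if not (0 <= a < (1 << 36) and 0 <= b < (1 << 36)):
        raise ValueError("operands must fit in 36 unsigned bits")

    # Split each operand into a low and a high 18-bit chunk.
    a_lo, a_hi = a & MASK18, a >> 18
    b_lo, b_hi = b & MASK18, b >> 18

    # Four 18x18 multiplications, one per mult18x18.
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi

    # Shift each partial product to its weight and sum them
    # (in the thread's description, this is the job of the ALU54Ds).
    return p_ll + ((p_lh + p_hl) << 18) + (p_hh << 36)
```

For any pair of 36-bit unsigned values the sketch should agree with a plain `a * b`, which is an easy way to check the shifts.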

bjoernskau commented 4 months ago

Ahh I see! Now it makes sense. I just got it through using all 10 ALU54 by reducing the number of MULT36.

Thank you for the help! Just a note: maybe the "Device utilisation:" report could be changed so that whenever a MULT36 is used, it adds +2 to the ALU54 count, maybe in parentheses.

Info: Device utilisation:
Info:                    VCC:     1/    1   100%
Info:                    IOB:    25/  274     9%
Info:                   LUT4:  2396/ 8640    27%
Info:                 OSER16:     0/   80     0%
Info:                 IDES16:     0/   80     0%
Info:               IOLOGICI:     0/  276     0%
Info:               IOLOGICO:     0/  276     0%
Info:              MUX2_LUT5:   390/ 4320     9%
Info:              MUX2_LUT6:    94/ 2160     4%
Info:              MUX2_LUT7:    35/ 1080     3%
Info:              MUX2_LUT8:    14/ 1080     1%
Info:                    ALU:   722/ 6480    11%
Info:                    GND:     1/    1   100%
Info:                    DFF:  1819/ 6480    28%
Info:              RAM16SDP4:     0/  270     0%
Info:                  BSRAM:     7/   26    26%
Info:                 ALU54D:     4(10)/   10    40%       <-- suggested edit
Info:        MULTADDALU18X18:     0/   10     0%
Info:           MULTALU18X18:     0/   10     0%
Info:           MULTALU36X18:     0/   10     0%
Info:              MULT36X36:     3/    5    60%
Info:              MULT18X18:     0/   20     0%
Info:                MULT9X9:     0/   40     0%
Info:                 PADD18:     0/   20     0%
Info:                  PADD9:     0/   40     0%
Info:                    GSR:     1/    1   100%
Info:                    OSC:     0/    1     0%
Info:                   rPLL:     1/    2    50%
Info:                   BUFG:     0/   22     0%
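
The accounting behind the number in parentheses can be sketched as follows. This is a minimal illustration, not nextpnr code: the function names are hypothetical, the factor of two ALU54D per MULT36X36 comes from the explanation earlier in this thread, and the total of 10 ALU54D is taken from the utilisation report.

```python
ALU54D_PER_MULT36X36 = 2  # per the explanation in this thread
ALU54D_AVAILABLE = 10     # from the "Device utilisation" report


def effective_alu54d(explicit_alu54d: int, mult36x36: int) -> int:
    """ALU54D sites occupied directly plus those consumed by MULT36X36 cells."""
    return explicit_alu54d + ALU54D_PER_MULT36X36 * mult36x36


def fits(explicit_alu54d: int, mult36x36: int,
         available: int = ALU54D_AVAILABLE) -> bool:
    """True if the combined demand does not exceed the available ALU54D sites."""
    return effective_alu54d(explicit_alu54d, mult36x36) <= available


# Failing configuration from the first report: 4 ALU54D and 5 MULT36X36.
print(effective_alu54d(4, 5), fits(4, 5))
# Working configuration from the second report: 4 ALU54D and 3 MULT36X36.
print(effective_alu54d(4, 3), fits(4, 3))
```

With 5 MULT36X36 the combined demand is 14 ALU54D against 10 available, which matches the placement failure; with 3 MULT36X36 it is exactly 10, which matches the design that went through.
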

yrabbit commented 4 months ago

Yes, that would be nice. I haven't delved into it yet because this part is not specific to Gowin and is located somewhere in the upper, generic levels :)