verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research
https://verilogtorouting.org

Difference in behavioral memory inference between ODIN and ODIN+Yosys #2131

Open aman26kbm opened 2 years ago

aman26kbm commented 2 years ago

Expected Behaviour

When a BRAM/memory is inferred from behavioral Verilog code, both flows - ODIN only and ODIN+Yosys - should result in the same hardware.

Current Behaviour

There are differences in the hardware inferred by both flows.

The three behavioral memory codes tried:

Code 1:

    always @ (posedge clk) begin
      if (wren) begin
        ram[address] <= data;
      end
    end
    assign out = ram[address];

//Technically Code 1 is not a memory. The output changes asynchronously with change in address

Code 2:

    always @ (posedge clk) begin
      if (wren) begin
        ram[address] <= data;
      end
      out <= ram[address];
    end

//Code 2 is normal memory. The address input is sampled at a clock and the output is not registered

Code 3:

    always @ (posedge clk) begin
      if (wren) begin
        ram[address] <= data;
      end
      out_temp <= ram[address];
      out <= out_temp;
    end

//Code 3 is a memory with registered outputs
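The snippets above omit the module headers and declarations. A minimal sketch of complete designs is given here for reference. The module names, the port widths, and the 1024 x 8 memory size are assumptions and are not taken from the issue (1024 x 8 gives 8192 bits, which is consistent with the register count Quartus reports for Code 1, but the actual dimensions used are not stated).

```verilog
// Hypothetical wrappers around the three snippets. Module names,
// ADDR_WIDTH, and DATA_WIDTH are assumptions made for illustration.

// Code 1: synchronous write, asynchronous (combinational) read.
module ram_code1 #(
    parameter ADDR_WIDTH = 10,
    parameter DATA_WIDTH = 8
) (
    input  wire                  clk,
    input  wire                  wren,
    input  wire [ADDR_WIDTH-1:0] address,
    input  wire [DATA_WIDTH-1:0] data,
    output wire [DATA_WIDTH-1:0] out
);
    reg [DATA_WIDTH-1:0] ram [0:(1<<ADDR_WIDTH)-1];

    always @ (posedge clk) begin
        if (wren) begin
            ram[address] <= data;
        end
    end

    // The read is not clocked: out follows address immediately.
    assign out = ram[address];
endmodule

// Code 2: synchronous write, synchronous read (one cycle of latency).
module ram_code2 #(
    parameter ADDR_WIDTH = 10,
    parameter DATA_WIDTH = 8
) (
    input  wire                  clk,
    input  wire                  wren,
    input  wire [ADDR_WIDTH-1:0] address,
    input  wire [DATA_WIDTH-1:0] data,
    output reg  [DATA_WIDTH-1:0] out
);
    reg [DATA_WIDTH-1:0] ram [0:(1<<ADDR_WIDTH)-1];

    always @ (posedge clk) begin
        if (wren) begin
            ram[address] <= data;
        end
        out <= ram[address];  // address is sampled at the clock edge
    end
endmodule

// Code 3: as Code 2, plus one extra output register (two cycles of latency).
module ram_code3 #(
    parameter ADDR_WIDTH = 10,
    parameter DATA_WIDTH = 8
) (
    input  wire                  clk,
    input  wire                  wren,
    input  wire [ADDR_WIDTH-1:0] address,
    input  wire [DATA_WIDTH-1:0] data,
    output reg  [DATA_WIDTH-1:0] out
);
    reg [DATA_WIDTH-1:0] ram [0:(1<<ADDR_WIDTH)-1];
    reg [DATA_WIDTH-1:0] out_temp;

    always @ (posedge clk) begin
        if (wren) begin
            ram[address] <= data;
        end
        out_temp <= ram[address];  // first stage: synchronous read
        out      <= out_temp;      // second stage: output register
    end
endmodule
```

The only difference between the three modules is the read path, so any difference in inferred hardware between the flows should come from how each frontend treats that path.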

The arch file used for these experiments was: https://github.com/verilog-to-routing/vtr-verilog-to-routing/blob/master/vtr_flow/arch/COFFE_22nm/k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml But I think the same behavior will show with other arch files as well.

Here are the observations: [image: table of results, not reproduced here]

For code 1, if the arch file's memory pb_type has output registers, ODIN maps everything in the memory to CLBs (i.e. it doesn't infer a memory, but infers a lot of flops instead).

When code 1 and 2 are run with Quartus, we see:

  Code 1: 0 BRAMs, 8192 REGs
  Code 2: 1 BRAM, 0 REGs

Possible Solution

Steps to Reproduce

  1. Create simple designs that have only a RAM in them with the code above
  2. Use any arch xml (flagship arch or the one linked above)
  3. Run VTR with two frontends: ODIN only and ODIN+Yosys

Context

This could be one of the reasons why we see a difference in resource usage between the ODIN-only and ODIN+Yosys flows on the Koios benchmarks.

Your Environment

sdamghan commented 2 years ago

Thanks @aman26kbm @alirezazd - I have been relatively overloaded with unmerged PRs and other issues. Please have a look at this issue in the meantime, before we discuss it in person.

alirezazd commented 2 years ago

@sdamghan Sorry for the delay. It's been a busy week. Sure, I'll take a look.

sdamghan commented 2 years ago

@alirezazd - is there any update on this thread?

alirezazd commented 2 years ago

@sdamghan Yes, I'm analyzing the designs and writing a test bench to compare these simple designs. It is almost complete and I'll share my results soon. Meanwhile, a question: have you tried the expose -evert-dff command, and did it give the correct results (not counting the extra pins added to the design IO)?

sdamghan commented 2 years ago

@alirezazd - Aman already generated the result, which is outlined in the table in the issue description. The expose command was a solution for the issue of the clstm benchmark; this issue is about how Odin-II and Yosys+Odin-II infer behavioural memory (implicit memory instantiation). For code style 1, Odin-II infers a dual port ram memory hard block, which is unexpected behaviour. You would need to dig into the netlist_create_from_ast.cpp file, running code 1 for Odin-II and see from which point it infers the logic as DPRAM and fix it.

alirezazd commented 2 years ago

@sdamghan Thanks, I'm going to check it. I will also continue writing my own test bench since I'm going to need it for debugging anyway.
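For reference, a test bench comparing the three styles could look roughly like the sketch below. It is not the test bench referred to in this thread. The names ram_dut, tb, and the STYLE parameter are hypothetical, and the 16 x 8 memory is kept small only for simulation. It writes one word, then checks after how many clock edges each style returns it: immediately for Code 1, one edge for Code 2, two edges for Code 3.

```verilog
`timescale 1ns/1ps

// Hypothetical DUT: one module covering the three read styles via STYLE.
module ram_dut #(parameter STYLE = 2) (
    input  wire       clk,
    input  wire       wren,
    input  wire [3:0] address,
    input  wire [7:0] data,
    output wire [7:0] out
);
    reg [7:0] ram [0:15];
    reg [7:0] out_r, out_temp;

    always @ (posedge clk) begin
        if (wren) ram[address] <= data;
        out_temp <= ram[address];                        // first read register
        out_r    <= (STYLE == 3) ? out_temp : ram[address];
    end

    // STYLE 1 bypasses the registers (asynchronous read, as in Code 1).
    assign out = (STYLE == 1) ? ram[address] : out_r;
endmodule

module tb;
    reg        clk = 0, wren = 0;
    reg  [3:0] address = 0;
    reg  [7:0] data = 0;
    wire [7:0] out1, out2, out3;
    integer    errors = 0;

    ram_dut #(.STYLE(1)) u1 (.clk(clk), .wren(wren), .address(address), .data(data), .out(out1));
    ram_dut #(.STYLE(2)) u2 (.clk(clk), .wren(wren), .address(address), .data(data), .out(out2));
    ram_dut #(.STYLE(3)) u3 (.clk(clk), .wren(wren), .address(address), .data(data), .out(out3));

    always #5 clk = ~clk;

    initial begin
        // Write 8'hA5 to address 3, then move the address away.
        @(negedge clk); address = 4'd3; data = 8'hA5; wren = 1;
        @(negedge clk); wren = 0; address = 4'd0;

        // Present address 3 again; no clock edge has seen it yet.
        @(negedge clk); address = 4'd3;
        #1;
        if (out1 !== 8'hA5) begin errors = errors + 1; $display("FAIL: code 1 should read immediately"); end
        if (out2 === 8'hA5) begin errors = errors + 1; $display("FAIL: code 2 should need one edge"); end

        // One rising edge later: Code 2 has the data, Code 3 does not yet.
        @(negedge clk);
        if (out2 !== 8'hA5) begin errors = errors + 1; $display("FAIL: code 2 after one edge"); end
        if (out3 === 8'hA5) begin errors = errors + 1; $display("FAIL: code 3 should need two edges"); end

        // Two rising edges later: Code 3 has the data.
        @(negedge clk);
        if (out3 !== 8'hA5) begin errors = errors + 1; $display("FAIL: code 3 after two edges"); end

        if (errors == 0) $display("PASS");
        $finish;
    end
endmodule
```

Running the same stimulus against the post-synthesis netlists from both flows would be one way to see whether they preserve these latencies.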