clash-lang / clash-compiler

Haskell to VHDL/Verilog/SystemVerilog compiler
https://clash-lang.org/

Block rams not inferred with Vivado 2015.2 with user defined types #113

Closed adamwalker closed 8 years ago

adamwalker commented 8 years ago

Vivado 2015.2 fails to infer block RAM when using the blockRamPow2 function if the element type being stored is a user-defined type. See:

https://gist.github.com/adamwalker/0b39f86bdb4925e918cf

Specifically, in the first case, synthesis (in Vivado) runs forever, which I presume is because it is synthesizing the entire array.

Perhaps it is not the intention to allow arbitrary types to be stored in block ram, but it is allowed by the type system and I couldn't find any documentation stating that you shouldn't do that.

Thanks,
Adam

christiaanb commented 8 years ago

It is supposed to work for arbitrary types, and I'm quite sure it works with Altera's Quartus tool. Still, I want to make sure it works with Xilinx's tools as well. I know how to fix this: I have to update the code generator so that it only stores bit vectors (std_logic_vector in VHDL) in blockRAMs.
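To illustrate the idea behind that fix, here is a minimal sketch in plain Haskell. It is not Clash code and not the actual code-generator change: the `Sample` type, the 16-bit layout, and the list-based `Ram` are all hypothetical stand-ins. The point is only that the memory holds raw words, and the user-defined type is converted at the read and write boundaries. In Clash itself, the `pack` and `unpack` methods of the `BitPack` class play the role of `packSample` and `unpackSample`.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word8)

-- Hypothetical user-defined element type (an assumption, not taken from the gist).
data Sample = Sample { tag :: Word8, value :: Word8 }
  deriving (Eq, Show)

-- Flatten a Sample into one 16-bit word: tag in the high byte, value in the low byte.
packSample :: Sample -> Word16
packSample (Sample t v) = (fromIntegral t `shiftL` 8) .|. fromIntegral v

-- Inverse of packSample: split the word back into its two fields.
unpackSample :: Word16 -> Sample
unpackSample w = Sample (fromIntegral (w `shiftR` 8)) (fromIntegral (w .&. 0xFF))

-- Toy memory that only ever holds raw words, modelled as a list (an assumption).
type Ram = [Word16]

-- Write: convert to raw bits first, then replace the word at the given address.
writeRam :: Int -> Sample -> Ram -> Ram
writeRam addr s ram = take addr ram ++ [packSample s] ++ drop (addr + 1) ram

-- Read: fetch the raw word, then convert back to the user-defined type.
readRam :: Int -> Ram -> Sample
readRam addr ram = unpackSample (ram !! addr)
```

For example, `readRam 2 (writeRam 2 (Sample 3 200) (replicate 4 0))` gives back `Sample 3 200`, while the memory itself never contains anything other than `Word16` values.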

Anyhow, as a work-around, I see two options:

adamwalker commented 8 years ago

I wouldn't be surprised if it works with Xilinx ISE. From what I understand, Vivado is a complete rewrite of ISE that is still lacking features and has bugs that do not exist in ISE.

ggreif commented 8 years ago

@adamwalker If you find out that it is a Vivado bug, please make sure that it gets reported! Thanks!

adamwalker commented 8 years ago

Haha. I'm guessing you haven't tried to report a bug to Xilinx before. It will be ignored :)

ggreif commented 8 years ago

I am just running synthesis with Vivado 2015.3. It does seem to loop. Simulation was fast, though.

christiaanb commented 8 years ago

I just tried with Altera Quartus Prime 15.1.0. I can go as high as 2^14 elements in the blockRam (i.e. a write and read address of type Unsigned 14). Higher than that, Quartus reports that it ran out of memory. Indeed, for Unsigned 14 it was already using 6 GB (on my 8 GB machine).

The VHDL that I'm generating is almost an exact copy of both the Xilinx and Altera recommended VHDL for blockRam inference. I guess for blockRams with really large address spaces, you are forced to use something like Xilinx' Coregen.

I don't think there's an easy fix for the out-of-memory problem, aside from generating Xilinx/Altera-specific blockRam VHDL instead of the current vendor-independent VHDL.