clash-lang / clash-compiler

Haskell to VHDL/Verilog/SystemVerilog compiler
https://clash-lang.org/
Other
1.45k stars 154 forks source link

Work duplication during netlist generation #2236

Open leonschoorl opened 2 years ago

leonschoorl commented 2 years ago

Within the core-to-core phase of clash it tries to be careful about duplicating work in the resulting HDL. Primitives are annotated with their WorkInfo and there is a whole module dedicated to figuring out whether core terms or binders contain work and avoid inlining them.

But during netlist generation we do none of that. For example the blackbox for BitVector.rotateL: https://github.com/clash-lang/clash-compiler/blob/afd4182d74001fa487cd4c03d1f80065f29225b8/clash-lib/prims/vhdl/Clash_Sized_Internal_BitVector.primitives.yaml#L556-L570 Renders the ~ARG[2] twice, even if it's a expression containing work.

topEntity :: BitVector 473 -> Int -> BitVector 473
topEntity x n = rotateL x (n * 5)

compiles to:

  result <= std_logic_vector(rotate_left(unsigned(x),to_integer((resize(x1 * to_signed(5,64),64))
      -- pragma translate_off
      mod 473
      -- pragma translate_on
      )))
      -- pragma translate_off
      when ((resize(x1 * to_signed(5,64),64)) >= 0) else (others => 'X')
      -- pragma translate_on
      ;

Duplicating the multiplication (resize(x1 * to_signed(5,64),64)).

The writer of the blackbox could fix this by using ~VAR instead of ~ARG, but from their perspective it would be great if clash could do this automatically. And clash could be smart enough to only do this when the argument contains work.

Actually implementing this seems tricky, because the current isWorkFree is built around core terms not netlist exprs. And detecting whether there are more then one occurrence of ~ARG[c] is complicated by conditional blocks (~IF).

leonschoorl commented 2 years ago

Related to this, the blackbox for BitVector.shiftL: https://github.com/clash-lang/clash-compiler/blob/afd4182d74001fa487cd4c03d1f80065f29225b8/clash-lib/prims/vhdl/Clash_Sized_Internal_BitVector.primitives.yaml#L508-L531 It contains both ~VAR[shI][2] and ~ARG[2]. Again duplicating the second argument, once in a variable/signal and once directly in the expression. Here maybe clash could possibly make the ~ARG[n] reuse the variable from the ~VAR[nm][n].

leonschoorl commented 2 years ago

@alex-mckenna just pointed out that in both these examples the second use of the argument is within a translate_off block, so the synthesized hardware doesn't contain the duplicated part

alex-mckenna commented 2 years ago

We have this workFreeBinders in the RewriteState, maybe we could keep this in netlist, and then report an warning to the user when a black box uses it's argument more than once and it's determined to do work? This might slow things down though depending how many cache misses occur when instantiating the black boxes

alex-mckenna commented 2 years ago

It could be a check limited to runs with -fclash-debug-invariants if it proves too costly for normal use