clash-lang / clash-compiler

Haskell to VHDL/Verilog/SystemVerilog compiler
https://clash-lang.org/
Other
1.4k stars 149 forks source link

vioProbe uses its clock lazily #2551

Closed christiaanb closed 11 months ago

christiaanb commented 1 year ago

Fixes #2532

So #2532 shows up when we have a primitive that doesn't actually use its clock argument and we use a bang-pattern on an otherwise unused argument. If we look at the demand signature of vioProbe# before the patch we see:

vioProbe#: <A><1L><1A><1A><1A><1A>
vioProbe# ::
  forall dom a o n m.
  (KnownDomain dom, VIO dom a o) =>
  Vec n String ->
  Vec m String ->
  o ->
  Clock dom ->
  a
vioProbe# !_inputNames !_outputNames !_initialOutputProbeValues !_clk = ...

where

L   lazy. As far as the analysis could tell, the argument isn’t demanded
A   absent == definitely unused. Indicates that the binder is certainly not used.
1*...   used once. Not a usage but rather modifies a “used” demand denoting that the argument is used precisely once.

Because the _clk argument is both definitely evaluated and used once so there is no need to share. This in turn makes GHC want to inline the _clk argument in applications of vioProbe#.

After the patch we get the following demand signature:

vioProbe#: <A><1L><1A><1A><1A><L>
vioProbe# ::
  forall dom a o n m.
  (KnownDomain dom, VIO dom a o) =>
  Vec n String ->
  Vec m String ->
  o ->
  Clock dom ->
  a
vioProbe# !_inputNames !_outputNames !_initialOutputProbeValues clk = seq (lazy clk) ...

Now the clk argument is marked as evaluated lazily, and can possibly be shared. This in turn makes GHC not want to inline the clk argument in applications of vioProbe#.

If we decide to merge this PR we should do two things:

  1. Check all of our other primitives to see whether they use bang-patterns for unused clock arguments, and if they do, change them to use the method in this PR
  2. Open up a new issue mentioning that Clash should try to undo situations like #2532 in the general case.
christiaanb commented 1 year ago

With this patch, for:

module Test where

import Clash.Explicit.Prelude
import Clash.Cores.Xilinx.VIO
import Clash.Xilinx.ClockGen

topEntity :: Clock System -> Reset System -> Vec 4 (Signal System Bit) -> Vec 4 (Signal System Bit)
topEntity clkIn rst inp = out
 where
  (clk,_) = clockWizard @System @System (SSymbol @"clk_wiz_0") clkIn rst

  out = map (vioProbe ("vio_in":>Nil) ("vio_out":>Nil) 0 clk) inp

we get the following Verilog:

/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.7.0. DO NOT MODIFY.
*/
`default_nettype none
`timescale 100fs/100fs
module topEntity
    ( // Inputs
      input wire  clkIn // clock
    , input wire  rst // reset
    , input wire  inp_0
    , input wire  inp_1
    , input wire  inp_2
    , input wire  inp_3

      // Outputs
    , output wire  result_0
    , output wire  result_1
    , output wire  result_2
    , output wire  result_3
    );
  wire [1:0] c$case_scrut;
  wire [3:0] inp;
  wire [3:0] result;

  assign inp = {inp_0,   inp_1,   inp_2,
                inp_3};

  // clockWizard begin
  clk_wiz_0 clockWizard_inst
  (.clk_in1  (clkIn)
  ,.reset    (  rst)
  ,.clk_out1 (c$case_scrut[1])
  ,.locked   (c$case_scrut[0]));
  // clockWizard end

  // map begin
  genvar i;
  generate
  for (i=0; i < 4; i = i + 1) begin : map
    wire  map_in;
    assign map_in = inp[i*1+:1];
    wire  map_out;
    wire  c$arg0;
    assign c$arg0 = c$case_scrut[1:1];

    topEntity3 topEntity3_0
      ( .result (map_out)
      , .c$arg (c$arg0)
      , .c$arg_0 (map_in) );

    assign result[i*1+:1] = map_out;
  end
  endgenerate
  // map end

  assign result_0 = result[3:3];

  assign result_1 = result[2:2];

  assign result_2 = result[1:1];

  assign result_3 = result[0:0];

endmodule

i.e. there is only one clockWizard.

martijnbastiaan commented 11 months ago

Can we capture this in a helper function? Maybe something like:

vioProbe# (markUsed -> !_inputNames) (markUsed -> !_outputNames) (markused -> !_initialOutputProbeValues) !(markUsed -> _clk) = ...

My worry is that it is pretty hard to apply this workaround consistently. We should have a simple, consistent way we can apply everywhere to every argument.

christiaanb commented 11 months ago

You can use lazy in a viewpattern. i.e. this would be a valid fix as well:

vioProbe# !_inputNames !_outputNames !_initialOutputProbeValues (lazy -> !_clk) =
martijnbastiaan commented 11 months ago

Okay makes sense. In that case: does an alias (e.g., markUsed = lazy) that has the documentation explaining that this prevents GHC from doing the "wrong" thing make sense to you?

christiaanb commented 11 months ago

Okay makes sense. In that case: does an alias (e.g., markUsed = lazy) that has the documentation explaining that this prevents GHC from doing the "wrong" thing make sense to you?

We use the bang-pattern/seq to mark the argument as "used", the use of lazy is there if you want the argument shared maximally. On FPGAs, we very much want to share any logic on clock-lines, since those resources are scarse. For many others arguments we would actually benefit if they're already inlined by GHC (e.g. the prim configuration arguments, enableGen, etc.).

I think we should just add this use of GHC.Magic.lazy to the documentation of creating your own primitives.