clash-lang / clash-compiler

Haskell to VHDL/Verilog/SystemVerilog compiler
https://clash-lang.org/
Other
1.4k stars 149 forks source link

Data.Bits.popCount synthesis leads to multi-driven net pins #2407

Open kleinreact opened 1 year ago

kleinreact commented 1 year ago

We recently tried to synthesize some simple popCount extension for KnownSized BitVectors for the bittide-hardware project

popCountTo32 :: KnownNat n => BitVector n -> Signed 32
popCountTo32 = resize . bitCoerce . popCount

leading to very long Clash warnings and multi-driven net pin errors. The observed error only appears in a very specific context and, unfortunately, we were not able to reproduce it with a small standalone example yet. Therefore, we assume that its not only the popCountTo32 synthesis, but also the given interplay with other components, which leads to the final error in Clash.

We still tried to reduce it to a minimal counterexample as much as possible, given by the following code. The full code and context can be found here. The used GHC and Clash versions are fixed here.

import Clash.Prelude

import Bittide.ClockControl
import qualified Bittide.ClockControl.Callisto.Util as U

import qualified Clash.Cores.Xilinx.Floating as F
import qualified Clash.Signal.Delayed as D

data ControlSt = ControlSt
  { _x_k :: !Float
  , _z_k :: !(Signed 32)
  , _b_k :: !Bool
  } deriving (Generic, NFDataX)

{-# NOINLINE popCountTest #-}
popCountTest ::
  forall dom.
  HiddenClockResetEnable dom =>
  Signal dom (Vec 3 (DataCount 12)) ->
  Signal dom Bool
popCountTest dataCounts = D.toSignal b_kNext
 where
  -- Interestingly, changing this implementation to
  --
  -- > mask = pure $ complement 0
  --
  -- removes the error.
  mask :: Signal dom (BitVector 3)
  mask = pure $ pack $ repeat high

  state =
    D.toSignal $ ControlSt
      <$> D.delayI 0 x_k
      <*> D.delayI 0 z_k
      <*> b_kNext

  -- `x_k` is technically not used, but removing it produces an error
  -- in clash. With x_k available clash passes with a strange warning,
  -- but vivado synthesis fails afterwards.
  x_k :: DSignal dom 0 Float
  x_k = D.fromSignal (_x_k <$> state)

  z_k :: DSignal dom 0 (Signed 32)
  z_k = D.fromSignal (_z_k <$> state)

  -- `sumTo32` seems to be essential for producing the
  -- error. Inlineing it produces the same error in clash as described
  -- in the comment above. Moving the call outside the library removes
  -- the error completely.
  r_k :: DSignal dom F.FromS32DefDelay Float
  r_k = F.fromS32 $ D.fromSignal $ (U.sumTo32 <$> dataCounts) - (popCountTo32 <$> mask)

  c_des :: DSignal dom (F.FromS32DefDelay + 2*F.MulDefDelay + 2*F.AddDefDelay) Float
  c_des = D.delayI 0 r_k

  c_est :: DSignal dom (F.FromS32DefDelay + 2*F.MulDefDelay + 2*F.AddDefDelay) Float
  c_est = D.delayI 0 $ F.fromS32 z_k

  b_kNext = (== F.LT) <$> F.compare c_des c_est

-- The generalization to n-sized bit vectors is needed for reproducing
-- the error, i.e., there is no error if `popCoutnTo32` is specialized
-- to bit vectors of size 3. Note that moving the definition to the
-- where clause of `popCountTest` has the same effect.
popCountTo32 :: KnownNat n => BitVector n -> Signed 32
popCountTo32 = resize . bitCoerce . popCount

The comments should give some insights of what we already tired. The issue has been analyzed in this Bittide PR. The clash compiler output with DebugApplied enabled is attached: DebugAppliedLog.zip.

We can run a more dedicated analysis, but this requires some more detailed analysis strategy than what we have at the moment.

leonschoorl commented 1 year ago

I tried to reproduce this locally, but failed in two different ways so far:

alex-mckenna commented 1 year ago

assignmentWith: Cannot assign as Proc Blocking after Cont

Hey, this error picking up actual wrong netlist generation instead of me messing up while working on a netlist PR :tada:

leonschoorl commented 1 year ago

Something somewhere is generating duplicate signal names.

So far I've shrunk it to:

{-# OPTIONS_GHC -XPartialTypeSignatures #-}
module Test2407 where

import Clash.Prelude

import qualified Clash.Cores.Xilinx.Floating as F
import qualified Clash.Signal.Delayed as D

topEntity clk rst ena = withClockResetEnable @System clk rst ena $ popCountTest @System

popCountTest ::
  forall dom.
  HiddenClockResetEnable dom =>
  _
popCountTest = r_k
 where
  r_k :: DSignal dom F.FromS32DefDelay Float
  r_k = F.fromS32 $ D.fromSignal (popCountTo32 <$> mask)

  mask :: Signal dom (BitVector 3)
  mask = pure $ pack $ repeat high  -- broken
  -- mask = pure $ complement 0  -- not broken

popCountTo32 :: KnownNat n => BitVector n -> Signed 32
popCountTo32 = resize . bitCoerce . popCount

Which generates:

module topEntity
    (...);
  [...]
  wire signed [63:0] c$app_arg;
  wire signed [63:0] c$app_arg;
  wire signed [63:0] c$app_arg;
  wire signed [63:0] c$app_arg;
  [...]
  assign c$app_arg = $unsigned({{(64-1) {1'b0}},c$bv});
  assign c$app_arg = $unsigned({{(64-1) {1'b0}},c$bv_0});
  assign c$app_arg = $unsigned({{(64-1) {1'b0}},c$bv_1});
  assign c$app_arg = $unsigned({{(64-1) {1'b0}},c$bv_2});
  [...]

All of there c$app_arg are different signals, but they've been assigned the same name in Verilog

alex-mckenna commented 1 year ago

One thing you can try is adding a uniqueness check on the function which unsafely creates a new netdecl and get the Callstack to the second c$app_arg which is declared

martijnbastiaan commented 1 year ago

I can't reproduce this at all. What GHC / Clash commit are you guys on?

christiaanb commented 1 year ago

For those hunting down a reproducer: does commenting out https://github.com/clash-lang/clash-compiler/blob/8894dd63aae5522845b7e37ab3c0d6cf10a084a8/clash-lib/src/Clash/Normalize/Strategy.hs#L55 solve the issue?

alex-mckenna commented 1 year ago

I can reproduce @leonschoorl's last example in Verilog on current master with GHC 9.2.4. Trying @christiaanb's suggestion does indeed appear to fix this - the resulting Verilog code looks very similar to the working VHDL code, i.e.

alex-mckenna commented 1 year ago

But also: what the heck Christiaan

alex-mckenna commented 1 year ago

Ah, maybe I see. Is this something to do with the evaluator rules for xToBV not being able to fire because the BitVector argument to it becomes a pack# of a Vec which is an unsafeCoerce# (due to it being a map that turns a Bit to a BitVector 1 which is replaced with unsafeCoerce#)?

EDIT: I think so, in the fixed version we see the BitVectors are chomped down to constants and the xToBV calls are all removed in flattenExpr.

rowanG077 commented 1 year ago

Everyone running around, panic, papers flying. A beam of light shines down from the heavens. Everyone is silent. Suddenly a deep voice cuts through the ignorance.

For those hunting down a reproducer: does commenting out

https://github.com/clash-lang/clash-compiler/blob/8894dd63aae5522845b7e37ab3c0d6cf10a084a8/clash-lib/src/Clash/Normalize/Strategy.hs#L55

solve the issue?

And so it shall be. People are weeping. Churches are being erected. The world is not as it was before.

alex-mckenna commented 1 year ago

Come for the nerd sniping, stay for the quasi-religious fan-fiction

christiaanb commented 1 year ago

To be fair, I looked at the log earlier, and was triggered by the GHC.Prim.unsafeCoerce# as none of the code originally contained calls to unsafeCoerce; and collapseRHSNoops was the only function that introduces them.

christiaanb commented 1 year ago

So there's two possibilities:

  1. collapseRHSNoops is bugged
  2. collapseRHSNoops exposes a bug in netlistgen, i.e. disabling the transformation is just a red herring.